- Company Name: Shields Group Search
- Job Title: Machine Learning Engineer
- Job Description:
**Job Title**
Machine Learning Engineer – LLM Interpretability & Systems
**Role Summary**
Build production‑grade systems that apply mechanistic interpretability, activation engineering, and representation engineering to open‑source large language models (LLMs). Translate cutting‑edge research into scalable code, perform model surgery (control vectors, activation patching, feature extraction), and deploy optimized models that match or exceed the performance of proprietary competitors.
**Expectations**
- Deliver end‑to‑end pipelines that improve model capability for defined use cases.
- Demonstrate measurable performance uplift and cost efficiency on open‑weight LLMs.
- Actively identify bottlenecks, propose technical solutions, and execute them independently.
- Maintain clean, well‑documented, CI/CD‑ready codebases.
- Communicate progress and findings clearly to engineering and product teams.
**Key Responsibilities**
- Implement research from papers on mechanistic interpretability, activation engineering, and representation engineering.
- Apply model surgery techniques (control vectors, activation patching, feature extraction) to modify LLM internal representations.
- Optimize open‑source LLMs for speed, memory, and accuracy, producing benchmark‑ready builds.
- Design and maintain the infrastructure for deploying modified models (model servers, inference pipelines, monitoring).
- Collaborate with ML research, data engineering, and production teams to integrate solutions into the product roadmap.
- Iterate on experiments, analyze results, and refine techniques in a rapid development cycle.
**Required Skills**
- Deep theoretical understanding of Transformer architectures, attention mechanisms, and probabilistic modeling.
- Proficiency in PyTorch (model internals, custom autograd, distributed training).
- Experience training, fine‑tuning, and dissecting large‑scale LLMs (e.g., GPT‑2/3, LLaMA, Falcon).
- Strong Python programming skills; familiarity with C++/CUDA for low‑level optimization is a plus.
- Knowledge of explainable AI (XAI) techniques and the ability to implement control vectors, activation patching, and feature extraction.
- Experience building production‑ready ML systems (containerization, CI/CD, monitoring, scalability).
- Self‑starter mindset with proven ability to detect and solve complex technical challenges autonomously.
- Excellent communication and documentation skills for internal and external stakeholders.
**Required Education & Certifications**
- Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, Applied Mathematics, or related field.
- Additional certifications in deep learning or data science (e.g., DeepLearning.AI, TensorFlow Developer) are a plus but not mandatory.
---
San Francisco, United States
On‑site
18-03-2026