- Company Name: Apple Inc.
- Job Title: AIML – Machine Learning Engineer, MLPT (F/M)
- Job Description:
**Job Title:** AIML – Machine Learning Engineer (MLPT)
**Role Summary:**
Collaborate with research teams to optimize inference performance of cutting‑edge model architectures. Partner with product groups to deliver real‑time ML solutions to millions of users. Build tools to identify and resolve inference bottlenecks across hardware platforms, and provide technical mentorship to engineering staff.
**Expectations:**
- Lead and drive complex, ambiguous AI/ML projects to production.
- Deliver high‑throughput services at super‑computer scale.
- Ensure cloud‑native deployment and operational excellence.
- Mentor and guide engineers across the organization.
**Key Responsibilities:**
- Optimize inference pipelines for state‑of‑the‑art models (e.g., Transformers, encoder/decoder).
- Design, implement, and launch real‑time model serving solutions for large‑scale consumer applications.
- Develop diagnostic tools to pinpoint inference bottlenecks on various hardware (GPU, CPU, accelerator).
- Architect and maintain cloud deployments using Kubernetes, Docker, and major cloud providers (AWS, Azure, etc.).
- Write and maintain production‑grade code in modern languages (Go, Python).
- Create custom CUDA kernels and utilize frameworks such as NVIDIA TensorRT‑LLM, vLLM, DeepSpeed, and NVIDIA Triton Inference Server.
- Provide technical leadership, code reviews, and coaching for engineering teams.
**Required Skills:**
- Proven experience managing complex, ambiguous AI projects.
- Expertise in high‑throughput services, including super‑computer environments.
- Strong command of cloud execution (AWS, Azure, or equivalent) with Kubernetes & Docker.
- Proficient in GPU programming (CUDA) and major ML frameworks (PyTorch, TensorFlow).
- Solid software engineering skills in Go, Python, or comparable modern languages.
- Deep understanding of deep learning architectures (Transformers, encoder/decoder).
- Hands‑on experience with NVIDIA TensorRT‑LLM, vLLM, DeepSpeed, NVIDIA Triton Inference Server.
- Ability to write custom CUDA kernels or OpenAI Triton kernels.
**Required Education & Certifications:**
- Bachelor’s (minimum) or Master’s degree in Computer Science, Electrical Engineering, Mathematics, or a related technical field.
- Advanced degrees (Ph.D.) or specialized certifications in Machine Learning, Cloud Architecture, or High‑Performance Computing are a plus.
- Relevant industry certifications (e.g., AWS Certified Solutions Architect, Google Cloud Professional ML Engineer) are advantageous.