- Company Name
- Wayve
- Job Title
- Research Scientist Intern, Embodied Foundation Models (Data, Modeling & Reasoning)
- Job Description
-
Job Title: Research Scientist Intern, Embodied Foundation Models (Data, Modeling & Reasoning)
Role Summary:
A 3‑ to 6‑month internship focused on advancing embodied AI foundation models. The role involves large‑scale multimodal pre‑training, distributed training, and developing reasoning capabilities for vision‑language systems used in autonomous driving research. Interns lead research projects, evaluate model performance, and prepare publications for top AI conferences.
Expectations:
- Deliver research that can result in publications at venues such as CVPR, ICCV, NeurIPS, CoRL, CoLM, RSS, ICRA.
- Design, implement, and evaluate multimodal foundation models for embodied AI.
- Utilize multi‑node, distributed training pipelines to scale experiments efficiently.
- Collaborate with applied scientists, ML engineers, and software engineers to integrate models into broader systems.
Key Responsibilities:
- Design and run large‑scale pretraining of vision‑language and language‑only models on distributed GPU clusters.
- Optimize training pipelines for speed and memory efficiency, including data loading, mixed‑precision training, and checkpointing.
- Implement and test new reasoning mechanisms for embodied AI tasks.
- Benchmark models on open and proprietary datasets, analyze results, and provide actionable insights.
- Prepare manuscript drafts and technical reports, and present findings at internal meetings and external conferences.
- Maintain clean, modular codebases in Python, following best practices for version control and documentation.
Required Skills:
- Proven experience with vision‑language models, large language models, and NLP reasoning.
- Strong programming skills in Python and familiarity with at least one back‑end or systems language (e.g., Ruby, Java).
- Proficiency with deep learning frameworks: PyTorch, TensorFlow, or JAX.
- Hands‑on experience with multi‑node, distributed training of large models (e.g., Horovod, PyTorch Distributed, DeepSpeed).
- Ability to manipulate large multimodal datasets (vision, language, sensor data).
- Prior publications at peer‑reviewed AI/robotics conferences (CVPR, ICCV, NeurIPS, CoRL, CoLM, RSS, ICRA, etc.) are strongly preferred.
- Excellent written and verbal communication for technical writing and presentations.
Required Education & Certifications:
- Currently enrolled in a graduate program (M.S. or Ph.D.) in Computer Science, Machine Learning, Robotics, or a closely related technical field.
- No specific certifications required.