Together AI

together.ai

4 Jobs

247 Employees

About the Company

Together AI is a research-driven AI cloud infrastructure provider. Our purpose-built GPU cloud platform empowers AI engineers and researchers to train, fine-tune, and run frontier-class AI models. Our customers include leading SaaS companies such as Salesforce, Zoom, and Zomato, as well as pioneering AI startups like ElevenLabs, Hedra, and Cartesia. We advocate for open-source AI and believe that transparent AI systems will drive innovation and create the best outcomes for society.

Listed Jobs

Company Name
Together AI
Job Title
Machine Learning Operations (MLOps) Engineer
Job Description
**Job Title:** Machine Learning Operations (MLOps) Engineer

**Role Summary:** Design, develop, and maintain production-grade ML inference and fine-tuning pipelines for large language models (LLMs). Deliver scalable, automated systems that enable rapid deployment, evaluation, and operation of AI services for customers and internal teams.

**Expectations:**
- 5+ years of professional experience building ML training or inference systems at scale.
- Demonstrated expertise in deploying LLMs and optimizing their runtime performance.
- Proven knowledge of CI/CD, containerization, Kubernetes, and cloud infrastructure.

**Key Responsibilities:**
- Collaborate with engineering, research, and sales to deploy and operate inference pipelines for customers and internal use.
- Build, maintain, and document tools, services, and automation workflows for testing, monitoring, and scaling ML workloads.
- Analyze system performance, identify bottlenecks, and implement improvements to efficiency, reliability, and cost-effectiveness.
- Conduct design and code reviews to uphold code quality and best practices.
- Participate in on-call rotation to respond to production incidents and outages.

**Required Skills:**
- Strong background in machine learning with a focus on state-of-the-art LLMs.
- Proficiency in Python and at least one additional language (e.g., Go).
- Experience with ML frameworks such as TensorFlow, PyTorch, or Scikit-learn.
- Deep familiarity with DevOps practices: CI/CD pipelines, automated testing, Docker containerization, Kubernetes orchestration.
- Competence in cloud platforms: AWS, Google Cloud Platform, or Microsoft Azure.
- Ability to design, implement, and document robust production-grade APIs and services.

**Required Education & Certifications:**
- Bachelor's degree in Computer Science, Engineering, or related field, or equivalent industry experience.
- Valid certifications in cloud platforms (AWS, GCP, or Azure) are a plus but not mandatory.
San Francisco, United States
On site
Mid level
12-11-2025
Company Name
Together AI
Job Title
Machine Learning Engineer - Inference
Job Description
**Job Title:** Machine Learning Engineer - Inference

**Role Summary:** Design and optimize high-performance AI inference systems for large language models, collaborating with researchers and engineers to deliver scalable, production-ready solutions.

**Expectations:**
- 3+ years of production-quality code experience.
- Proficiency in Python, PyTorch, and high-performance system design.
- Strong understanding of low-level OS concepts (threading, memory, networking).

**Key Responsibilities:**
- Develop and optimize AI inference engine systems for reliability and scalability.
- Build runtime services for large-scale AI applications.
- Collaborate with cross-functional teams to implement research into production features.
- Conduct design/code reviews to maintain high quality standards.
- Create tools, documentation, and infrastructure for data ingestion and processing.

**Required Skills:**
- Python, PyTorch, and high-performance library/tooling development.
- Low-level OS expertise: multi-threading, memory management, networking.
- Prior experience with AI inference systems (e.g., TGI, vLLM) preferred.
- Knowledge of inference techniques (e.g., speculative decoding) and CUDA/Triton programming.
- Familiarity with Rust, Cython, or compilers a bonus.

**Required Education & Certifications:** Not specified.
San Francisco, United States
Hybrid
Junior
14-12-2025
Company Name
Together AI
Job Title
LLM Inference Frameworks and Optimization Engineer
Job Description
**Job Title:** LLM Inference Frameworks and Optimization Engineer

**Role Summary:** Design, develop, and optimize large-scale, low-latency inference engines for text, image, and multimodal models. Focus on distributed parallelism, GPU/accelerator efficiency, and software-hardware co-design to deliver high-throughput, fault-tolerant AI deployment.

**Expectations:**
- Lead end-to-end development of inference pipelines for LLMs and vision models at scale.
- Demonstrate measurable improvements in latency, throughput, or cost per inference.
- Collaborate cross-functionally with hardware, research, and infrastructure teams.
- Deliver production-ready, maintainable code in Python/C++ with CUDA.
- Communicate technical trade-offs to stakeholders.

**Key Responsibilities:**
- Build fault-tolerant, high-concurrency distributed inference engines for multimodal generation.
- Engineer parallelism strategies (Mixture of Experts, tensor, pipeline parallelism).
- Apply CUDA graph, TensorRT/TRT-LLM, and PyTorch compilation (torch.compile) optimizations.
- Perform cache system tuning (e.g., Mooncake, PagedAttention).
- Conduct performance bottleneck analysis and co-optimize GPU/TPU/custom accelerator workloads.
- Integrate model execution plans into end-to-end serving pipelines.
- Maintain code quality, documentation, and automated testing.

**Required Skills:**
- 3+ years of deep-learning inference, distributed systems, or HPC experience.
- Proficient in Python and C++/CUDA; familiarity with GPU programming (CUDA/Triton/TensorRT).
- Deep knowledge of transformer, large-language, vision, and diffusion model optimization.
- Experience with LLM inference frameworks (TensorRT-LLM, vLLM, SGLang, TGI).
- Knowledge of model quantization, KV cache systems, and distributed scheduling.
- Strong analytical, problem-solving, and performance-driven mindset.
- Excellent collaboration and communication skills.

**Nice-to-Have:**
- RDMA/RoCE, distributed filesystems (HDFS, Ceph), Kubernetes experience.
- Contributions to open-source inference projects.

**Required Education & Certifications:**
- Bachelor's degree (or higher) in Computer Science, Electrical Engineering, or related field.
- Certifications in GPU programming or distributed systems are a plus.
San Francisco, United States
On site
Junior
14-12-2025
Company Name
Together AI
Job Title
Security Engineer Intern (Summer 2026)
Job Description
**Job Title:** Security Engineer Intern

**Role Summary:** Develop and implement secure AI systems by designing enterprise-wide security solutions, building AI-driven security models, and collaborating with IT teams to enforce IAM best practices. Focus on safeguarding corporate assets through proactive and reactive security measures.

**Expectations:** Write maintainable code, lead IAM policy implementation, and contribute to AI-assisted security applications to enhance threat detection and response.

**Key Responsibilities:**
- Design and deploy security controls to protect AI infrastructure.
- Develop clean, efficient code for security tools and automation.
- Collaborate with IT to establish identity and access management (IAM) policies.
- Build AI models for data classification and security operations.
- Support cross-functional teams in maintaining security standards.

**Required Skills:**
- Proficiency in Python or Bash.
- Experience with AI-assisted application development.
- Strong understanding of security frameworks and threat mitigation.

**Required Education & Certifications:** Bachelor's degree (or equivalent) in Computer Science, Software Engineering, or related field, with graduation by Summer 2027. No certifications required.
San Francisco, United States
Hybrid
Fresher
06-01-2026