Skills

Conflict Resolution, Python, Java, JavaScript, C#, Go, Rust, TypeScript, Bash, SQL, Data Engineering, DevOps, Docker, Research, Attention to detail, Training, Linux, git, node.js, C++, JS

Job Specifications

Job Role : LLM - Full Stack Python + JS

Years of Experience : 6 to 7 years

Skill : Python, JavaScript / Node.js, TypeScript

Role Overview:

You’ll write and debug production-quality code, design rigorous evaluations, and build reproducible workflows that generate clean, high-signal data for model training. Attention to detail matters deeply here—small mistakes can cascade into misleading results, so precision and thoroughness are essential. You’ll also collaborate closely with engineers, researchers, and quality owners to align on standards, review work, and continuously raise the quality bar. If you enjoy solving unusual technical problems, investigating subtle model failures, and working in developer-like environments where correctness, reproducibility, and collaboration matter, this role will keep you very entertained.

What does your day-to-day look like:

Write, review, and debug code across multiple languages.
Design tasks and evaluation scenarios for coding, reasoning, and debugging.
Investigate LLM outputs and identify hallucinations, regressions, and failure modes.
Build reproducible dev environments using Docker + automation tools.
Develop scripts, pipelines, and tools for data generation, scoring, and validation.
Produce structured annotations, judgments, and high-quality datasets.
Run systematic evaluations that help improve model reliability and reasoning.

Required Skills :

Experience using LLM coding tools (Cursor, Copilot, CodeWhisperer).
Strong hands-on coding experience (professional or research-based) in one or more of: Python, JavaScript / Node.js, TypeScript. (Additional languages like Go, Java, C++, C#, Rust, SQL, R, Dart, etc. are a plus.)
Solid experience with Linux + Bash, scripting, and automation.
Strong with Docker, reproducible environments, and dev containers.
Advanced Git skills (branching, diffs, patches, conflict resolution).
Solid understanding of testing and QA (unit, integration, negative, edge-case focused).
Ability to reliably overlap with 8am–12pm PT.

Nice-to-Haves:

Experience using LLM coding tools (Cursor, Copilot, CodeWhisperer).
Experience with dataset creation, annotation, evaluation, or ML pipelines.
Familiarity with benchmarks like SWE-bench or Terminal-Bench.
Background in QA automation, DevOps, ML systems, or data engineering.

Who Thrives Here:

Engineers who enjoy breaking things and understanding why.
People who like designing tasks, running experiments, and debugging.
Detail-oriented folks who can spot subtle issues in code or model behavior.
Engineers who like building clean, reusable workflows rather than one-off hacks.

About the Company

Sourcebae is an AI-driven recruitment engine designed to hire top global talent. With its end-to-end hiring model (sourcing, vetting, hiring, and managing), Sourcebae is the ultimate AI-powered, all-in-one hiring platform for businesses of all sizes.

Our product includes:

AI interviewer
Global talent pool
Management and compliance
Global capability centre

Why Choose Sourcebae?

1) Efficient AI interview process: Sourcebae revolutionizes the interview process with its AI interviewer, ensuring a smooth and efficient experience f...