**Company:** Tether.io
**Job Title:** AI Research Engineer (Model Evaluation)
**Role Summary:**
Design, build, and maintain evaluation frameworks for AI models across the pre‑training, fine‑tuning, and inference stages. Define essential performance metrics, curate benchmark datasets, develop automated pipelines, and produce dashboards that inform product, engineering, and operations teams. Ensure models meet accuracy, latency, throughput, and memory requirements at scale.
**Expectations:**
- Conduct research‑driven evaluations of advanced model architectures, including multimodal and resource‑efficient variants.
- Translate evaluation findings into actionable recommendations for deployment and optimization.
- Communicate results clearly to technical and non‑technical stakeholders and influence model development lifecycles.
**Key Responsibilities:**
- Develop end‑to‑end evaluation pipelines for pre‑training, post‑training, and inference.
- Define and track key metrics: accuracy, loss, latency, throughput, memory footprint.
- Curate high‑quality evaluation datasets and establish standardized benchmarks aligned with business goals.
- Build automated CI/CD pipelines and integrate monitoring dashboards (Grafana, Kibana, Tableau, Looker).
- Analyze evaluation data, diagnose performance bottlenecks, and propose optimizations.
- Collaborate with product, engineering, data science, and operations to align metrics and incorporate stakeholder feedback.
- Produce regular reports, visualizations, and presentations for cross‑functional teams.
- Stay abreast of industry benchmarks and evaluation methodologies.
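
To illustrate the kind of metric instrumentation the responsibilities above involve, here is a minimal sketch of latency, throughput, and memory profiling using only the Python standard library (the function name and model stand-in are hypothetical, not part of any existing framework):

```python
import time
import tracemalloc
from statistics import mean

def profile_inference(model_fn, inputs, warmup=2):
    """Measure mean latency, throughput, and peak memory for a callable.

    model_fn is a hypothetical stand-in for a model's inference call;
    in practice it would wrap a PyTorch/TensorFlow/JAX forward pass.
    """
    # Warm-up runs are excluded from timing (caches, lazy initialization).
    for x in inputs[:warmup]:
        model_fn(x)

    tracemalloc.start()
    latencies = []
    for x in inputs:
        start = time.perf_counter()
        model_fn(x)
        latencies.append(time.perf_counter() - start)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    total = sum(latencies)
    return {
        "mean_latency_s": mean(latencies),
        "throughput_qps": len(inputs) / total if total else float("inf"),
        "peak_mem_bytes": peak,
    }
```

A real pipeline would swap the wall-clock and `tracemalloc` measurements for framework- and hardware-specific profilers, but the shape of the output dictionary is what a dashboard integration would consume.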
**Required Skills:**
- Deep understanding of ML research, large‑scale language and multimodal models, transformer architectures.
- Expertise in pre‑training, fine‑tuning, and inference evaluation; knowledge of efficient inference techniques.
- Proficiency in Python and PyTorch, TensorFlow, or JAX; experience with evaluation tooling (MLPerf, Hugging Face Evaluate).
- Strong data pipeline experience (Spark, Databricks, cloud data services).
- Advanced profiling skills (latency, throughput, memory).
- Dashboard development (Grafana, Kibana, Tableau, Looker) and API integration.
- Excellent analytical, problem‑solving, and communication skills.
- Ability to work cross‑functionally and present technical insights to non‑technical audiences.
**Required Education & Certifications:**
- Bachelor’s or higher degree in Computer Science, Electrical Engineering, Data Science, or related field.
- Master’s or PhD preferred.
- Machine‑learning certification (AWS ML, GCP ML, Azure ML, or equivalent) is a plus.