- Company Name
- iBrain Technologies, Inc
- Job Title
- Data Scientist
- Job Description
-
Job Title: Data Scientist
Role Summary: Design, develop, and deploy large‑scale AI solutions with a focus on Retrieval‑Augmented Generation (RAG) and advanced language models. Lead the end‑to‑end model lifecycle, from research to production, in a multi‑repo CI/CD environment, ensuring high‑quality, scalable deployments on cloud platforms.
Expectations:
- Demonstrated success in AI/ML research and production at a senior level (8‑10+ years).
- Proficiency in architecting RAG systems, LLM integrations, and semantic search pipelines.
- Proven ability to optimize prompts, mitigate hallucinations, and engineer reinforcement learning workflows.
- Leadership in cross‑functional Agile teams, delivering robust APIs and automated deployments.
Key Responsibilities:
- Build and fine‑tune Transformer‑based models (BERT, RoBERTa, T5, OPT, BLOOM, GPT‑3.5, ChatGPT) for conversational and semantic search.
- Design and maintain RAG pipelines: embedding generation, knowledge‑base indexing, document retrieval, and LLM‑augmented responses.
- Create and manage CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI) for model training, validation, and production deployment on AWS (primary) and Azure.
- Collaborate with product, data engineering, and DevOps to translate business requirements into scalable ML solutions.
- Conduct code reviews, enforce best practices, and mentor junior engineers.
- Monitor model performance; address bias, hallucination, and prompt drift; and iterate quickly in production.
Required Skills:
- Master’s or Ph.D. in CS, AI, ML, NLP, or related field.
- 8‑10+ years of AI/ML experience; advanced coding in Python.
- 2+ years with LLMs (ChatGPT, GPT‑3.5, OPT, BLOOM) and RAG implementation.
- Deep knowledge of Transformer architectures (BERT, RoBERTa, T5).
- Expertise in conversational/semantic search, prompt engineering, few‑shot/zero‑shot techniques, chain‑of‑thought, and hallucination mitigation.
- Strong DevOps skills: version control, CI/CD, containerization, API design, and multi‑repo orchestration.
- Experience deploying models in AWS (primary) and Azure.
- Familiarity with Jenkins, GitHub Actions, GitLab CI, and code review tools.
- Agile teamwork, stakeholder communication, and documentation skills.
Required Education & Certifications:
- Master’s degree (preferred) or Ph.D. in Computer Science, Artificial Intelligence, Machine Learning, or related discipline.
- Relevant certifications in cloud platforms (AWS Certified Machine Learning – Specialty, Azure AI Engineer) are a plus.