- Company Name
- Chemify Limited
- Job Title
- Senior Data Scientist - AI/ML (CADD)
- Job Description
**Role Summary**
Design, develop, and deploy state‑of‑the‑art machine learning models for computer‑aided drug design (CADD). Translate cutting‑edge AI research into production‑ready solutions for property prediction, reaction modeling, binding affinity prediction, and generative chemistry. Lead end‑to‑end AI projects, build scalable MLOps pipelines, and collaborate with chemists, medicinal chemists, and engineers to accelerate drug discovery.
**Expectations**
- 5+ years of industry or academic experience in computer science, machine learning, or computational chemistry/biology.
- Proven track record of moving complex AI/ML solutions from research to deployment.
- Strong leadership, communication, and teamwork skills in multidisciplinary settings.
**Key Responsibilities**
- Architect and implement generative models (Transformers, GNNs, Diffusion Models) for synthetic route planning and molecule generation.
- Build scalable MLOps pipelines for preprocessing large chemical/biological datasets, training, evaluation, and monitoring.
- Translate research advances into practical solutions for ADMET/QSAR modeling, reaction prediction, and binding affinity prediction.
- Design experiments to assess the chemical validity, novelty, and synthesizability of generated molecules, and the predictive accuracy of models.
- Communicate model insights and strategy to technical and non‑technical stakeholders.
- Stay current with AI for drug discovery, foundation models, and multimodal learning; identify opportunities to enhance the platform.
**Required Skills**
- Proficiency in Python and PyTorch or TensorFlow; deep knowledge of Transformers, GNNs, VAEs, GANs, and Diffusion Models.
- Experience with large‑scale molecular datasets (SMILES, 3D conformers) and biological data (protein sequences, assay data).
- Model optimization techniques: LoRA, quantization, distillation, pruning.
- Experience with GPU‑accelerated and distributed training in scalable computing environments.
- Version control with Git; strong communication with cross‑functional teams.
**Required Education & Certifications**
- MSc or PhD in Computer Science, Machine Learning, Computational Chemistry/Biology, or related field.
- No specific certifications required, but relevant ML/GNN coursework is preferred.