Job Specifications
We're looking for a Senior MLOps Engineer to architect and build our production ML infrastructure from the ground up. You'll be responsible for designing and implementing a multi-tenant platform that enables our data science team to deploy machine learning models at scale across multiple wastewater utility customers. This is a foundational role where you'll establish the patterns, practices, and infrastructure that will support dozens of production models serving critical utility operations.
Key Responsibilities
Design and implement multi-tenant ML model serving infrastructure that supports customer isolation, monitoring, and cost allocation.
Build CI/CD pipelines for automated model training, testing, validation, and deployment.
Establish data quality frameworks including validation, drift detection, and monitoring at scale.
Create model versioning, A/B testing, and rollback capabilities for production deployments.
Collaborate closely with data scientists to establish workflows that enable independent model deployment while maintaining quality and consistency.
Implement observability and monitoring systems for model performance, data quality, and infrastructure health.
Design and document architectural patterns and best practices for the ML platform.
Optimize infrastructure costs across multiple customer deployments.
Ensure security, compliance, and data isolation requirements are met in a multi-tenant architecture.
Bridge the gap between pilot/proof-of-concept systems and production-ready infrastructure.
Qualifications
8+ years of experience in MLOps, DevOps, or ML Infrastructure engineering.
Proven experience architecting and building ML platforms from scratch (0→1), not just maintaining existing systems.
Deep understanding of multi-tenant architecture patterns, including data isolation, security, and cost optimization.
Strong experience with containerization (Docker, Kubernetes) and orchestration for ML workloads.
Hands-on experience with at least one major cloud platform (AWS, Azure, or GCP) for production ML deployment.
Experience designing and implementing CI/CD pipelines for ML models.
Strong knowledge of data quality monitoring, model drift detection, and observability practices.
Proficiency in Python and infrastructure-as-code tools (Terraform, CloudFormation, etc.).
Experience working with the Python ML stack: PyTorch, scikit-learn, NumPy, and pandas.
Experience working closely with data scientists to enable their productivity and independence.
Excellent communication skills, including the ability to explain architectural decisions and tradeoffs to both technical and business stakeholders.
Bonus points for:
Experience with time-series data, SCADA systems, or edge computing.
Previous experience scaling ML systems from pilots to hundreds of production deployments.
Familiarity with water/wastewater utility operations or industrial control systems.
About the Company
Rivago Infotech Inc has been a leader in IT staffing and software development for over 5 years and is one of the largest diversity and development firms in the industry. We are known for our high-touch, customer-centric approach, offering our clients unmatched quality, responsiveness, and flexibility. Our clients value our streamlined execution, highly efficient service, and exceptional talent management, which go above and beyond traditional staffing services.