Job Specifications
About Misti AI
Misti AI is building real-time Vision AI Agents that keep workers safe in mining and heavy industry. Our system monitors high-risk operations using computer vision, distributed edge computation, and cloud orchestration, preventing accidents and delivering immediate, measurable impact in the physical world.
We are operating at the frontier where AI meets the harshest industrial environments, deploying real-time inference across unreliable networks, extreme conditions, and mission-critical workflows.
We are scaling rapidly across the world, with engineering leadership based in London.
If you want to work on AI that actually touches reality, this is the place.
Role: Cloud & Edge AI Systems Engineer
This role owns the production reliability, scalability, security, and performance of our Vision Agentic Platform across both AWS and Nvidia Jetson devices. This is core systems engineering for AI, not academic ML research.
You will build the infrastructure that allows our computer vision models to run in real time on edge devices, coordinate with the cloud, and deliver safety-critical insights in operational mines.
Location: London preferred; open to exceptional candidates worldwide (remote).
Experience: 4+ years in relevant systems/infrastructure engineering roles preferred.
What You’ll Be Doing
1. Cloud Architecture & Distributed Systems (AWS-first)
You will design and own the cloud backbone that powers our AI agents:
Architect and scale AWS services (EKS/ECS, networking, autoscaling, serverless components).
Build real-time ingestion pipelines for video and event streams from multiple mining sites.
Use AWS SageMaker and model registries to manage deployment-ready models.
Implement Infrastructure as Code (Terraform/CloudFormation).
Ensure reliability across unstable networks and high-throughput video workloads.
2. Edge AI Deployment on Nvidia Jetson
You’ll bring our models to life at the edge:
Package and optimize CV models with ONNX/TensorRT for low-latency inference.
Manage secure OTA updates for remote Jetson fleets in harsh industrial settings.
Develop telemetry, logging, and device health monitoring for distributed heterogeneous devices.
Handle bandwidth-constrained, intermittently connected environments.
3. CI/CD, Observability & Runtime Reliability
You’ll ensure the system works end-to-end:
Build automated CI/CD pipelines (GitHub Actions) for both cloud and edge deployments.
Implement robust observability: metrics, logs, tracing, alerting, and incident response.
Debug across containers, networks, devices, and cloud systems under real operational pressure.
Ensure systematic documentation of architecture, deployment flows, and security controls.
4. Collaborate With ML Engineers
This is not research; you will productionize and maintain deployed models:
Deploy and operate models in live environments using SageMaker registries or similar tools.
Implement monitoring for model drift, latency, failure modes, and per-device performance.
Manage large fleets of distributed devices with consistent logging, updates, and inference quality.
Shape decisions around model deployment strategy and hardware constraints.
5. Security, Compliance & Data Protection
You will protect sensitive CCTV and worker footage:
Implement encryption at rest and in transit using AWS KMS.
Define and enforce IAM access controls and least-privilege boundaries.
Set up AWS Security Hub, CloudTrail, and GuardDuty for continuous monitoring.
Handle data retention and compliance requirements (e.g., GDPR or local equivalents).
Ensure edge-to-cloud integrity, secure updates, and auditability.
What We’re Looking For
Core Hard Skills
Deep experience with AWS (EKS/ECS, networking, IAM, KMS, autoscaling, security).
Strong background in distributed systems or backend infrastructure at scale.
Proven experience deploying models to edge devices (Jetson strongly preferred).
Strong CI/CD expertise with automated pipelines (GitHub Actions or similar).
Hands-on IaC experience (Terraform or CloudFormation).
Strong Python and/or Go/Node fundamentals; Linux-native mindset.
Nice to Have
Experience with ONNX, TensorRT, model optimization, or quantization.
Exposure to AWS IoT Greengrass or similar edge orchestration stacks.
Experience with safety-critical, robotics, or embedded real-time systems.
Prior work with GDPR/PII-sensitive environments or regulated industries.
GPU-aware systems engineering.
Soft Skills (Startup-Critical)
Bias to action: you build, iterate, debug, and ship.
Resourcefulness: comfortable navigating ambiguous problems across cloud + edge.
Clarity: excellent documentation and communication of architecture and decisions.
Resilience: comfortable owning infrastructure in a high-stakes, fast-moving environment.
Why Join Misti AI
You will build AI that touches the real world
Not dashboards or theoretical models, but vision agents deployed in mines, preventing accidents.
You will own the system end-to-end
Fro