Oxford Data Plan

AI Systems Engineer

Remote

United Kingdom

Mid-level

Full-time

09-03-2026


Skills

Python GitHub CI/CD Docker Kubernetes Monitoring Risk Mitigation Regression Programming Databases React AWS Analytics Flask FastAPI Data Science LangChain CI/CD Pipelines Microservices GitHub Actions

Job Specifications

Oxford Data Plan is a fast-growing FinTech company providing alternative data and KPI tracking for 200+ listed companies worldwide. We help fundamental investors make better decisions with proprietary data insights. Founded in 2022, we've grown to over 70 people and are backed by leading investors.

We are hiring an AI Systems Engineer to design, build, and deploy end-to-end AI systems across the organisation—ranging from client-facing AI products to internal tools supporting data science, product, engineering, and revenue teams—on top of robust, scalable AWS infrastructure.

This is a hands-on role spanning AI system design, agent architectures, LLM engineering, cloud deployment, and AIOps/MLOps for production reliability. You will work on LLM applications, agentic workflows, RAG systems, ML pipelines, analytics automation, and microservices.

Key Responsibilities

Design, build, and operate end-to-end AI/LLM systems, including chatbots, analytics assistants, automation tools, and decision-support services.
Develop internal productivity and intelligence tools that accelerate workflows across data science, product, engineering, and revenue teams.
Build autonomous AI agents and workflow orchestrators using frameworks such as LangChain, CrewAI, ADK, or equivalent systems.
Design and implement LLM-backed microservices (FastAPI/Flask) for summarisation, intelligence, forecasting, data extraction, and API-driven reasoning.
Build and operate full Retrieval-Augmented Generation (RAG) pipelines: ingestion → chunking → embeddings → indexing → retrieval → LLM reasoning.
Optimise retrieval quality using metadata, hybrid search, chunking strategies, rerankers, and relevance tuning.
Implement document classification, NER, entity extraction, and knowledge-graph-driven retrieval where appropriate.
Establish reliability, safety, and governance guardrails across AI systems, including monitoring, error handling, tool-selection controls, and risk mitigation.
Instrument, monitor, and evaluate AI and RAG systems using logging, metrics, tracing, agent telemetry, quality benchmarks, hallucination testing, and regression tests.
Deploy and operate AI agents and LLM microservices on AWS (Bedrock, Lambda, ECS/EKS, API Gateway, S3, Secrets Manager, CloudWatch).
Build and maintain production CI/CD pipelines (GitHub Actions), manage model/version lifecycles, and support retraining and automated evaluation workflows.
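For candidates unfamiliar with the RAG flow named in the responsibilities above (ingestion → chunking → embeddings → indexing → retrieval → LLM reasoning), it can be sketched end to end. This is an illustrative toy, not the company's actual stack: the embedding step is a stand-in term-frequency vector rather than a real embedding model, and the final "LLM reasoning" step only assembles the prompt an LLM would receive.

```python
# Toy RAG pipeline: ingestion -> chunking -> embeddings -> indexing
# -> retrieval -> prompt assembly for LLM reasoning.
import math
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split a document into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Stand-in embedding: a lowercase term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class RagIndex:
    """In-memory index holding chunks alongside their vectors."""

    def __init__(self) -> None:
        self.chunks: list[str] = []
        self.vectors: list[Counter] = []

    def ingest(self, document: str) -> None:
        for c in chunk(document):
            self.chunks.append(c)
            self.vectors.append(embed(c))

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        qv = embed(query)
        scored = sorted(
            zip(self.chunks, self.vectors),
            key=lambda cv: cosine(qv, cv[1]),
            reverse=True,
        )
        return [c for c, _ in scored[:k]]


def answer(index: RagIndex, query: str) -> str:
    """Assemble the context-grounded prompt; the actual LLM call is elided."""
    context = "\n".join(index.retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"


index = RagIndex()
index.ingest(
    "Quarterly KPI estimates track revenue inflections for listed companies. "
    "Receipt panel data covers consumer spending across global markets."
)
print(answer(index, "receipt panel consumer spending"))
```

A production system would swap the term-frequency vectors for a real embedding model, the list scan for a vector database, and the prompt string for an LLM API call, but the data flow is the same.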

Required Skills & Qualifications

Key Skills

Strong software engineering background with expertise in Python, including modular design, async programming, and modern development practices.
Experience designing and building APIs and microservices (FastAPI / Flask) for production systems.
Hands-on experience building and operating production LLM systems, including agentic workflows and RAG pipelines.
Experience designing and operating RAG systems, including vector databases and retrieval pipelines.
Experience designing and running LLM evaluations, including task-level metrics, hallucination testing, regression benchmarks, and golden datasets.
Hands-on familiarity with one or more LLM observability and evaluation tools, such as OpenTelemetry, LangSmith, Weights & Biases, Arize/Phoenix, or equivalent in-house systems.
Experience deploying and operating AI systems on AWS (Bedrock, EC2, Lambda, ECS/EKS, API Gateway, S3, CloudWatch), with a focus on reliability, security, and cost-aware production usage.

Nice-to-Haves

Familiarity with Docker, Kubernetes, CI/CD, and continuous deployment in production environments.
Experience with search and retrieval systems such as AWS Kendra, OpenSearch, Weaviate, Qdrant, or Pinecone.
Ability to build simple internal-facing UIs or tools (React, Streamlit).
Experience building reusable SDKs, internal AI platforms, or shared developer frameworks.

Who You Are

You have 5+ years of overall experience in software engineering/ML engineering, with at least 2 years building GenAI systems in production.
You ship real production systems, not just prototypes.
You operate at the intersection of AI, engineering, and operations.
You think in systems: reliability, observability, cost, and scale.
You work independently, own problems end-to-end, and simplify complexity.
You prioritise safety, interpretability, and security in every AI system you build.

Please include a link to your GitHub/portfolio and examples of AI agents, RAG systems, or ML pipelines you have built and deployed.

About the Company

Oxford Data Plan delivers institutional-grade alternative data and daily KPI estimates for 250+ global equities, enabling hedge funds and asset managers to identify inflections ahead of consensus. We combine proprietary and exclusive datasets, including a global receipt panel and exclusive advertising agency partnerships, with multi-signal modeling to produce point-in-time estimates across 500+ KPIs. Our coverage spans TMT, consumer, financials, and real economy sectors, with daily delivery designed for systematic and fundamental investors.