Fractile

Sr ML Runtime Engineer

On site

London, United Kingdom

Mid level

Full Time

18-11-2025

Skills

Python, Rust, Machine Learning, PyTorch

Job Specifications

At Fractile, we’re taking a revolutionary approach to computing to run the world’s largest language models 100x faster than existing systems. Our fast-growing team is working at the cutting edge of the latest AI developments in both hardware and software. Want to get involved?

We are looking for Senior ML Runtime Engineers with experience of key ML software ecosystem components to work on the runtime stack of our ground-breaking AI accelerators. You can be based in either our London office or Bristol; the choice is yours.

In This Role, You Will

Develop the Rust runtime for Fractile’s innovative AI acceleration hardware
Integrate that runtime with key open source projects like PyTorch, vLLM, and SGLang
Work with hardware, lower-level software, and ML engineers using a highly collaborative hardware-software co-design methodology

It Would Be Great If You Have

Proven experience of working with major ML software ecosystem projects
A good understanding of the latest ML workloads and inference deployment challenges
Excellent Rust and Python skills and solid experience of industry standard development tools and technologies
A creative and innovative mindset, and a willingness to take ownership and drive results in a fast-paced environment
A degree in Computer Science, Electronic Engineering, Maths, Physics, or a related field, and 5+ years of industry experience

You May Also Have

Experience of working with GPUs or other machine learning accelerators
Previous experience in a startup or small team environment

About the Company

Fractile is building chips to run large language models two orders of magnitude faster. Existing hardware is good for training LLMs, but very poorly suited to subsequent inference of the trained model, which is increasingly the dominant workload. A network’s weights need to be moved onto a chip once per word generated, and this movement takes a few hundred times longer than the subsequent computations themselves. Fractile’s revolutionary approach to fusing computation with memory eliminates this bottleneck, and can scale to ...