ML Inference Engineer

Recruitment Consultant

Guy Williams

Contact Details

gw@eu-recruit.com +44 (0) 330 052 6078

Posted

30 days ago

Every day, billions of critical data points — from financial records to healthcare documents — are trapped in unstructured formats like PDFs and spreadsheets. We’re building the most accurate, scalable system to extract and transform that data into something machines can understand.

In just six months, we’ve gone from zero to seven figures in annual recurring revenue, powering data pipelines for hundreds of leading AI teams — from fast-moving startups to some of the world’s largest enterprises. With over 300 million documents parsed and strong backing from top-tier investors, we’re scaling fast — and looking for world-class engineers to join us.

What You’ll Work On

As a key engineer on our core team, you’ll:

Architect and implement high-performance inference systems to serve state-of-the-art AI models.
Optimize model serving infrastructure for speed, scalability, and reliability.
Apply advanced optimization techniques to push the limits of inference performance.
Collaborate with research to bring cutting-edge AI capabilities into production.
Build internal tooling and infrastructure to accelerate experimentation and deployment.

We’re Looking For Someone Who…

Thinks like an owner — you set a high bar, ship fast, and obsess over quality.
Builds with intent — you don’t just flag problems, you dive in and fix them.
Knows the stack — deep Python and PyTorch experience, with strong systems fundamentals (multi-threading, memory, networking, storage).
Has real-world inference chops — experience with tools like vLLM, TGI, TensorRT-LLM, and Optimum, and confident in building custom tools for optimization and testing.

Why Join?

Join a rocketship with real traction and serious momentum.
Work on technically ambitious problems with a highly talented, no-BS team.
Have massive impact early in the company’s journey: your work will directly shape our product and future.