ML Inference Engineer

Recruitment Consultant: Guy Williams
Posted: 7 days ago

Every day, billions of critical data points — from financial records to healthcare documents — are trapped in unstructured formats like PDFs and spreadsheets. We’re building the most accurate, scalable system to extract and transform that data into something machines can understand.

In just six months, we’ve gone from zero to seven figures in annual recurring revenue, powering data pipelines for hundreds of leading AI teams — from fast-moving startups to some of the world’s largest enterprises. With over 300 million documents parsed and strong backing from top-tier investors, we’re scaling fast — and looking for world-class engineers to join us.

What You’ll Work On

As a key engineer on our core team, you’ll:

  • Architect and implement high-performance inference systems to serve state-of-the-art AI models.

  • Optimize model serving infrastructure for speed, scalability, and reliability.

  • Apply advanced optimization techniques to push the limits of inference performance.

  • Collaborate with research to bring cutting-edge AI capabilities into production.

  • Build internal tooling and infrastructure to accelerate experimentation and deployment.

We’re Looking For Someone Who…

  • Thinks like an owner — you set a high bar, ship fast, and obsess over quality.

  • Builds with intent — you don’t just flag problems, you dive in and fix them.

  • Knows the stack — deep Python and PyTorch experience, with strong systems fundamentals (multi-threading, memory, networking, storage).

  • Has real-world inference chops — experience with tools like vLLM, TGI, TensorRT-LLM, and Optimum, and confident in building custom tools for optimization and testing.

Why Join?

  • Join a rocketship with real traction and serious momentum.

  • Work on technically ambitious problems with a highly talented, no-BS team.

  • Have massive impact early in the company’s journey: your work will directly shape our product and future.

Contract Type: Permanent
Location: United States
Work Model: On-Site
