OUR SECTORS

At USA Tech Recruit, our sectors cover a wide range of industries within the field of technology.

Submit vacancy
Looking for tech jobs in Europe?
Visit European Tech Recruit
Looking for tech jobs globally?
Visit Tech Recruit




Client services

Learn about the client services we offer at USA Tech Recruit and browse through our success stories.


About us

Learn more about USA Tech Recruit's story, mission and values, meet our team, and read about our commitment to DE&I.


Site Reliability Engineer

Recruitment Consultant: Bliss Verna
Posted: 24 days ago

What we’re looking for

We need someone with 3+ years of experience in SRE, Production Engineering, or Infrastructure roles who has built and owned automation, observability, and tooling systems end-to-end in production. You should be comfortable working across a multi-cloud environment with strong distributed systems instincts and a track record of improving platform reliability and reducing operational burden. Bonus points if you have exposure to GPU/AI-ML infrastructure or accelerated compute workloads.

What you’ll do

  • Build and own the observability stack – dashboards, alerts, and distributed tracing using tools like OpenTelemetry, Prometheus, and Grafana – to provide high-granularity visibility into Mithril’s multi-cloud GPU orchestration platform

  • Define and implement SLIs and SLOs across Mithril’s API layer and internal orchestration services, partnering with Product and Platform teams to ensure new features are designed for operability from the start

  • Develop automation in Python (or Go) to eliminate repetitive operational tasks — from provider API reconciliation to automated health checks and capacity rebalancing

  • Maintain and extend Terraform/Pulumi modules and Kubernetes configurations to manage a growing multi-cloud provider footprint

  • Participate in on-call rotation, drive rigorous root cause analysis for production incidents, and implement durable fixes to prevent recurrence

  • Work directly with the founding engineering team to shape how infrastructure engineering operates as the company scales — this is a greenfield opportunity to build the playbook, not inherit a rigid system

Industry
Contract Type: Permanent
Location: United States
City: San Francisco
Work Model: On-Site

Apply Now

By applying to this role, you acknowledge that we may collect, store, and process your personal data on our systems.

For more information, please refer to our Privacy Notice.
