ML Ops Infrastructure Engineer

at Deepgram

United States, Remote

Terraform, Pulumi, Python, Kubernetes, Docker, Prometheus, Grafana

Requirements

COMPANY OVERVIEW
Deepgram is the leading platform underpinning the emerging trillion-dollar Voice AI economy, providing real-time APIs for speech-to-text (STT), text-to-speech (TTS), and building production-grade voice agents at scale.
COMPANY OPERATING RHYTHM
At Deepgram, we expect an AI-first mindset: AI use and comfort aren’t optional, they’re core to how we operate, innovate, and measure performance.
Every team member at Deepgram is expected to actively use and experiment with advanced AI tools, and even to build their own into their everyday work.
We measure how effectively AI is applied to deliver results, and consistent, creative use of the latest AI capabilities is key to success here.
Candidates should be comfortable adopting new models and modes quickly, integrating AI into their workflows, and continuously pushing the boundaries of what these technologies can do.
Additionally, we move at the pace of AI.
THE OPPORTUNITY
Getting a model from a research notebook to a production API serving millions of requests is one of the hardest problems in AI.
As an ML Ops Infrastructure Engineer at Deepgram, you will own the critical bridge between research and production -- building the pipelines, deployment systems, and testing infrastructure that take models from experimental to battle-tested at scale.
Your work ensures that every model improvement our research team makes can be safely, quickly, and reliably delivered to the customers who depend on Deepgram's APIs for real-time voice AI.
WHAT YOU'LL DO
- Design and build CI/CD pipelines specifically tailored for ML model development, validation, and deployment
- Architect and maintain model deployment pipelines that move models from research environments through staging to production with confidence
- Build A/B testing infrastructure that enables controlled rollouts of new models and measures real-world performance impact
- Implement comprehensive monitoring for model performance in production -- accuracy metrics, latency, drift detection,

experience in MLOps, DevOps, or infrastructure engineering with a focus on ML systems
- Strong proficiency in Python and experience building automation and tooling for ML workflows
- Deep experience with CI/CD systems and building pipelines for software and model delivery
- Hands-on experience with Docker and Kubernetes for containerized workload management
- Practical experience deploying and serving ML models in production environments
- Familiarity with model evaluation, validation, and quality assurance processes
- Understanding of monitoring and observability principles as applied to ML systems
- Strong problem-solving skills and a bias toward automation over manual processes

IT WOULD BE GREAT IF YOU HAD
- Experience with model serving frameworks such as NVIDIA Triton Inference Server, TensorRT, or ONNX Runtime
- Background in speech, audio, or real-time media ML systems
- Experience with Infrastructure as Code tools such as Terraform or Pulumi
- Hands-on experience with monitoring and observability stacks (Prometheus, Grafana, Datadog, or similar)
- Familiarity with GPU-accelerated inference optimization and profiling
- Experience with feature stores, data versioning, or ML metadata management
- Knowledge of canary deployment strategies and progressive delivery for ML models

BENEFITS & PERKS
If you're looking to work on cutting-edge technology and make a significant impact in the AI industry, we'd love to hear from you! Deepgram is an equal opportunity employer.

Benefits

HOLISTIC HEALTH
- Medical, dental, vision benefits
- Annual wellness stipend
- Mental health support
- Life, STD, LTD Income Insurance Plans

WORK/LIFE BLEND
- Unlimited PTO
- Parental leave
- Flexible schedule
- 12 Paid US company holidays
- Quarterly personal productivity stipend
- One-time stipend for home office upgrades
- 401(k) plan with company match
- Tax Savings Programs

CONTINUOUS LEARNING
- Learning / Education stipend
- Participation in talks and conferences
- Employee Resource Groups
- AI enablement workshops / sessions

*For candidates
Backed by prominent investors including Y Combinator, Madrona, Tiger Global, Wing VC and NVIDIA, Deepgram has raised over $215M in total funding.

Additional details

More than 200,000 developers and 1,300+ organizations build voice offerings that are ‘Powered by Deepgram’, including Twilio, Cloudflare, Sierra, Decagon, Vapi, Daily, Cresta, Granola, and Jack in the Box.
Deepgram’s voice-native foundation models are accessed through cloud APIs or as self-hosted and on-premises software, with unmatched accuracy, low latency, and cost efficiency.
Backed by a recent Series C led by leading global investors and strategic partners, Deepgram has processed over 50,000 years of audio and transcribed more than 1 trillion words.
There is no organization in the world that understands voice better than Deepgram.
Change is rapid, and you can expect your day-to-day work to evolve just as quickly.
This may not be the right role if you’re not excited to experiment, adapt, think on your feet, and learn constantly, or if you’re seeking something highly prescriptive with a traditional 9-to-5.
benefits are administered locally and governed by country-specific regulations.
Because of this, benefits will differ by region; in some cases international employees receive benefits US employees do not, and vice versa.
As we scale, we will continue to evaluate where we can create more alignment, but a 1:1 global benefits structure is not always legally or operationally possible.
