infrastructure
Posted 1 week ago
Member of Technical Staff, DevOps
at Vapi
San Francisco, United States; Remote
Responsibilities
- Land a quality-of-life improvement to the deploy pipeline.
- 60 Day: Own progressive delivery end to end (canary, automated rollback, soak) for at least one critical service path.
- Ship the first version of cell-creation tooling or preview environments.
- Establish SLAs and a feedback loop with engineering teams.
- Own the developer-platform roadmap and partner with Infra and SRE on cell creation, multi-region rollouts, and oncall tooling.
Requirements
- Voice AI that resolves, not transfers.
- Vapi engineers are your users, and the deploy pipeline, preview environments, cell-creation tooling, and oncall tooling are products with SLAs, docs, and feedback loops.
- You’ll own progressive delivery (canary, blue/green, automated rollback, soak periods), the GitOps story across multiple clusters and regions, and the on-demand environment tooling that’s on the Q3 roadmap.
WHAT YOU’LL DO
- 30 Day: Get fluent in the Pulumi stacks, the ArgoCD setup, and the GitHub Actions pipelines.
WHO YOU ARE
Must-haves
- You have a platform-as-a-product mindset: you treat internal engineers as customers, with SLAs, docs, and feedback loops, not tickets and ad-hoc help.
- You’ve operated Pulumi (TypeScript) or Terraform at scale (40+ stacks, multi-region) and you’ve felt the pain when IaC sprawl gets ahead of you.
- You’ve run ArgoCD or equivalent GitOps for deploying applications across multiple clusters.
- You can describe a real rollout that automated rollback caught.
- You’ve designed CI/CD pipelines (GitHub Actions preferred) for many services and Dockerfiles, not just one repo.
- You’ve built deploy tooling for on-demand environments: preview envs, dev deployments, or cell creation.
Nice-to-haves
- You’ve written Go for platform services (Vapi’s canary-manager is Go).
- You’ve operated developer platforms at a mid-stage infra-heavy company or a DevEx team at a larger shop.
Tech stack you’ll work in
- Languages: TypeScript (primary, for Pulumi and tooling), Go (for canary-manager and platform services), Bash.
- IaC: Pulumi (TypeScript) at scale (40+ stacks across regions), Terraform.
- GitOps and deploy: ArgoCD (multi-cluster), GitHub Actions, 15+ Dockerfiles.
- Progressive delivery: canary, blue/green, automated rollback, soak periods (canary-manager Go service).