infrastructure
Posted 1 week ago
Site Reliability Engineer
at Unstructured
Global, Remote
Requirements
- Unstructured is defining the standard for enterprise data transformation in the age of LLMs and generative AI.
- Our open-source toolkit has been downloaded 61M+ times and is used by 90% of the Fortune 1000.
- We power production AI workflows across commercial and federal sectors — transforming PDFs, HTML, Word docs, images, emails, and more into AI-ready data pipelines that scale.
- We're not just building tools; we're building the backbone of generative AI and the infrastructure that unlocks intelligence across industries.
- As part of the Infra team, you'll work on the reliability and performance of our platform end-to-end.
- WHAT YOU'LL OWN & DRIVE: Own production reliability across our Knative, KEDA, and Kubernetes-based document processing platform.
- You've written HPA configs, KEDA ScaledObjects, PodDisruptionBudgets, preStop hooks, and PriorityClasses; you understand pod lifecycle and scheduler behavior.
- Demonstrated experience diagnosing and resolving real production performance issues: resource saturation, timeout failures, scheduling problems, graceful shutdown gaps.
- Enough Python or Go to read service code, trace a bug to root cause, and write a targeted fix.
- WHY YOU'LL LOVE IT HERE: You'll be surrounded by smart, kind, low-ego people who genuinely enjoy building together.
Benefits
- From medical, dental, and vision coverage effective the 1st of the month following your start date, life and disability insurance, unlimited PTO, and flexible parental leave, to a 401(k) with company match, equity, a $500 work-from-home stipend, $70/month internet reimbursement, and team/company offsites throughout the year, we want you focused on building, growing, and staying energized for the long haul.
Additional details
- ABOUT THE ROLE: The infra team is small, technically deep, and owns the full stack from cloud provisioning and k8s operators to workflow orchestration and core services.
- We ship frequently and operate at a scale that makes reliability a first-class engineering problem.
- The work is technical and hands-on: you'll write code, dig into k8s internals, and hold yourself accountable to positive production outcomes.