Posted Feb 23
Software Engineer, Infrastructure
at Exa
Singapore, Singapore
On-site
Requirements
- Exa is an applied AI lab building a search engine unlike anything the world has ever seen.
- If you want to build massive-scale ML systems that will define the way the new AI world consumes information, this is the place for you.
- That could mean building GPU cluster orchestration in Kubernetes, map-reduce batch jobs on Ray, or the best observability tooling in the world.
- Experience designing and operating large-scale infrastructure: GPU clusters, large Kubernetes clusters, or cloud batch job systems
- You bring an obsessive mindset, always thinking about reliability, observability, and optimization across the entire stack.
WHAT YOU'LL DO
- Build the Kubernetes orchestration on a $20M GPU cluster
- Scale our AWS batch job system to handle map-reduce jobs over tens of thousands of machines
- Design GPU scheduling software so we max out our cluster utilization
- Build observability into our production systems
LOGISTICS
- Location: This is an in-person opportunity in Singapore.
Benefits
- We now power search for Cursor, Cognition, HubSpot, and over 400,000 developers, and have raised $350M from Lightspeed, Benchmark, and a16z.
- While we cannot guarantee your visa, we have historically been successful in sponsoring candidates from all over the world.
- If you receive an offer, our team will work hard to get you a visa.
- Benefits: We offer premium healthcare benefits (medical, dental, vision), fertility benefits, 16 weeks of fully paid parental leave for all new parents, and a monthly wellness stipend to all of our employees.
Additional details
- We build massive-scale infra to crawl the entire web, train state-of-the-art embedding models to process it, and design extremely high-performance vector databases to retrieve over it.
- Our ultimate goal is to build perfect search over all the world's information, far beyond Google.
- Our Infrastructure Team builds the underlying tooling and infrastructure that power all of Exa's systems.
- Basically, we need more infra engineers to build the machine that builds the machine so that we can move as fast as possible as an engineering org.