Posted May 4
Senior AI Engineer
at Fieldguide
San Francisco, United States
Remote
Requirements
ABOUT THE ROLE
Fieldguide is building AI agents for the most complex audit and advisory workflows. As a Senior AI Engineer, you'll own meaningful product areas end-to-end: designing agentic architectures, building evaluation systems, and shipping agents that professionals trust with mission-critical work. This role is for engineers who have shipped LLM-powered features in production and are ready to take the lead on complex systems while mentoring those around them.
WHAT YOU’LL OWN
BUILD AND SHIP AI AGENTS
- Design and build agentic systems that automate complex audit workflows end-to-end
- Translate customer problems into concrete agent behaviors and orchestration logic
- Orchestrate LLMs, tools, retrieval, and business logic into reliable, production-grade agent experiences
- Own agents across their lifecycle: delivery, reliability, performance, and observability
EXECUTE WITH AI-NATIVE LEVERAGE
- Use AI to accelerate design, build, test, and iteration cycles
These principles resonate with you:
- Bias to building: You move fast and resolve uncertainty by shipping
- AI-native instincts: You treat LLMs, agents, and automation as core building blocks
- Strong product judgment: You decide what matters and why, not just how to implement it
- Learning velocity: You learn quickly from feedback and adjust based on data
- Grounded optimism: You improve what's broken today and push toward what's possible next
- End-to-end ownership: You understand production systems and
EXPERIENCE
We care more about capability and trajectory than years on a resume, but most strong candidates have:
- 3–6+ years shipping production software in complex, real-world systems
- Strong command of TypeScript, Python, and Postgres
- Shipped LLM-powered features serving real production traffic
- Built retrieval pipelines and agent orchestration systems
- Implemented evaluation frameworks for model outputs and agent behavior
- Worked with vector databases, embedding models, and RAG architectures
- Experience with modern LLM APIs (OpenAI, Gemini, Anthropic) and agent frameworks
- Comfortable operating in ambiguity and taking responsibility for outcomes
WHAT SHOULD EXCITE YOU
- Enterprise-grade reliability: Building systems professionals depend on
- Human-in-the-loop design: Knowing when to automate vs. when to surface decisions
- Nuanced evaluation: Audits require judgment, so feedback structures matter
- Explainability: Making AI outputs and reasoning transparent and trustworthy
- Complex domains: Navigating compliance and enterprise rigor while moving fast
- Shipping daily value: Delivering agent experiences customers use every day