Posted Apr 30
AI Engineer - Public Sector
at Unstructured
Global, Remote
Requirements
- Unstructured is defining the standard for enterprise data transformation in the age of LLMs and generative AI.
- Our open-source toolkit has been downloaded 61M+ times and is used by 90% of the Fortune 1000.
- We power production AI workflows across commercial and federal sectors — transforming PDFs, HTML, Word docs, images, emails, and more into AI-ready data pipelines that scale.
- We're not just building tools, we're building the backbone of generative AI and the infrastructure that unlocks intelligence across industries.
- We are looking for an AI Engineer who thrives at the intersection of R&D and production-grade software engineering.
WHAT YOU'LL OWN & DRIVE
You will be a high-agency individual contributor, owning the lifecycle of AI solutions from initial research to AWS deployment.
- 50% Building & Shipping: Design and implement production-grade RAG pipelines and agentic workflows using Python.
WHAT WE'RE LOOKING FOR
We are looking for a self-directed engineer who excels in high-stakes, ambiguous environments. You have a track record of moving AI models out of notebooks and into production environments where latency, cost, and accuracy are treated as first-class citizens.
- Technical Resourcefulness: You are comfortable working in restricted or air-gapped environments. You don't require constant oversight to identify the right tool for a job, whether it's a specific vector database or a custom multimodal pipeline.
- A "Generalist" Mindset: While you specialize in AI, you understand the full stack. You are as comfortable discussing embedding strategies as you are configuring AWS GovCloud infrastructure or debugging a FastAPI endpoint.
MUST-HAVES
- Proven experience deploying production RAG pipelines against real-world, messy datasets.
- Deep expertise in agentic system design (tool use, multi-agent orchestration).
- Strong Python engineering skills: writing clean, scalable, and maintainable code.
- Experience operating within AWS/GovCloud environments.
NICE-TO-HAVES
- Experience fine-tuning NLP or object detection models.
- Familiarity with LLM evaluation frameworks (hallucination detection, drift monitoring).
- Knowledge of government security standards, and experience working in different classification environments and on-prem deployments.
- Security Clearance: Existing Secret/TS clearance or eligibility is a significant plus.
YOUR TECHNICAL TOOLKIT
- Languages: Python (expert-level), SQL
- LLM & Agentic Frameworks: LangChain, LangGraph, CrewAI, or similar orchestration frameworks
- RAG Stack: Retrieval with vector databases (Pinecone, Weaviate, Chroma, pgvector), graph databases (Neo4j), Elasticsearch, BM25, and Sentence-Transformers; NLP enrichment with spaCy, GLiNER, and Transformers; optimization using embedding models, reranking pipelines, and DSPy
- Evaluation & Observability: RAGAS, DeepEval, Arize Phoenix, and synthetic