Posted Feb 12
ML Infrastructure Engineer
at Tennr
On-site
Role Description
As the first and founding ML Operations Engineer at Tennr, you’ll play a crucial role in building and iterating on foundational Machine Learning and AI systems. You’ll own building machine learning training and inference pipelines that can handle increasing traffic demands and proliferation of product surface as we grow. You will be critical in ensuring our AI-driven healthcare platform is powered by robust, scalable, and efficiently deployed models. Our Machine Learning team owns and develops multiple in-house, proprietary VLMs, LLMs, and other models that are purpose-built for the ambitious problems we are solving in the healthcare space. You’ll make impactful contributions and influence fundamental elements of our ML and data systems, expanding Tennr’s ability to rapidly iterate and solve critical problems for patients and providers.
Responsibilities
- Develop and maintain infrastructure that supports efficient ML operations, including data pipelines, model evaluations, deployments, and training at scale.
- Collaborate closely with ML engineers, software engineers, and cross-functional teams to ensure seamless integration of models with data pipelines and products.
- Troubleshoot production issues and continuously improve systems to enhance performance and efficiency.
- Create tooling for online and offline evaluation of ML & LLM systems.
Requirements
- Experience in ML model deployment, infrastructure, and scaling in production environments
- Strong software engineering fundamentals, with proficiency in Python and TypeScript
- Experience in software design and architecture for highly available ML systems for use cases like inference, evaluation, and experimentation
- Strong knowledge of observability, including logging, metrics, tracing, model performance monitoring, and alerting
- Experience with distributed systems, reliability, and production incident response
- Comfortable working in ambiguity with high ownership, moving quickly in a fast-paced startup environment, and proactively driving projects from idea to production
Nice to have:
- Experience working with ML CI/CD and common ML frameworks like PyTorch, TensorFlow, etc.
- Experience working with common inference frameworks like vLLM, TensorRT, Triton, etc.