Posted 4 days ago
Machine Learning Operations Engineer (MLOps)
at CodeRoad
Latin America, United States
On-site
Responsibilities
- Model Deployment & Integration: Design and implement scalable, secure, and production-grade pipelines for deploying agentic AI models.
- Optimize for performance, scalability, and reliability using CI/CD platforms.
- Use specialized tools such as Langtrace, AgentOps, or AWS CloudWatch to detect and resolve model drift, latency, and bias issues in real time.
- Facilitate iterative development and deployment of agentic AI solutions.
Requirements
The Team
- We are seeking a skilled and innovative Machine Learning Operations (MLOps) Engineer with a focus on Agentic AI to design, deploy, and maintain scalable, robust, and ethical autonomous AI systems.
- The ideal candidate will combine deep expertise in modern MLOps practices with a solid understanding of agentic AI principles, enabling the seamless integration, monitoring, and optimization of AI models that exhibit autonomous decision-making and adaptability.
- You will be a key contributor in our cross-functional teams, ensuring our agentic AI solutions are reliable, efficient, and aligned with our business goals and ethical standards.
Key Responsibilities
- Cloud Infrastructure Management: Build and maintain robust cloud infrastructure on Google Cloud Platform (GCP) or Amazon Web Services (AWS) for the entire AI lifecycle.
- Leverage services like GCP's Vertex AI and Cloud Functions, or their AWS equivalents such as Amazon SageMaker and AWS Lambda, to create efficient and resilient environments.
- Automation & CI/CD: Develop and maintain automated workflows for continuous integration, continuous deployment (CI/CD), and continuous training (CT) of agentic AI models.
- Monitoring & Performance Optimization: Implement and manage advanced monitoring systems to track the performance, health, and decision-making accuracy of agentic AI models in production.
- Collaboration: Work closely with AI researchers, data scientists, software engineers, and product teams to align MLOps processes with project goals.
- Data & Model Governance: Establish and enforce robust data and model governance frameworks, ensuring data quality, security, and compliance with industry standards for all agentic AI systems.
- Experience: 4+ years of experience in MLOps, DevOps, or a related field, with at least 1 year focused on deploying and managing AI/ML models in production.
- Experience with agentic or autonomous AI systems is highly preferred.
- Experience with either Google Cloud Platform (GCP) or Amazon Web Services (AWS).
- Knowledge of relevant services such as GCP's Vertex AI, Cloud Storage, BigQuery, and Cloud Functions or AWS equivalents like Amazon SageMaker, S3, Redshift, and Lambda.
- Technical Stack: (1 year or less) Strong knowledge of MLOps tools and frameworks (PyTorch, LangGraph, CrewAI, n8n). Proficiency in containerization with Docker and orchestration with Kubernetes.
- Programming & Scripting: Expertise in Python and familiarity with scripting for automation (e.g., Bash, Terraform). Strong experience with version control systems, particularly Git.
- Experience with modern monitoring tools like Langtrace, AgentOps, Prometheus, Grafana, or AWS CloudWatch.
- Proven ability to track model performance, data drift, and system health in a production environment.
- Security Mindset: A strong understanding of security principles related to cloud and MLOps, including Identity and Access Management (IAM), data encryption, and secure pipeline design.
- Ethical AI Knowledge: Understanding of ethical AI principles, including bias detection, explainability, and compliance with regulations like GDPR or other relevant standards.
- Collaboration & Communication: Strong interpersonal and communication skills, with the ability to work effectively in cross-functional teams and explain technical concepts clearly to diverse stakeholders.
- Education: Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field. Advanced degrees or certifications in MLOps, AI/ML, or cloud technologies are highly valued.
What you’ll love:
- 100% Remote
Experience
- Experience: 4+ years in MLOps, DevOps, or a related field.
Benefits
- Contractor position available for Latin American candidates
- Holidays Off
- Paid Time Off
- Health insurance assistance program
- Competitive Pay (USD)
Additional details
- At CodeRoad, we're more than just a software development company: we're your gateway to the global tech world.
- Whether you're looking to skill up or level up your career, we offer the challenges you’ve been searching for.
- We provide end-to-end software development services and give you the opportunity to work on exciting, real-world projects in a supportive environment.
- Whether it's staff augmentation, dedicated IT teams, or general software engineering, we have opportunities for everyone to challenge themselves and take their career to the next level!
- Position Location: Latam (Remote).
- Time Zone Requirements: This team operates on the East/West Coast time zones.
About the Role
- Focus on seamless integration with existing systems and enable real-time adaptability for autonomous decision-making.
- Excellent teamwork and work environment
- Training