Design, build, and maintain the data products that support R&D, analytics, lab, and scientific workflows, from initial design through deployment and iteration •
Build and maintain data pipelines for large and complex datasets, from raw inputs through derived and analysis-ready datasets. •
Design scalable data models to power analytics, reporting, and downstream applications. Maintain high standards of data quality, accuracy, lineage, and observability across data pipelines. •
Optimize storage, retrieval, and lifecycle management for large scientific files (e.g., sequencing data, intermediate artifacts, derived datasets). •
Drive rapid prototyping efforts to support exploratory, proof-of-concept, and early-stage initiatives, while guiding the transition to production-grade systems. •
Implement best practices for data quality, validation, lineage, observability, and reproducibility to enable a trusted 360° view. •
Collaborate with product managers and domain experts to translate
Establish golden paths (templates, examples, docs) and contribute to shared data product catalogs, patterns, and best practices used by other engineers •
Requirements
Natera is seeking an experienced Senior Software Engineer with modern data engineering and AI-enabled development skills and a deep scientific R&D background to design and build data products that directly support genomics research and translational science.
The ideal candidate combines strong data engineering skills with a computer science background and hands-on
experience in bioinformatics, genomics, or computational biology, and the ability to work independently in an R&D environment.
Apply domain knowledge in genetics and bioinformatics to design data models, schemas, and abstractions that align with real research patterns and downstream analysis needs. •
Bachelor’s or Master’s degree in computer science or bioinformatics with healthcare or biotech data domain experience preferred •
8+ years of experience in data engineering, designing and maintaining data pipelines and cloud data architectures (e.g., Snowflake, AWS) •
Strong background in bioinformatics, genomics, or computational biology (required). Understanding of key genomics and bioinformatics data formats such as BAM, VCF, and FASTQ, common compression techniques for these formats, and their storage, delivery, and management needs. •
Demonstrated experience supporting scientific R&D, lab workflows, and research teams with production-grade data systems. •
Strong proficiency in Python, SQL, and distributed processing frameworks (Spark or equivalent) •
Experience with modern orchestration tools (Airflow, dbt, Dagster) •
Experience leveraging AI-assisted development tools (e.g., LLM copilots) to accelerate data solution development •
Familiarity with building data products that support analytics, ML, or AI applications •
Experience implementing CI/CD for data pipelines and IaC (Terraform, CloudFormation); knowledge of data observability, testing, and data quality frameworks •
Ability to evaluate emerging data and AI technologies and recommend scalable solutions •
Exposure to vector databases, embeddings, semantic search, or RAG-based architectures is a plus •
Proven ability to operate effectively in fast-paced environments, balancing speed, rigor, and compliance •
Strong written and verbal communication skills with ability to collaborate across engineering, analytics, and business stakeholders •
Experience working with healthcare, life sciences, or other highly regulated data, including hands-on HIPAA compliance.
Natera™ is a global leader in cell-free DNA (cfDNA) testing, dedicated to oncology, women’s health, and organ health.
Benefits
The pay range is listed and actual compensation packages are based on a wide array of factors unique to each candidate, including but not limited to skill set, years & depth of experience, certifications and specific office location.
Remote USA $125,000 to $156,300 USD
OUR OPPORTUNITY
Our aim is to make personalized genetic testing and diagnostics part of the standard of care to protect health and enable earlier and more targeted interventions that lead to longer, healthier lives.
Benefits include comprehensive medical, dental, vision, life and disability plans for eligible employees and their dependents.
Additionally, Natera employees and their immediate families receive free testing in addition to fertility care benefits. Other
benefits include pregnancy and baby bonding leave, 401k benefits, commuter
Please be advised that Natera will reach out to candidates with a @natera.com email domain ONLY.
Additional details
This role is intended for someone who already understands how research organizations operate, how genomic data flows from experiment to insight, and how to engineer data systems that accelerate discovery without compromising rigor or compliance.
You will also be comfortable moving quickly to prototype novel data products while ensuring solutions evolve into robust, compliant, and scalable platforms.
You will bring an internalized sense of what “good” looks like for research data: reproducibility, traceability, performance, and scientific usability.
Key Responsibilities
requirements while remaining usable for research. •
Partner closely with R&D scientists, bioinformatics teams, and software engineers to translate research needs into well-structured, reusable data assets. •
Provide technical guidance and mentorship to mid-level engineers •
Required
Strong data modeling expertise (dimensional, normalized, healthcare-specific schemas) •
Demonstrated ownership of production-grade data systems and end-to-end pipeline lifecycle •
This may differ in other locations due to cost of labor considerations.
The Natera team consists of highly dedicated statisticians, geneticists, doctors, laboratory scientists, business professionals, software engineers and many other professionals from world-class institutions, who care deeply for our work and each other.