Posted Nov 19, 2025
Staff Data Engineer
at H1
New York, United States
On-site
Responsibilities
- Design, build, and optimize large-scale data pipelines (hundreds of TBs) for performance, reliability, and cost efficiency.
Requirements
- At H1, we believe access to the best healthcare information is a basic human right.
- To accomplish this, our teams harness the power of data and AI technology to unlock groundbreaking medical insights and convert those insights into actions that result in optimal patient outcomes and accelerate an equitable and inclusive drug development lifecycle.
- ABOUT YOU: You’re a hands-on Staff IC and technical leader who thrives in complex data environments.
- You bring clarity to ambiguity, turn messy problems into reliable systems, and operate with a strong sense of ownership and impact.
- You collaborate effectively across functions, help others move faster, and are comfortable working across the full data and infrastructure stack.
- You have a proven track record of leading large, complex technical projects from concept to production.
- You bring deep
- Experience with large-scale data processing (e.g., Spark/PySpark on EMR or similar) or scalable distributed backend systems, with the ability to quickly deepen expertise in our data stack (PySpark, EMR, Hudi/Delta).
- Strong proficiency in SQL, including writing and optimizing complex queries over large datasets.
- Strong programming experience in Python (or a modern language with the ability to quickly ramp up in Python).
- Experience designing systems or large-scale datasets/pipelines with attention to performance, reliability, and maintainability.
- Hands-on experience with modern engineering workflows and tooling such as Git, JIRA, and CI/CD systems (e.g., CircleCI).
- Comfort deploying and troubleshooting distributed workloads in cloud environments such as AWS EMR or Kubernetes.
- Experience with workflow orchestration or job scheduling tools (e.g., Airflow, Argo).
- Demonstrated ability to independently drive complex, cross-team technical initiatives and influence stakeholders without formal authority.
- Experience with streaming/messaging technologies (e.g., Kafka, Kinesis) is a nice to have.
- Background in RWE, healthcare data, or other complex/regulated data domains is preferred.
Experience
- 8+ years as a software, data, or backend engineer building and operating scalable, production-grade systems.
Benefits
- This promotes health equity and builds needed trust in healthcare systems.
- Own the end-to-end architecture for critical data assets, ensuring solutions are scalable, reliable, and aligned with H1’s long-term vision.
- Experience using AI-assisted coding tools (e.g., GitHub Copilot, Claude Code) to accelerate development while maintaining quality is encouraged.
- COMPENSATION: This role pays $170,000 to $190,000 per year, based on experience, in addition to stock options.
Additional details
- Our mission is to provide a platform that can optimally inform every doctor interaction globally.
- Visit h1.co to learn more about us.
- Data Engineering is responsible for the development and delivery of our most important asset, our data.
- Across thousands of data sources globally, the team ensures that only accurate, normalized data flows to our customers, at the speed required to match real-world changes.
- As we expand the markets we serve and increase the breadth and depth of data we capture, we need senior technical leaders who can drive execution, scalability, and architectural excellence.
- You’ll drive some of H1’s most visible data initiatives and help reduce bottlenecks across teams, providing critical technical leadership support during US hours.
- You will:
- Act as a self-starter who drives execution independently, taking ownership and initiative with minimal need for day-to-day direction.
- Partner with Product, Data Science, and downstream engineering teams to align priorities, manage dependencies, and deliver high-value outcomes.
- Represent engineering in cross-functional forums, shaping roadmaps and reducing reliance on senior leadership for day-to-day decisions.
- Develop deep domain expertise and mentor other engineers, helping raise the technical bar and influence the evolution of our data products.
- experience building and evolving large-scale data architectures, pipelines, or distributed systems.