Posted 5 days ago
Lead Data Engineer
at HackerRank
India, Hybrid
Responsibilities
- Own and evolve the data platform - StarRocks (OLAP), Apache Hudi (Data Lake), Trino, Spark, and Apache Ranger - ensuring performance, reliability, and security at scale.
- Build the next-gen AI-optimised data layer: clean, structured datasets that power natural language querying and AI add-on features for HackerRank for Work customers.
- Own in-product data features - exports, insights dashboards, interview analytics, and the self-serve Custom Reports interface.
- Enforce robust data security - access controls, Apache Ranger policies, and confidence-scoring guardrails for AI-generated outputs.
- Lead technical design reviews and define engineering standards for the data team.
- Enable self-service pipelines for internal teams (AI platform, analytics, go-to-market), reducing ad-hoc data requests and scaling data access across the org.
- Partner with PMs and business stakeholders to proactively identify and scope AI-enabled data use cases.
About the role
Software has entered an era where humans and AI build side by side. HackerRank's data platform is at an inflection point. We've completed a multi-year modernisation - migrating from Redshift to StarRocks + Apache Hudi - and cut export latencies from 25 seconds to under 5 seconds. Now we're building the AI-native data layer that will power revenue-generating features like natural language querying for HackerRank for Work customers.
As Lead Data Engineer, you'll be a senior individual contributor at the heart of the data organisation - owning complex platform decisions, collaborating cross-functionally with AI, product, and go-to-market teams, and shipping data-driven features that directly drive revenue.
Requirements (Who you are)
- Deep hands-on expertise with OLAP databases - StarRocks, ClickHouse, Druid, or similar.
- Strong experience with data lake technologies - Apache Hudi, Iceberg, or Delta Lake.
- Proficient with distributed query engines (Trino / Presto) and batch/streaming compute with Apache Spark.
- Solid understanding of data security, RBAC, and access control tools like Apache Ranger.
- Comfortable working in a hybrid AWS + open-source self-managed environment.