Posted Mar 25
Senior Data Platform Engineer
at Aspora
Bangalore, India
On-site
Responsibilities
- Own cluster management, autoscaling policies, and resource governance across Databricks workspaces.
- ETL / ELT Pipeline Engineering: Design and build robust, idempotent, and testable data pipelines handling batch and near-real-time workloads.
- Implement and maintain CDC pipelines (Debezium, Kafka Connect, or native DB replication) ensuring low-latency, high-fidelity data propagation.
- Own the evolution of our analytical warehouse/lakehouse stack: performance benchmarking, cost modelling, and technology selection.
- Build and maintain efficient data serving layers for dashboards, ML feature stores, and reverse ETL use cases.
- Implement data retention, archival, and lifecycle management policies across hot/warm/cold storage tiers.
- Define and enforce data platform engineering best practices: code standards, CI/CD for pipelines, automated testing, and observability.
- Build internal tooling and libraries that make data engineers faster: reusable Spark utilities, pipeline templates, local dev environments.
- Champion data reliability engineering: lineage tracking, incident response playbooks, pipeline SLO monitoring, and root cause analysis.
- Own backend design and execution, solving complex engineering problems at scale.
Requirements
- You'll work closely with analytics, ML, and product engineering teams, setting the bar for reliability, performance, and data quality across the platform.
Tech Stack

| Area | Tools |
| --- | --- |
| Compute | Apache Spark, Databricks, PySpark, Scala |
| Orchestration | Apache Airflow, dbt |
| Ingestion & CDC | Debezium, Kafka, Kafka Connect |
| Storage | Delta Lake, Iceberg, S3/GCS, Snowflake |
| Languages | Python, SQL, Scala |
| Observability | Great Expectations, OpenLineage, Monte Carlo |

What We're Looking For
- 5+ years of data engineering experience, with 2+ years on large-scale big data platforms.
- Hands-on expertise with Apache Spark: performance tuning, partitioning, broadcast joins, execution plans.
- Deep Databricks experience: workspace configuration, Unity Catalog, Delta Live Tables, or equivalent.
- Solid Apache Airflow experience, and experience implementing CDC pipelines (Debezium, Kafka Connect, or DMS).
- Strong proficiency in Python and SQL.
- Experience designing analytical data models for large datasets (star schema, wide tables, aggregation layers).
- Track record of building reliable, observable, and testable pipelines in production.
- Experience with modern data lake technologies such as Delta Lake or Apache Iceberg, including compaction, time travel, and schema evolution.
- Experience building and operating streaming data pipelines using Apache Spark Structured Streaming, Apache Flink, or Kafka Streams.
- Proficiency with dbt for data transformations and lineage management.
- Experience working with cloud data infrastructure on Amazon Web Services, Google Cloud Platform, or Microsoft Azure.
- Familiarity with infrastructure-as-code tools such as Terraform or AWS CloudFormation.
- Experience owning data platform reliability end-to-end, including monitoring, alerting, and building self-healing systems.
- A strong data-as-a-product mindset, with emphasis on clear contracts, versioned schemas, SLOs, and well-documented datasets.
- A bias toward automation, proactively reducing operational toil by building scalable frameworks and tooling.
- Solid engineering fundamentals, including writing testable code, participating in rigorous code reviews, and maintaining high standards for operational
Benefits
- Competitive ESOPs: align your growth with Aspora’s long-term vision.
- Health insurance, strong leave policies, and career growth opportunities in a high-impact startup.
Additional details
About Aspora
- People on the move deserve a bank that moves with them.
- Since 2022, Aspora has been building a borderless financial operating system that makes money as mobile and transparent as its users.
- Aspora is backed by influential venture capital firms such as Sequoia Capital, Greylock Partners, Hummingbird Ventures, Y Combinator, and Global Founders Capital.
- We're a team of 150+ across India, the UK, the UAE, the EU, and the US, working with extreme ownership, radical candour, and an obsession with customer impact.
- We celebrate builders who question assumptions, ship fast, and turn regulatory complexity into elegant solutions.
- If you’re driven to redefine what global banking can be, we’d love to build the future with you.
About the Role
- We're building the data infrastructure that powers decisions across every part of our business, from real-time analytics to large-scale batch computation.
- As a Senior Data Platform Engineer, you'll own the systems that process billions of events, move data reliably, and make insights fast to produce.
- Architect and maintain lakehouse solutions (Delta Lake, Iceberg) including partitioning strategies, Z-ordering, and compaction jobs.
- Drive platform-level improvements: query optimisation, caching strategies, compute–storage separation, and shuffle tuning.