engineering
Posted 4 days ago
Software Engineer, Sensor Data Integration
at Mach9
San Francisco, United States
On-site
Responsibilities
- Build CI/CD pipelines and automated checks that guarantee the correctness and consistency of data pipelines, including regression detection on dataset processing.
- Build and maintain an agentic harness for automated dataset triage and code patching.
- Automatically propose or apply fixes, and escalate when human judgment is needed.
- Work closely with ML and product teams to make data readily usable for training, inference, and visualization.
- Work closely with customers and data-provider partners to facilitate data integration (with occasional travel).
- Puzzle-hunting: work with data formats that have sparse or missing documentation.
Requirements
- Experience building production systems in Python.
- Solid foundation in distributed systems and parallel computing.
- Comfort operating with ambiguity: able to dig into undocumented or messy data formats, reverse-engineer how they work, and make steady progress without a clear spec.
- Experience building agentic systems and setting up agent harnesses: orchestrating LLM-driven workflows for triage, debugging, or automated code patching.
- Strong communication and collaboration skills, with the ability to work across ML, product, and customer-facing teams.
- Bachelor's degree in Computer Science, Engineering, or equivalent experience.
Bonus qualifications
- Understanding of geospatial data formats (e.g., LAS/LAZ, COPC, E57, GeoTIFF, Shapefiles) and tooling (e.g., GDAL, PDAL, untwine, laz-perf).
- Expertise designing and managing data schemas and storage systems for geospatial data (e.g., Postgres/PostGIS, AWS S3).
- Experience with large-scale data processing frameworks and cloud platforms (e.g., Spark, AWS Batch).
- Familiarity with coordinate reference systems and transforms (CRS, WKT, pyproj, affine transforms).
- Experience operating data pipelines that feed ML training and inference.
- Familiarity with C++.
Additional details
The role
At Mach9, Sensor Data Integration Engineers build the algorithms and pipelines that transform large-scale geospatial datasets into structured, accessible formats to power our survey product, Digital Surveyor. You'll work with high-volume data sources (LiDAR-collected point clouds, on-road imagery, overhead aerial orthophotos) and own the systems that ingest, standardize, and store them for training and product use. Every piece of data that our customers upload will pass through your systems first.
This role is ideal for an engineer who loves puzzle-hunting: reverse-engineering sparsely documented formats, wrangling coordinate systems and transforms, and hunting down strange camera projection issues. You'll sit at the boundary between our customers and our product. This role sits at the front of everything we do: our models are only as good as the data feeding them, and you'll be the one making messy real-world sensor data trustworthy at scale.
Responsibilities
- Develop and maintain scalable, reproducible workflows for ingesting and processing large volumes of point cloud, imagery, and geospatial data.
- Convert datasets from various sensor providers into Mach9's standardized internal formats.
- Optimize processing performance, query speed, and storage efficiency across large geospatial datasets.
- Work closely with the customer success team to efficiently resolve issues and unblock customer projects.