- Build and optimize models using a variety of input data types, including tabular data, natural language, point clouds, and images, in support of fraud detection and identity verification use cases.
- Own data quality and integrity for critical datasets, implementing monitoring, validation checks, and anomaly detection to ensure reliable input to models and downstream decision systems.
- Collaborate closely with Product, Engineering, and Risk teams to define data requirements, shape roadmap priorities, and deliver insights that guide strategic decisions for fraud and identity products.
- Conduct in-depth research to explore new data sources and develop novel algorithms and features that advance the state of the art in fraud detection, identity resolution, and risk scoring.
Requirements
You will also leverage emerging approaches, including agentic AI and LLM-powered systems, to automate data analysis, accelerate insight generation, and scale how we evaluate identity data and detect fraud patterns.
- Leverage and build agentic AI and LLM-powered systems to automate data exploration, anomaly detection, vendor evaluation, and investigative workflows, increasing the speed and depth of insight generation.
- Lead the end-to-end ML/analytics lifecycle for assigned projects: problem definition, data exploration, feature engineering, modeling, evaluation, deployment handoff, and post-deployment monitoring where applicable.
- Stay current with advancements in AI, machine learning, and data infrastructure (including LLMs and agentic frameworks), and apply innovative techniques to real-world fraud and identity problems.
WHAT YOU BRING
- Master’s or PhD in Computer Science, Statistics, Applied Mathematics, Data Science, or a related quantitative field; or equivalent professional experience.
- 5+ years of experience in data science, machine learning, or closely related roles, ideally in a high-growth tech or fintech environment.
- Experience in fraud prevention, risk modeling, or identity verification, including working with noisy, adversarial, or high-risk data environments.
- Proven experience working with large, messy, real-world datasets to generate insights and drive measurable business impact (not limited to pure model development).
- Experience working with diverse data modalities, such as tabular data, text/language, point clouds, and images, and selecting appropriate modeling approaches for each.
- Strong proficiency in Python and SQL, with hands-on experience using major ML libraries/frameworks (e.g., PyTorch, TensorFlow, scikit-learn) for model development and evaluation.
- Deep understanding of machine learning algorithms, model evaluation techniques (e.g., AUC, lift, calibration, stability), and data pipeline development for both batch and near-real-time use cases.
- Experience building and maintaining data pipelines and workflows in distributed or large-scale environments (e.g., Spark, Airflow, Databricks, or similar technologies).
- Demonstrated ability to evaluate and work with third-party data vendors or external datasets, including designing tests for data quality, coverage, stability, and incremental lift over existing signals.
- Experience with LLMs and agentic AI frameworks/infrastructure (e.g., LangChain, LangGraph, Ray) is strongly preferred; ability to design or extend agentic workflows for analytics and data quality use cases is a plus.
- Demonstrated ability to proactively deliver complex outcomes, lead technical workstreams, mentor others, and influence cross-functional decisions without formal authority.
- Excellent written and verbal communication skills, with the ability to translate complex data problems and model behavior into actionable business insights for both technical and non-technical audiences.
WHY SOCURE?
Socure is building the identity trust infrastructure for the digital economy, verifying 100% of good identities in real time and stopping fraud before it starts.
The mission is big, the problems are complex, and the impact is felt by businesses, governments, and millions of people every day.
We hire people who want that level of responsibility.
People who move fast, think critically, act like owners, and care deeply about solving customer problems with precision.
If you want predictability or narrow scope, this won’t be your place.
If you want to help build the future of identity with a team that holds a high bar for itself — keep reading.
ABOUT THE ROLE
We are seeking a highly analytical and impact-driven Senior Data Scientist to join our Data Science Data team at Socure.
In this role, you will work at the intersection of data, fraud risk, and identity verification, transforming raw, complex datasets into actionable insights that directly improve our products and decisioning systems.
You will own high-impact projects end to end: designing scalable data pipelines, building and evaluating models, and leading analytical deep-dives that shape how we use data to detect fraud and validate identity.
This is an advanced individual-contributor role (IC4 / Senior) that requires deep technical expertise, strong business judgment, and alignment with Socure’s leadership competencies, including continuous learning, effective communication, accountability, team development, decision making, and managing change.