Research Engineer, Voice

at Inflection AI

United States, On-site

PyTorch

Responsibilities

Research, develop, and optimize neural models for voice and audio, including text-to-speech, automatic speech recognition, audio generation, and spoken dialogue systems.
Build and maintain production-grade training and inference pipelines for voice models, with close attention to latency, naturalness, and scalability.
Run experiments end-to-end: data curation, model architecture design, training, evaluation, and ablation studies.
Collaborate with ML engineers, product teams, and infrastructure to integrate voice models into Pi’s real-time conversational stack.
Develop robust evaluation frameworks combining perceptual metrics, automated benchmarks, and user-facing quality signals.

Inflection AI is a Public Benefit Corporation empowering people with human-centered, emotionally intelligent AI.
We’re shaping the future of AI by combining emotional intelligence (EQ) and raw intelligence (IQ) to elevate people’s potential.
Inflection AI created Pi, the world’s first emotionally intelligent AI, to help people work through decisions, emotions, and challenges.
Pi is a personal AI agent powered by Inflection AI’s foundation model, proving that AI can be personal, empathetic, and contextually aware.

About the Role

You’ll collaborate closely with ML engineers, product teams, and infrastructure to turn cutting-edge ideas in areas like neural audio codecs, diffusion-based TTS, and multimodal foundation models into the natural, expressive voice experiences that millions of Pi users interact with every day.

What You’ll Do

Explore and apply advances in neural audio codecs, diffusion-based synthesis, streaming architectures, and multimodal foundation models to improve Pi’s voice experience.

Requirements

experience (including graduate work) in audio, speech, or multimodal ML.
Strong proficiency in PyTorch and hands-on experience training and debugging large-scale neural models on GPU/accelerator clusters.
Solid understanding of audio and speech fundamentals: spectrograms, mel features, vocoders, codec-based representations, and signal processing.
Demonstrated ability to take a research idea from prototype to production: equally comfortable reading papers and writing efficient, CUDA-aware training loops.
