Posted May 15
AI Researcher (Multimodal Audio/Video Generation)
at Tavus
San Francisco, United States (Hybrid)
Responsibilities
- Design models that are coupled with conversation flow, capturing and generating verbal and non-verbal signals in sync.
- Drive innovation in diffusion models, long-video generation, and audio-visual modeling.
Requirements
- We’re building AI Humans: a new interface that closes the gap between people and machines, free from the friction of today’s systems.
- AI Humans combine the emotional intelligence of humans with the reach and reliability of machines, making them capable, trusted agents available 24/7, in every language, on our terms.
- With Tavus, individuals, enterprises, and developers can all build AI Humans to connect, understand, and act with empathy at scale.
- The Role: We’re hiring a Senior AI Researcher to lead research in audio-visual avatar generation.
- This role is for someone who thrives in ambiguity, has a track record of pushing generative models to new frontiers, and wants to define what human–AI interaction looks like in practice.
- Translate research into production by partnering with Applied ML and engineering.
Experience
You’ll Bring:
- A PhD or equivalent research experience, plus 2–3+ years of hands-on experience applying generative models at scale.
- Expertise in diffusion models and awareness of the latest efficiency techniques.
- Experience in multimodal generation spanning video, audio, and language.
- Proven innovation in long-video generation and/or audio generation.
- Excellent programming skills: fluent in PyTorch and GPU-optimized workflows.
- Track record of publications in top-tier venues (CVPR, NeurIPS, BMVC, ICASSP, etc.).
- Experience leading research activities or mentoring teams.
Nice-to-Haves:
- Skills in 3D graphics, Gaussian splatting, or large-scale training setups.
- Broad exposure to generative AI models beyond your specialty.
- Familiarity with software development best practices.
Contact
- About us: Tavus (https://www.tavus.io/) is a research lab pioneering human computing.
Additional details
- Our real-time human simulation models let machines see, hear, respond, and even look real—enabling meaningful, face-to-face conversations.
- A fleet of medical assistants that can give every patient the attention they need.
- We’re a Series A company backed by world-class investors including Sequoia Capital, Y Combinator, and Scale Venture Partners.
- Be part of shaping a future where humans and machines truly understand each other.
Your Mission 🚀
- Lead research efforts on audio-visual generation for avatars (Neural Avatars, Talking-Heads), with a focus on conversational settings.
- Mentor researchers, set research directions, and publish impactful work.
- Location: San Francisco (hybrid) or London (office opening soon) preferred.
- Remote work within the U.S. or Europe is considered for exceptional candidates.