Rethink Recruit

Founding Research Scientist - Speech Synthesis

Rethink Recruit  •  Seattle, WA (Onsite)  •  4 months ago

Job Description

About Nuance Labs

Nuance Labs is an early-stage deep tech startup building the first real-time human foundation model—a unified system across text, speech, and vision designed to make AI socially and emotionally intelligent.

We’re working toward AI that understands subtle human signals—a shift in tone, a hesitant pause, a quirked eyebrow—and responds in a way that feels genuinely human. This is foundational work at the intersection of speech, multimodal learning, and real-time systems.

We’re backed by a $10M seed round from Accel, South Park Commons, Lightspeed, and top angels, and our team includes world-class researchers from MIT, UW, and Oxford with decades of experience at Apple and Meta, shipping ultra-low-latency ML systems used by millions.

Why This Role Exists

Speech is at the core of human interaction—and it’s the backbone of truly human AI. While today’s voice systems have made progress on prosody and naturalness, real-time, emotionally grounded, multimodal speech generation remains unsolved.

This role exists to own and push the frontier of speech synthesis inside a broader human foundation model. As a Founding Research Scientist, you’ll help define how speech models are trained, evaluated, and integrated into a real-time system that unifies voice, language, and expression.

This is a blank-page role with real agency. You’ll help decide what problems matter, how we approach them, and how research turns into systems that actually work in the world.

What You’ll Be Building

You’ll help create the first human foundation model that operates across text, speech, facial expression, and body language in real time.

Your work will contribute to systems that:

  • Understand fine-grained human signals, from vocal nuance to subtle changes in expression

  • Generate lifelike, responsive speech that adapts frame-by-frame to context and emotion

  • Power real-time avatars whose voice, tone, and expression evolve naturally in interaction

This is a rare opportunity to shape foundational technology in a space where the boundaries are still being defined.

What You’ll Own

You’ll operate as a founding-level researcher with end-to-end ownership over speech synthesis research and its path to production.

You will:

  • Design, train, and evaluate state-of-the-art speech synthesis and audio generation models

  • Own the full ML pipeline, from data wrangling and rapid prototyping to large-scale training and benchmarking

  • Push research breakthroughs into practical, real-time systems

  • Explore new architectures and training strategies for expressive, low-latency speech generation

  • Write clean, maintainable research code that supports fast iteration

  • Collaborate closely with researchers across vision, language, and multimodal modeling

Who Will Thrive Here

You’re someone who loves frontier research—but you also care deeply about whether things actually work. You’re comfortable with ambiguity, motivated by unsolved problems, and excited to chart your own course.

You likely:

  • Enjoy blank-page research problems and setting your own technical direction

  • Move quickly from ideas to experiments to results

  • Care about both model quality and real-world constraints like latency and stability

  • Thrive alongside other highly driven, deeply technical collaborators

Requirements

  • PhD or equivalent experience in speech synthesis, audio generation, or closely related fields

  • Deep expertise in training speech or audio models (e.g., TTS, speech-to-speech, neural vocoders)

  • Strong command of modern deep learning methods and large-scale training workflows

  • Experience running the full ML lifecycle, from dataset construction through evaluation

  • Ability to translate research insights into working systems

  • Strong coding skills and a commitment to clean, maintainable research code

  • Clear communication and strong collaboration skills

Nice to Have

  • Publications at top ML, speech, or audio conferences

  • Experience with real-time or low-latency ML systems

  • Prior work on multimodal models involving speech, vision, or language

  • Experience shipping ML systems used by real users

Why Join Now

Joining Nuance Labs now means shaping the core research direction of a company tackling one of the hardest problems in AI: real-time, emotionally intelligent human interaction.

You’ll have outsized ownership, direct influence on foundational systems, and the chance to work in-person with a world-class team that blends frontier research with product-grade engineering. If you want your research to define a new category—not just incrementally improve an existing one—this role offers that opportunity.

About Rethink Recruit

At Rethink Recruit, we combine an old-school work ethic with a modern approach to recruitment. Our competitive edge comes from our niche focus on Autonomous Driving, EVs, AI, Robotics, and Blockchain (FinTech), and from our ability to adapt to new and emerging technologies.

For our clients, this focus provides access to our talent pool of tens of thousands of pre-qualified, industry- and skill-specific candidates with whom we have developed close working relationships over the past decade.

For our candidates, it means spending less time on the job search by engaging with more industry-specific companies and more skill-specific jobs.

We believe that any agency can find marginal success by deploying a host of modern technologies and recruitment tools for outreach ($$$). Yet, what separates the good agencies from the great is how relevant their outreach is and how well they utilize their tools. We help bridge that gap by harnessing the power of modern technology to build long-standing relationships with thousands of diverse and incredibly talented people.

We care about what we do and the people we work with. That care, coupled with our high-level understanding of the technology and its applications, keeps us at the forefront of current market trends. So whether you are a candidate seeking a new role or a company looking to retain talent, please reach out to us. We look forward to working with you!

Industry: HR & Recruiting
Company Size: 1-10 employees
Headquarters: Los Angeles, CA
Year Founded: 2020