Rethink Recruit

Founding Research Scientist - MLLM Training

Rethink Recruit  •  Seattle, WA (Onsite)  •  4 months ago

Job Description

About Nuance Labs

Nuance Labs is an early-stage deep tech startup building the first real-time human foundation model—a unified system across text, speech, and vision designed to make AI socially and emotionally intelligent.

We’re working toward AI that can read subtle human signals—a shift in tone, a glance, a pause—and respond in a way that feels natural and grounded in context. This is foundational work at the frontier of multimodal learning and real-time systems.

We’ve raised a $10M seed round backed by Accel, South Park Commons, Lightspeed, and top angels. Our team includes world-class researchers from MIT, UW, and Oxford with decades of experience at Apple and Meta shipping ultra-low-latency ML systems used by millions.

Why This Role Exists

Large multimodal models are advancing quickly, but real-time, human-centered interaction remains unsolved. Training models that can reason across text, speech, vision, and embodied signals while operating under tight latency constraints requires new approaches to architecture, data, and optimization.

This role exists to own and define how multimodal large language models are trained inside a broader human foundation model. As a Founding Research Scientist, you’ll set technical direction, design training strategies, and turn research ideas into systems that can operate in the real world.

This is a blank-page role with real agency. You’ll decide what problems matter, how we tackle them, and how research translates into working, scalable models.

What You’ll Be Building

You’ll help build the first human foundation model that operates across text, speech, facial expression, and body language in real time.

Your work will power systems that:

  • Understand fine-grained human signals across modalities and infer meaning in context

  • Reason autoregressively over multimodal inputs in real time

  • Drive lifelike avatars whose expressions, gestures, and tone evolve frame-by-frame during interaction

The field is wide open. Existing solutions treat language, voice, and vision as separate problems. This role offers the rare chance to define how these modalities are trained and unified at the foundation-model level.

What You’ll Own

You’ll operate as a founding-level researcher with end-to-end ownership over MLLM training and evaluation.

You will:

  • Design and train multimodal large language models and autoregressive architectures

  • Own the full ML pipeline, from dataset design and preprocessing to large-scale training and benchmarking

  • Develop training strategies that balance quality, generalization, and real-time performance

  • Push research breakthroughs into practical, production-oriented systems

  • Explore new architectures, objectives, and scaling strategies for multimodal reasoning

  • Write clean, maintainable research code that enables rapid iteration

  • Collaborate closely with researchers across speech, vision, and systems engineering

Who Will Thrive Here

You’re comfortable operating at the research frontier and making progress without a playbook. You care deeply about model behavior, but you’re equally motivated by getting things to work outside the lab.

You likely:

  • Enjoy blank-page research problems and defining technical direction

  • Move quickly from ideas to experiments to results

  • Think deeply about data, evaluation, and failure modes

  • Thrive in highly collaborative, cross-domain teams

Requirements

  • PhD or equivalent experience in multimodal LLMs, MLLM training, or closely related fields

  • Deep expertise in training large-scale autoregressive models

  • Strong command of modern deep learning and distributed training systems

  • Experience running the full ML lifecycle, from data curation to evaluation

  • Ability to translate research insights into practical systems

  • Strong coding skills and a commitment to clean, maintainable research code

  • Clear communication and strong collaboration skills

Nice to Have

  • Publications at top ML or multimodal AI conferences

  • Experience with real-time or low-latency ML systems

  • Prior work unifying language, vision, and/or speech models

  • Experience shipping large ML systems into production

Why Join Now

Joining Nuance Labs now means defining the training foundation of a category-defining AI system. You’ll have outsized influence over core research decisions, work in-person with a world-class team, and help solve one of the hardest problems in AI: real-time, multimodal human interaction.

About Rethink Recruit

At Rethink Recruit, we combine an old-school work ethic with a modern approach to recruitment. Our competitive edge comes from our niche focus on Autonomous Driving, EVs, AI, Robotics, and Blockchain (FinTech), and from our ability to adapt to new and emerging technologies.

For our clients, this focus means access to our talent pool of tens of thousands of pre-qualified, industry- and skill-specific candidates with whom we have developed close working relationships over the past decade.

For our candidates, it means spending less time on the job search by engaging with more industry-specific companies and more skill-specific jobs.

We believe that any agency can find marginal success by deploying a host of modern technologies and recruitment tools for outreach ($$$). Yet, what separates the good agencies from the great is how relevant their outreach is and how well they utilize their tools. We help bridge that gap by harnessing the power of modern technology to build long-standing relationships with thousands of diverse and incredibly talented people.

We care about what we do and the people we work with. Coupled with our high-level understanding of the technology and its applications, this keeps us at the forefront of current market trends. So whether you are a candidate seeking a new role or a company looking to retain talent, please reach out to us. We look forward to working with you!

Industry: HR & Recruiting
Company Size: 1-10 employees
Headquarters: Los Angeles, CA
Year Founded: 2020