Multimodal Generative AI Researcher
Location: Remote
About the Role
We’re looking for a Research Scientist with deep expertise in training and fine-tuning large Vision-Language and Language Models (VLMs / LLMs) for downstream multimodal tasks. You’ll help push the next frontier of models that reason across vision, language, and 3D, bridging research breakthroughs with scalable engineering.
What You’ll Do
What You Bring
Bonus / Preferred
Equal Employment Opportunity:
We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability, or other legally protected statuses.

Stability AI is the enterprise-ready creative partner for teams and creators, delivering professional-grade generative AI tools and solutions for media generation and editing across image, video, 3D, and audio to enable creative production at scale. Stability AI sparked the generative AI revolution with the release of Stable Diffusion in August 2022, putting generative technology in the hands of millions of creators globally and cementing its position as a leader in the field. Stable Diffusion models have since been downloaded more than 350 million times.
Stability AI has been recognized by Fortune as one of the 50 AI Innovators and by TIME as one of the Most Influential Companies, and Stable Audio was named to TIME’s Best Inventions list. In June 2024, Stability AI entered its next phase of growth with the appointment of a renowned leadership team: Sean Parker as Executive Chairman, Prem Akkaraju as CEO, and James Cameron as Board Member.
For press inquiries, contact press@stability.ai. For customer support, contact support@stability.ai.