Job Description
The Seed Multimodal Interaction and World Model team is dedicated to developing models that boast human-level multimodal understanding and interaction capabilities. The team also aspires to advance the exploration and development of multimodal assistant products.
We are looking for talented individuals to join us for an internship in 2026. PhD internships at our Company give students the opportunity to actively contribute to our products and research, and to the organization's future plans and emerging technologies.
Our dynamic internship experience blends hands-on learning, enriching community-building and development events, and collaboration with industry experts.
Applications will be reviewed on a rolling basis - we encourage you to apply early. Please state your availability clearly in your resume (Start date, End date).
Responsibilities:
- Design and implement reinforcement learning (RL) training systems for large-scale multimodal foundation models
- Develop unified modeling frameworks that integrate video, audio, and language, with a focus on visual latent reasoning
- Explore RL-based approaches to bridge understanding and generation for multimodal visual reasoning
- Collaborate with researchers to evaluate models on tasks involving world modeling, reasoning, and instruction-conditioned generation