Z.ai

Experienced Hire - AI Institute - Reinforcement Learning Training Framework Engineer

Z.ai  •  Beijing, CN (Onsite)  •  5 months ago

Job Description

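Beijing • Full-time • Internet / Electronics / Online Games

Responsibilities
1. Develop, optimize, and maintain the reinforcement learning training framework; continuously improve the framework and training strategies in response to business needs to raise model training efficiency.
2. Analyze and locate performance bottlenecks in training, and implement targeted optimizations to improve training efficiency and stability.
3. Track technical progress in the industry and continuously integrate the latest training optimization strategies.

Requirements
1. Bachelor's degree or above in a computer-related field, with 2-5 years of work experience.
2. In-depth understanding of natural language processing, computer vision, and multimodal algorithms; familiarity with mainstream LLM architectures; distributed training experience.
3. Basic understanding of common RL training algorithms.
4. Familiarity with common open-source inference frameworks such as vllm or sglang is a plus.

More information: about the team's work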
GLM-4.5: Reasoning, Coding, and Agentic Abilities
- https://z.ai/blog/glm-4.5
- GLM-4.5 is built with 355 billion total parameters and 32 billion active parameters, and GLM-4.5-Air with 106 billion total parameters and 12 billion active parameters. Both are designed to unify reasoning, coding, and agentic capabilities into a single model in order to satisfy more and more complicated requirements of fast rising agentic applications.
slime: An SGLang-Native Post-Training Framework for RL Scaling
- https://lmsys.org/blog/2025-07-09-slime/
- We believe in RL. We believe RL is the final piece toward AGI.
- If you feel the same way, you'll share our vision:
- Every field should be end-to-end RLed and every task should become an agent environment.
- Every RL run should last longer, and every model should scale larger.
- RL systems should integrate seamlessly with existing infrastructure, letting us focus on new ideas instead of boilerplate engineering.
- That's why we present slime, a post-training framework designed to be:
- Versatile - with a fully customizable rollout interface and flexible training setups (colocated or decoupled, synchronous or asynchronous, RL or SFT cold start).
- Performant - integrating SGLang for inference and Megatron-LM for training, natively.
- Maintainable - with a lightweight codebase and smooth transition from Megatron pretraining to SGLang deployment.
In short, a post-training framework for RL scaling.
The journey of RL scaling has just begun, and slime is continuously evolving. In the next phase, we will focus on:
1. Collaborating with the SGLang team to explore optimal RL training strategies for large-scale MoE models.
2. Supporting broader post-training workflows, strengthening the pre-training-to-production bridge.

About Z.ai

Z.ai is the AI company behind the GLM series models, dedicated to inspiring the development of AGI to benefit humanity.

Industry
IT & Software
Company Size
51-200 employees
Headquarters
Beijing, CN
Year Founded
Unknown