XPENG

Large Model Platform & Infra Engineer

XPENG  •  Onsite  •  2 months ago

Job Description

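Locations: Shenzhen, Beijing, Shanghai
Employment type: Full-time
Division: Intelligent Robotics

Responsibilities

Develop the infrastructure for large-model training, inference, and evaluation, providing the algorithm teams with an efficient and stable engineering foundation.
1. Training systems: design and optimize large-scale distributed training architectures (Pretrain/SFT/RL), and solve communication, scheduling, and fault-tolerance problems in training at the thousand-GPU scale;
2. Inference deployment: optimize large-model inference performance on frameworks such as vLLM, and support the deployment of models such as VLT/Omni on XP5, both on-device and in the cloud;
3. Evaluation platform: develop the DeepInsight evaluation system, supporting automated evaluation, report generation, and CI/CD integration for LLM/VLM/WBC/VLA models;
4. MLOps toolchain: build infrastructure for model version management, experiment tracking, data management, and resource scheduling to improve R&D efficiency;
5. RL training environments: build distributed reinforcement learning training systems that support large-scale parallel agent-environment interaction.

Requirements

1. Bachelor's degree or above in computer science, software engineering, or a related field;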
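2. Expert in Python, and proficient in at least one of C++ or Go;
3. At least 2 years of experience in one or more of the following areas:
- Distributed training systems (Megatron-LM/DeepSpeed/FSDP);
- GPU programming and high-performance computing (CUDA/NCCL/RDMA);
- ML platform development (Kubernetes/Ray/Airflow);
- Model inference optimization (TensorRT/vLLM/quantized deployment);
4. Understanding of the basic workflows of large-model training and RL training.

Preferred qualifications
- Experience designing and operating distributed training systems at the thousand-GPU scale;
- Familiarity with the internal implementation of the PyTorch framework;
- Hands-on experience with LLM/VLM inference optimization;
- Experience developing robotic systems or embodied-intelligence platforms.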

About XPENG

XPeng is a leading Chinese Smart EV company that designs, develops, manufactures, and markets Smart EVs that appeal to the large and growing base of technology-savvy middle-class consumers. Its mission is to drive Smart EV transformation with technology and data, shaping the mobility experience of the future. To optimize its customers’ mobility experience, XPeng develops its full-stack advanced driver-assistance system technology and in-car intelligent operating system in-house, as well as core vehicle systems including the powertrain and the electrical/electronic architecture. XPeng is headquartered in Guangzhou, China. In 2021, the Company established its European headquarters in Amsterdam, along with other dedicated offices in Copenhagen, Munich, Oslo, and Stockholm. The Company’s Smart EVs are mainly manufactured at its plants in Zhaoqing and Guangzhou, Guangdong province.

For more information, please visit https://heyxpeng.com.

Industry: Automotive & Mobility
Company Size: 1,001-5,000 employees
Headquarters: Guangzhou, CN
Year Founded: 2014