Job Description

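Large Model Deployment Engineer
Shenzhen, Beijing, Shanghai | Full-time | Intelligent Robotics Division

Position Description

Deploy large language models, multimodal models, and embodied-intelligence models efficiently to on-device robot chips and to the cloud, achieving low-latency real-time inference.
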
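1. Deploy large models such as VLT (task-planning model), Omni (multimodal interaction model), and VLA (manipulation model) on-device on the XP5 chip, including model quantization (INT8/INT4/FP8), graph optimization, and inference acceleration;
2. Design and optimize cloud-side model inference services (based on vLLM/TensorRT-LLM) to meet the high-concurrency, low-latency requirements of VLT cloud inference;
3. Develop a high-performance inference pipeline for motion-control models (ONNX) in real-time systems, meeting the 500 Hz control-frequency requirement;
4. Establish a standardized model deployment workflow: model conversion → quantization → performance benchmarking → on-device validation → production release;
5. Collaborate with the algorithm team from the model-design stage onward, providing deployment feasibility assessments and performance estimates.

Requirements

1. Bachelor's degree or above in computer science, electronic engineering, or a related field;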
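2. Proficient in C++/Python, with at least 2 years of experience in model deployment or inference optimization;
3. Familiar with at least one inference framework: TensorRT / ONNX Runtime / MNN / TVM / vLLM;
4. In-depth experience in at least one of the following:
- Model quantization (PTQ/QAT/mixed precision) and tuning the accuracy-speed trade-off;
- CUDA programming and GPU kernel optimization;
- Embedded NPU deployment (Qualcomm/MediaTek/NVIDIA Orin);
5. Understanding of the Transformer architecture and the computational characteristics of mainstream large models (LLaMA/Qwen/ViT).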
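
Preferred qualifications:
- Experience operating LLM inference services in production (vLLM/TGI/Triton);
- Experience deploying large models on-device (mobile/automotive/robotics);
- Familiarity with LLM inference acceleration techniques such as KV-cache optimization, PagedAttention, and speculative decoding;
- Experience optimizing the scheduling of multi-model collaborative inference (multiple models sharing NPU/GPU resources).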

About XPENG

XPeng is a leading Chinese Smart EV company that designs, develops, manufactures, and markets Smart EVs that appeal to the large and growing base of technology-savvy middle-class consumers. Its mission is to drive Smart EV transformation with technology and data, shaping the mobility experience of the future. To optimize its customers’ mobility experience, XPeng develops its full-stack advanced driver-assistance system technology and in-car intelligent operating system in-house, as well as core vehicle systems including the powertrain and the electrical/electronic architecture. XPeng is headquartered in Guangzhou, China. In 2021, the Company established its European headquarters in Amsterdam, along with other dedicated offices in Copenhagen, Munich, Oslo, and Stockholm. The Company’s Smart EVs are mainly manufactured at its plants in Zhaoqing and Guangzhou, Guangdong province.

For more information, please visit https://heyxpeng.com.

Industry

Automotive & Mobility

Company Size

1,001-5,000 employees

Headquarters

Guangzhou, CN

Year Founded

2014

Website

xiaopeng.com
