MiniMax

High-Performance Communication Expert

MiniMax  •  Onsite  •  8 hours ago

Job Description

Beijing, Shanghai
Experienced hire
Full-time
Internet / Electronics / Online Games - R&D
Large Model Systems
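Responsibilities
1. Contribute a communication perspective to model design and to the design and construction of large-scale inference systems, aiming to approach the theoretical limits of the hardware.
2. Focus on the challenges of building very large clusters, and resolve from first principles the high-performance communication problems that arise at that scale.
3. Own collective communication performance optimization for large-model training and inference (NCCL / in-house communication libraries).
4. Participate in the design and tuning of RoCE and InfiniBand network architectures, and resolve communication bottlenecks in large-scale clusters.
5. Analyze and optimize compute-communication overlap strategies to improve end-to-end training and inference throughput.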
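Requirements
1. Expert in RDMA networking (RoCE v2 / InfiniBand), with experience tuning communication on large-scale GPU clusters.
2. Familiar with NCCL's principles and source code; experience customizing or replacing a collective communication library is a plus.
3. Understanding of the communication patterns and performance characteristics of mainstream parallelism strategies (TP/PP/DP/CP).
4. Familiar with GPU architecture and with how NVLink/NVSwitch/PCIe topology affects communication.
5. Experience with network congestion control (DCQCN/TIMELY), traffic scheduling, or switch-side tuning is a plus.
6. Experience optimizing communication for large-scale training or inference clusters is a plus.
7. Understanding of the communication paths in mainstream LLM inference systems (e.g. vLLM, SGLang, TRT-LLM).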

About MiniMax

MiniMax is a leading global technology company and one of the pioneers of large language models (LLMs) in Asia. Our mission is to build a world where intelligence thrives with everyone.

MiniMax develops proprietary LLMs across various modalities, including a trillion-parameter MoE model, a low-latency speech model with native support for major Asian languages, and state-of-the-art text-to-speech and text-to-video models. Experience them now at https://hailuoai.com/

Leveraging these multi-modality general-purpose models, the MiniMax API Platform offers enterprises and developers secure, flexible, and reliable API services, enabling the rapid deployment of AI applications.

Industry
IT & Software
Company Size
51-200 employees
Headquarters
Singapore, SG
Year Founded
2022