Job Description
We are seeking an experienced Research Scientist or Engineer to help define and build the next generation of AI infrastructure. In this role, you will work at the intersection of large-scale systems, AI, and emerging hardware to design infrastructure that enables reliable, efficient, and scalable AI workloads at ByteDance.
You will work closely with tech leaders, architects, and product teams to translate evolving AI requirements into robust infrastructure architectures. The role involves identifying emerging trends in AI algorithms and systems, designing scalable system architectures, and driving innovations that improve performance, reliability, and cost efficiency across the AI stack.
Responsibilities
AI Infrastructure Architecture
Design and evaluate scalable infrastructure architectures for large-scale ML workloads across compute, storage, and networking. Develop technical proposals and specifications that guide next-generation AI infrastructure systems.
Research & Technology Exploration
Track emerging trends in AI systems, distributed computing, and hardware acceleration. Conduct technical investigations, build prototypes, and share insights through technical reports and presentations.
Performance & System Optimization
Analyze and optimize performance across the ML infrastructure stack (including scheduling, networking, storage, and training frameworks) through benchmarking, experimentation, and bottleneck analysis.
Cross-Team Technical Alignment
Work across research and engineering teams to translate AI workload requirements into scalable infrastructure solutions, providing architectural guidance and driving cross-team technical initiatives.
The base salary range for this position in the selected city is $212,800 - $387,600 annually.