Job Description
Team Introduction
Every time someone uploads or watches a short video on TikTok, our team is working behind the scenes to make it happen smoothly, instantly, and reliably. The Short Video Reliability team is where software engineering meets site reliability, and we run the world’s largest short video posting and delivery systems.
We’re not just keeping the lights on. We’re building systems that adapt to anything: a sudden viral trend, a massive global event, a data center switch, or a rare disaster scenario. If something big happens on TikTok, chances are we’ve prepared for it.
We are looking for talented individuals to join our team in 2026. As a graduate, you will get opportunities to pursue bold ideas, tackle complex challenges, and unlock limitless growth. Launch your career where inspiration is infinite at TikTok.
Successful candidates must be able to commit to an onboarding date by the end of 2026. Please state your availability and graduation date clearly in your resume.
Responsibilities:
- Design and build systems that adjust on the fly to infrastructure changes, data center moves, and global events
- Create smart traffic management that can handle viral video surges without breaking a sweat
- Build tools to spot and fix issues before they reach users
- Keep TikTok running smoothly across continents and time zones
- Develop systems for mapping, capacity planning, disaster recovery, and incident automation
- Test our systems with chaos engineering so they can handle anything thrown at them
- Use A/B testing to measure real-world improvements in stability, performance, and user experience