TikTok

Research Scientist - Data and State Acceleration - Global Frontier Tech Recruitment Program - 2027 Start (PhD)

TikTok  •  San Jose, CA (Onsite)  •  1 month ago

Job Description

We are looking for talented individuals to join our team in 2027. As a graduate, you will have opportunities to pursue bold ideas, tackle complex challenges, and unlock limitless growth. Launch your career where inspiration is infinite at TikTok.

Successful candidates must be able to commit to an onboarding date by the end of 2027. Please state your availability and graduation date clearly in your resume.

Team Introduction: Our Arch-Data Ecosystem team plays a crucial role in the data ecosystem of the TikTok Recommendation System. We create offline and real-time data storage solutions for large-scale recommendation, search, and advertising businesses that serve over 1 billion users. The team's core goals are to ensure high system reliability, uninterrupted service, and smooth data processing. We are committed to building a storage and computing infrastructure that adapts to various data sources and meets diverse storage requirements, ultimately providing efficient, cost-effective, and user-friendly data storage and management tools for the business.

Topic Content: Build a unified infrastructure that integrates the "training data base" and the "training/inference state system" for multimodal foundation models in search, recommendation, and advertising scenarios. By jointly optimizing data lakes, caching, distributed computing, and GPU I/O, we aim to reduce training and inference costs for foundation models while improving iteration efficiency.

Responsibilities:

- Design and implement real-time and offline data architecture for large-scale recommendation systems.

- Build scalable and high-performance streaming Lakehouse systems that power feature pipelines, model training, and real-time inference.

- Collaborate with ML platform teams to support PyTorch-based model training workflows and design efficient data formats and access patterns for large-scale samples and features.

- Own core components of our distributed storage and processing stack, from file format to stream compaction to metadata management.

About TikTok

Inspire Creativity and Bring Joy

Industry: Arts & Entertainment
Company Size: 10,000+ employees
Headquarters: Los Angeles, California
Year Founded: Unknown
Social Media