Job Description
Team Intro
As TikTok’s monetization ecosystem continues to grow across ads, e-commerce, short video, and live streaming, the demand for accurate, scalable, and efficient labeled data and content understanding systems is increasing rapidly. Our team is responsible for improving how commercial content is understood, labeled, structured, and operationalized at scale. We combine advanced machine learning with strong engineering execution to support data production, model innovation, and business impact.
About the Role
We are looking for an experienced Machine Learning Engineer to join the Monetization Data Alignment team. This role sits at the intersection of AI labeling, content understanding, multimodal large models, and agentic systems. You will work on building the next generation of scalable intelligence capabilities that power TikTok monetization use cases, with a strong focus on high-quality data, multimodal reasoning, and production-grade ML systems.
In this role, you will contribute to both research and production, from exploring LLM/MLLM, agent, and reinforcement learning techniques to delivering robust systems for labeling automation, multimodal understanding, rule retrieval, agent development, and interpretable decision-making.
Responsibilities
- Design and develop machine learning solutions for AI labeling and content understanding in TikTok monetization scenarios, supporting ads, short video, and other monetization products.
- Apply and improve LLM/MLLM, NLP, CV, and multimodal learning techniques to enhance fine-grained understanding across text, image, audio, video, and live content.
- Build algorithms and systems for labeling automation, intent recognition, taxonomy/tag generation, rule retrieval, risk detection, and quality evaluation, improving both model accuracy and operational efficiency.
- Explore and productionize agent-based and RL-based approaches for complex decision-making workflows, including multi-step reasoning, tool use, and adaptive content analysis.
- Develop interpretable solutions such as chain-of-thought (CoT) style reasoning, explanation generation, and traceable decision logic to improve the trustworthiness and operational usability of model outputs.