TikTok

AI Model Evaluation Project Lead - AI Data Service and Operations (Eco Governance)

TikTok  •  Singapore, SG (Hybrid)  •  2 months ago

Job Description

About the Team

The AI Data Service and Operations (ADSO) team provides safety and non-safety data annotation, search operations, and customer service for ByteDance's international products, helping them secure their own data ecosystems. To optimize user experience while maintaining governance of negative content on our platforms, the Eco Governance team within ADSO focuses on data labeling that supports online content strategies and AI/LLM development.

As the AI Model Evaluation Project Lead in the Eco Governance team, you will lead end-to-end delivery of AI data annotation projects and play a critical role in evaluating AI/LLM model performance. You will directly manage a team of AI Project Managers while driving rigorous model evaluation, analyzing results, identifying gaps in model behavior, and delivering clear, actionable recommendations to improve model quality. You will bridge annotation operations with model performance insights to ensure high-quality training/evaluation data translates into measurable improvements in AI capabilities.

Responsibilities

- Lead data annotation and model evaluation projects: Manage end-to-end execution of multiple projects, ensuring both annotation quality targets/SLAs and model performance benchmarks are met.

- Design and execute AI model evaluations: Develop or refine evaluation frameworks, create test cases/datasets (including adversarial/safety-focused ones), run evaluations on LLM outputs, and assess metrics such as accuracy, safety, relevance, bias, and robustness in content governance scenarios.

- Analyze model performance and provide recommendations: Deep-dive into evaluation results, perform root cause analysis on model failures or quality issues, identify patterns in errors, and translate findings into concrete recommendations for annotation guideline improvements, data collection strategies, model fine-tuning, or process changes.

- Serve as the primary stakeholder interface: Translate product, safety, and business needs into clear annotation and evaluation requirements; align on targets and success metrics; and present evaluation insights and recommendations to algorithm, product, and leadership teams.

- Drive delivery governance and cross-functional collaboration: Establish operating rhythms, conduct evaluation reviews, and build escalation frameworks across QA, vendors, annotation teams, and business stakeholders.

- Leverage data for performance management: Monitor dashboards for both annotation and model metrics, detect anomalies, conduct in-depth data and root cause analysis, and drive continuous improvements in quality and efficiency.

- Lead continuous improvement and optimization: Identify gaps between annotation quality and model performance; design workflow enhancements, hybrid (machine + human) labeling strategies, and automation opportunities; partner with tooling and algorithm teams to scale evaluation capabilities.

- Manage risk and change: Proactively identify risks related to data quality, model safety, or delivery timelines; propose mitigation plans; and lead operational transitions.

- Deliver strategic reporting: Synthesize annotation and model evaluation data into clear insights, performance summaries, and forward-looking recommendations for leadership and cross-functional partners.
About TikTok

Inspire Creativity and Bring Joy

Industry: Arts & Entertainment
Company Size: 10,000+ employees
Headquarters: Los Angeles, California
Year Founded: Unknown