Job Description
About the team
Seed Global Data is a team focused on producing international data for LLMs. In large-model training, data is the lifeline of model quality, and the Global Data team works closely with technical, product, and operations teams to ensure effective data production strategies and execution management.
As a key member of our LLM Global Data Team, the LLM Training Operations Analyst will play a pivotal role in managing the intricate processes involved in training large language models (LLMs) with diverse coding datasets. This role focuses on overseeing and improving operational workflows, primarily for safety-related projects, ensuring they are delivered with high quality and efficiency.
Job Responsibilities
- Driving complex, fast-paced, cross-functional projects from incubation to execution. You will be responsible for designing and managing multiple Large Language Model (LLM) training projects (mostly coding-based, but may involve other STEM-related projects).
- Coordinating across functions (including product managers, engineers, and internal or external content experts), planning workflows, tracking progress, identifying risks and taking necessary corrective actions to ensure high-quality, timely project delivery.
- Working closely with your leads, product managers and engineers to design, test, and optimize operational workflows including model training strategies, quality assurance processes and productivity enhancements.
- Analyzing operational and model training or performance data to provide actionable insights through reports and presentations to stakeholders, driving future model training directions or adjustments.
- Designing and implementing robust data analysis strategies to systematically evaluate the quality of training and validation sets.
- Leading or supporting cross-domain operational improvement initiatives to optimize processes, share transferable learnings, and scale the generation of high-quality data.