Job Description
The Trust & Safety (T&S) GenAI & Emerging Product team's mission is to empower the development of GenAI models and applications. We do this by building a world-class safety, testing, and risk management system that ensures GenAI innovations are launched responsibly.
The AI Content Red Team sits within the T&S GenAI and Emerging Products pillar. The team is responsible for conducting unstructured adversarial testing of generative AI products and models to uncover emerging risks, alongside our structured evaluations.
This team combines attacker-minded testing, risk discovery, and clear operational feedback loops to inform product decisions, policy development, mitigations, and longer-term evaluation strategy. We probe models and product experiences across modalities, use cases, and abuse patterns to identify failure modes, stress-test safeguards, and help teams improve safety before and after launch.
We work closely with Trust & Safety teams (policy, product, engineering, data science, operations) and business teams across global markets. Success in this team requires strong judgment, creativity, analytical rigor, and the ability to translate ambiguous findings into actionable recommendations.
Responsibilities:
- Lead the global AI Content Red Team, including team strategy, hiring, people management, and capability building.
- Design and oversee end-to-end adversarial testing programs for GenAI products.
- Establish testing methodologies to simulate adversarial behaviors, edge cases, and real-world abuse scenarios.
- Drive discovery of emerging content risks, abuse patterns, and safety strategy failure modes across text, image, video, audio, and agentic modalities.
- Develop mechanisms to track themes and trends and to escalate high-severity findings that may impact user safety, compliance, or brand risk. Ensure testing outputs are actionable and well-documented.
- Partner with Product, Policy, and business teams to ensure red-teaming insights directly inform product design, safety mitigations, and policy frameworks.
- Represent the team in cross-functional forums and serve as a key risk voice in product readiness discussions.