Job Description

THE OPPORTUNITY

We are building Watcher, a coding agent security product. We are looking for a security & control expert to help us design better threat models and control protocols against AI adversaries, and improve the effectiveness and security of Watcher.

The role is a security & control expert embedded in the product team, with three functions:

Research: Support Apollo's monitoring research with threat modeling, attack design, red-teaming, and trajectory analysis (~50%).
Expert view: Bring a security expert’s view of what security buyers need to make Watcher a better product (~25%).
Product security: Improve the security posture of Watcher and Apollo's product infrastructure (~25%).

This is truly a “start-up role” in the sense that you will have many responsibilities and a significant say in shaping the direction of the role. This is an individual contributor role but could lead to management responsibilities eventually, if desired.

KEY RESPONSIBILITIES

Security support for research (50%)

Threat modeling for coding agents

Develop threat models for coding agents under the assumption that they could be misaligned or compromised. This includes near-term threats like AI agents leaking private internal data and long-term threats like agents sabotaging safety research.

Treat coding agents through the lens of insider risk: they have credentials, access to code, network access, and the ability to execute arbitrary actions, just like a malicious insider would (see Control agenda).

Map out kill chains and attack progressions similar to frameworks like MITRE ATT&CK, adapted for agentic AI. See e.g. the Agentic Loss-of-Control Threat Matrix for an example of a high-quality contribution.

Build and maintain our “coding agent security levels”, which define what level of robustness Watcher provides against different categories of failure modes. These levels should be concrete, testable, and usable both internally (to guide product priorities) and externally (to communicate our security posture to customers).

Failure mode library and literature

Maintain our library of coding agent failure modes and keep it comprehensive, accurate, and current.

Stay up to date with the academic and industry literature on coding agent security, AI security, and adjacent fields (prompt injection, deception, accidental security implications, supply chain attacks on AI systems, etc.).

Attack design for monitor development

Design realistic attack trajectories and example exploits for monitor development and backtesting, and work with the research team to ensure monitors are trained and evaluated against credible adversarial behavior.

Trajectory adjudication

Serve as security adjudicator for flagged trajectories. When monitors flag ambiguous agent behavior, determine whether it represents a genuine attack, a benign-but-unusual workflow, or needs deeper investigation, and feed those judgments back into monitor improvement.

Red-teaming Watcher

Red-team and improve Watcher's monitors and policies, document findings, and feed them into product improvement.

Work with the technical research staff to build adversarial test suites into the backtesting pipeline.

Security expertise for product (25%)

Failure mode prioritization and CISO perspective

Understand which failure modes are the biggest problems and systematically prioritise those that are most valuable to solve for security buyers.

Customer engagement and landscape awareness

Join customer/prospect calls to understand security needs firsthand and map them onto the threat models and library of failure modes.

Maintain awareness of how enterprises currently secure coding agents (or don't), what tools they use (SIEM, DLP, CSPM), and how Watcher fits into their existing stack.

Customer-facing security artifacts

Own security questionnaire responses, product security whitepapers, and support for customer pen-tests.

Own technical content of customer-facing communications during incidents.

Improving security posture for product (25%)

Attack surface, infrastructure, and isolation

Own AppSec standards for the product engineering team (code review security checklists, dependency scanning, secrets management in CI/CD, container hardening).

Own vendor security decisions for product: which SaaS tools we can integrate and what restrictions should apply.

Reduce attack surface for all product-related deployments, infra and cloud deployment work, and ensure clean isolation between the Apollo Product team and other teams at Apollo to prevent lateral movement either way.

Data handling and multi-tenancy

Define security requirements for tenant isolation, encryption at rest/in transit, access controls, and data retention policies for customer coding agent transcripts with the product engineering lead.

Product threat model and incident response

Co-own the product-specific threat model with the product team lead (distinct from the org-wide threat model owned by existing security engineers) and keep it up to date.

REPRESENTATIVE PROJECTS

Develop a comprehensive coding agent threat model: Think deeply about all the ways in which coding agents could attack an organization. Compare this threat model to conventional threats from human insiders. Publish a detailed research piece describing the threat model, building on existing research, e.g. from Redwood Research.

Improve our database of failure modes: We have an internal database of 50+ failure modes of coding agents with detailed reports for all of them. For this project, you would provide an expert view on the current state of that database and suggest improvements. In the long run, you would maintain that database and be responsible for integrating new failure modes.

Prioritize failure modes that Watcher should cover: Different parts of Watcher attempt to cover different threat models and attack strategies. Based on the results of the threat model project above, we want to ensure that each part of Watcher covers the most important failure modes in the most efficient way. For example, not all monitors require blocking capability and some failure modes might benefit from additional affordances like being able to disperse subagents.

Design a security roadmap for keeping Watcher's customer data secure: Watcher runs on top of coding agents, which means that it accesses highly sensitive information. We need to ensure that this data is stored securely and that the attack surface created by Watcher is as small as possible.

JOB REQUIREMENTS

Must-haves

5+ years in security roles in a hands-on technical capacity (not purely GRC/compliance). You need to be able to think structurally about threat modeling and failure modes. You need to be able to read code, understand infrastructure, and evaluate technical controls, not just write policies.

Engineering mindset: You treat security as an engineering problem. You are capable of, and willing to, build custom solutions when the task demands it, rather than relying solely on 'gluing' together off-the-shelf tools. You prioritize automation and systems-level thinking to scale security, and you are comfortable leveraging AI to accelerate development.

Direct experience with application security, cloud security, or product security. Ideally, you have owned or significantly contributed to the security posture of a product that handles sensitive customer data.

Strong understanding of enterprise security buyer concerns. Ideally, you have been a security engineer or senior security leader at an organization that buys security tooling, or have worked closely enough with those buyers to internalize their priorities.

Comfortable working in a small, fast-moving team where you'll need to context-switch between hands-on security work and strategic product thinking within the same week.

Strong written communication. This role produces a lot of artifacts (threat models, security levels, questionnaire responses, failure mode documentation) and they need to be clear and precise.

Strong nice-to-haves

Experience with AI/ML systems security, LLM security, or AI control research. The field is young enough that deep experience here is rare, but any exposure significantly reduces ramp-up time.

Detection engineering, SOC, or incident analysis experience. A part of this role is judging whether flagged agent behavior is genuinely malicious, and people who have triaged real-world alerts might ramp much faster.

Familiarity with insider threat programs or insider risk frameworks. The mental model of "the coding agent is a potentially malicious insider" is useful for this role and someone who has worked on insider threats will pick it up faster.

Experience in a security vendor or security product company. Building security products is different from consuming them, and someone who has done both will bridge the gap between "what CISOs want" and "what we can actually build" more effectively.

Red teaming or offensive security background. Useful for the Watcher red-teaming responsibilities and for thinking adversarially about failure modes.

Formal AI safety research background. Helpful but not necessary. We need security practitioners who can learn the AI safety context, not AI safety researchers who need to learn security.

Explicitly not required

Management experience. This is an IC role, at least initially.

Specific certifications (CISSP, etc.). We care about demonstrated ability, not credentials.

BENEFITS

This role offers a market-competitive salary, equity, and competitive benefits.

Salary: 135k - 200k GBP (~180k - 270k USD)

Flexible work hours and schedule

Unlimited vacation

Unlimited sick leave

Up to 6 months of paid parental leave

Comprehensive health, dental and vision insurance

Retirement savings with competitive employer matching (e.g. 401(k) for US employees)

Lunch, dinner, and snacks are provided for all employees on workdays

Paid work trips, including staff retreats, business trips, and relevant conferences

A yearly $1,000 (USD) professional development budget

Relocation support and visa fees (if applicable)

LOGISTICS

Time Allocation: Full-time

Location: This is an in-person role working out of our London or San Francisco office. We offer flexible working hours and some work-from-home arrangements.

Visa sponsorship: We sponsor visas in both the UK and US. Sponsorship isn't guaranteed for every role or candidate, but if we make you an offer, we'll work with you to find the right visa route.

ABOUT THE TEAM

The Product / Control team is a new team. Especially early on, you will work closely with Marius Hobbhahn (CEO, who currently leads the monitoring team), Victor Gillioz (Research Scientist), Monika Jotautaitė (Research Scientist), and our product engineers: Jeremy Neiman, Zak Walters, Zen van Riel, and Srdjan Miletic. Furthermore, you will interact with our other SWEs and researchers, since we intend to be “our own customer” by using our products internally for our research work. You can find our full team here.

ABOUT APOLLO RESEARCH

The rapid rise in AI capabilities offers tremendous opportunities, but also presents significant risks. At Apollo Research, we’re primarily concerned with risks from Loss of Control, i.e. risks coming from the model itself rather than e.g. humans misusing the AI. We’re particularly concerned with deceptive alignment / scheming, a phenomenon where a model appears to be aligned but is, in fact, misaligned and capable of evading human oversight. We work on the detection of scheming (e.g., building evaluations), the science of scheming (e.g., model organisms), and scheming mitigations (e.g., anti-scheming and control). We work closely with multiple frontier AI companies, e.g. to test their models before deployment or collaborate on scheming mitigations. At Apollo, we aim for a culture that emphasizes truth-seeking, being goal-oriented, giving and receiving constructive feedback, and being friendly and helpful. If you’re interested in more details about what it’s like working at Apollo, you can find more information here.

We're now also developing tools and products (see Watcher) that make it easier to prevent harms from widely deployed AI systems.

Equality Statement: Apollo Research is an Equal Opportunity Employer. We value diversity and are committed to providing equal opportunities to all, regardless of age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, or sexual orientation.

HOW TO APPLY

Please complete the application form with your CV. The provision of a cover letter is optional. Please also feel free to share links to relevant work samples.

About the interview process: Our multi-stage process includes a screening interview, a take-home test (approx. 2 hours), 3 technical interviews, and a final interview with Marius (CEO). The technical interviews will be closely related to tasks the candidate would do on the job. There are no leetcode-style general coding interviews. If you want to prepare for the interviews, we suggest building simple monitors for coding agents and running them on your own Claude Code / Cursor / Codex / etc. traffic.
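As a starting point for that kind of preparation, a simple rule-based monitor might look like the sketch below. It is only an illustration under stated assumptions: the rule names, the regular expressions, and the idea of representing a trajectory as a plain list of shell command strings are all hypothetical and are not a description of how Watcher works or of what the interviews will ask for.

```python
import re
from dataclasses import dataclass

# Hypothetical rules for illustration only; not Watcher's actual detection logic.
SUSPICIOUS_PATTERNS = {
    # Downloading a script and piping it straight into a shell.
    "pipe_to_shell": re.compile(r"\b(curl|wget)\b[^|]*\|\s*(sudo\s+)?(ba|z)?sh\b"),
    # Reading files that commonly hold credentials or secrets.
    "credential_read": re.compile(r"(\.aws/credentials|\.ssh/id_[a-z0-9]+|\.env\b)"),
    # Force-pushing, which can rewrite shared git history.
    "history_rewrite": re.compile(r"\bgit\s+push\b.*(--force\b|\s-f\b)"),
}


@dataclass
class Flag:
    step: int      # index of the command within the trajectory
    rule: str      # name of the rule that matched
    command: str   # the raw command string that was flagged


def monitor(commands):
    """Return one Flag per (command, rule) match, in trajectory order."""
    flags = []
    for step, command in enumerate(commands):
        for rule, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(command):
                flags.append(Flag(step, rule, command))
    return flags


if __name__ == "__main__":
    # A made-up trajectory; in practice you would parse your own agent logs.
    trajectory = [
        "git status",
        "cat ~/.aws/credentials",
        "curl https://example.com/install.sh | sh",
        "pytest -q",
    ]
    for flag in monitor(trajectory):
        print(f"step {flag.step}: {flag.rule}: {flag.command}")
```

Pattern matching like this is easy to evade and prone to false positives, so it is best treated as a baseline: reviewing what it flags on your own traffic, and what it misses, is a quick way to practice the adjudication and attack-design thinking described under Key Responsibilities.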

Your Privacy and Fairness in Our Recruitment Process: We are committed to protecting your data, ensuring fairness, and adhering to workplace fairness principles in our recruitment process. To enhance hiring efficiency, we use AI-powered tools to assist with tasks such as resume screening. These tools are designed and deployed in compliance with internationally recognized AI governance frameworks. Your personal data is handled securely and transparently. We adopt a human-centred approach: all resumes are screened by a human and final hiring decisions are made by our team. If you have questions about how your data is processed or wish to report concerns about fairness, please contact us at info@apolloresearch.ai.

About Apollo Research

Apollo Research is an AI safety organization. We specialize in auditing high-risk failure modes, particularly deceptive alignment, in large AI models. Our primary objective is to minimize catastrophic risks associated with advanced AI systems that may exhibit deceptive behavior, where misaligned models appear aligned in order to pursue their own objectives.

Our approach involves conducting fundamental research on interpretability and behavioral model evaluations, which we then use to audit real-world models. Ultimately, our goal is to leverage interpretability tools for model evaluations, as we believe that examining model internals in combination with behavioral evaluations offers stronger safety assurances compared to behavioral evaluations alone.

Industry

IT & Software

Company Size

11-50 employees

Headquarters

London, GB

Year Founded

2023

Website

apolloresearch.ai

AI Security & Control Engineer