Job Description

Application deadline: We are conducting interviews actively and aim to fill this role as soon as we find someone suitable.
THE OPPORTUNITY
Join our new AGI safety product team and help transform complex AI research into practical tools that reduce risks from AI. As a Backend Engineer, you'll work closely with our CEO (also Head of Product), product engineers and Evals team software engineers to build products that make AI agent safety accessible at scale. We are building products that monitor AI coding agents for safety and security failures.
You will join a small team, have significant ability to shape the team and its tech, and be able to earn responsibility quickly. You will like this opportunity if you care about building products that genuinely make AI agents safe, thrive in fast-paced environments, and enjoy working closely with researchers.
KEY RESPONSIBILITIES
Infrastructure & Architecture
- Design and implement scalable backend systems capable of processing and analyzing large volumes of AI agent logs in real-time
- Build and maintain data processing pipelines that extract, transform, and store agent trajectory data efficiently
- Architect database schemas and data models optimized for both high-throughput writes and complex analytical queries
- Design for reliability, implementing robust error handling, retry logic, and graceful degradation strategies
- Monitor system performance and optimize bottlenecks to ensure sub-second latency for critical monitoring operations

API Development
- Develop secure, well-documented RESTful APIs that allow customers to integrate our monitoring product into their workflows
- Implement authentication, authorization, and rate limiting to protect customer data and ensure fair resource usage
- Build webhook systems and real-time notification services to alert customers about critical safety events
- Design API interfaces that are intuitive for developers while remaining flexible for diverse customer use cases
- Design and implement integrations with Security Information and Event Management (SIEM) systems, enabling customers to stream monitoring alerts and security events into their existing security operations workflows

Data Systems
- Implement efficient storage solutions for both structured data (monitoring results, metadata) and unstructured data (agent logs, code outputs)
- Build data processing systems that can handle everything from streaming real-time monitoring to batch analysis of historical data
- Design and implement caching strategies to optimize frequent queries and reduce infrastructure costs
- Create data retention and archival policies that balance customer needs with storage efficiency

Monitoring & Observability
- Build comprehensive logging, metrics, and tracing systems to ensure visibility into system health and performance
- Implement alerting systems that notify the team of infrastructure issues before they impact customers
- Create dashboards and tools that help the team understand system behavior and diagnose issues quickly
- Design systems that make debugging production issues straightforward and minimize time-to-resolution

Collaboration & Quality
- Work closely with our researchers to understand their needs and translate research prototypes into production-ready systems
- Collaborate with frontend engineers to design APIs and data structures that enable excellent user experiences
- Participate in code reviews to maintain high standards for code quality, security, and performance
- Document architectural decisions, API specifications, and system behaviors to facilitate knowledge sharing
- Contribute to technical discussions about technology choices, trade-offs, and implementation approaches

JOB REQUIREMENTS

4+ years of experience building production backend systems at scale
Strong Python proficiency with experience in frameworks like FastAPI, Flask, or Django
Experience designing and implementing RESTful APIs with clear documentation
Solid understanding of database design and optimization (SQL and/or NoSQL)
Experience with cloud platforms (AWS, Google Cloud, or Azure) and containerization technologies (Docker, Kubernetes)
Experience building data-intensive applications or processing large-scale log data
Strong understanding of system design principles, including scalability, reliability, and security
Experience with asynchronous processing, message queues, and distributed systems
Demonstrated ability to write clean, well-tested, maintainable code

Bonus

Familiarity with real-time data processing frameworks (Kafka, Redis Streams, etc.)
Experience with ML/AI infrastructure or building tools for AI applications
Previous work on developer tools, monitoring systems, or security products
Experience with infrastructure-as-code (Terraform, CloudFormation, etc.)
Familiarity with AI safety concepts or evaluation frameworks like Inspect
Contributions to open-source backend infrastructure projects
Experience building security-centric products
Experience with code analysis platforms
Experience with Golang

We want to emphasize that people who feel they don't fulfill all of these characteristics but think they would be a good fit for the position nonetheless are strongly encouraged to apply. We believe that excellent candidates can come from a variety of backgrounds and are excited to give you opportunities to shine.

REPRESENTATIVE PROJECT

Real-time agent monitoring infrastructure: Design and build the backend system that processes AI coding agent outputs in real-time to detect safety and security issues. Start by implementing a scalable ingestion pipeline that can accept agent logs via API, then build a processing system that routes logs through various monitors based on their characteristics. Implement a storage layer that efficiently handles both recent high-frequency queries and historical analysis. Add a notification system that alerts users when monitors detect concerning behaviors, with configurable thresholds and delivery methods. Throughout the project, ensure the system maintains sub-second p95 latency for critical operations while gracefully handling traffic spikes and partial system failures.

BENEFITS

This role offers a market-competitive salary, equity, and competitive benefits.
Salary: 100k - 180k GBP (~135k - 245k USD)
Flexible work hours and schedule
Unlimited vacation
Unlimited sick leave
Up to 6 months of paid parental leave
Comprehensive health, dental and vision insurance
Retirement savings with competitive employer matching (e.g. 401(k) for US employees)
Lunch, dinner, and snacks are provided for all employees on workdays
Paid work trips, including staff retreats, business trips, and relevant conferences
A yearly $1,000 (USD) professional development budget.

LOGISTICS

Time Allocation: Full-time
Location: This is an in-person role working out of our London or San Francisco office.
Visa sponsorship: We sponsor visas in both the UK and US. Sponsorship isn't guaranteed for every role or candidate, but if we make you an offer, we'll work with you to find the right visa route.

ABOUT THE TEAM
The product team is a new team. Especially early on, you will work closely with Marius Hobbhahn (CEO), Jeremy Neiman (product engineer) and Zak Walters (product engineer) on the product team. You’ll also sometimes work with our SWEs, Rusheb Shah, Andrei Matveiakin, Alex Kedrik, and Glen Rodgers, to translate our internal tools into externally usable tools. Furthermore, you will interact with our researchers, since we intend to be “our own customer” by using our products internally for our research work. You can find our full team here
ABOUT APOLLO
The rapid rise in AI capabilities offers tremendous opportunities, but also presents significant risks. At Apollo Research, we’re primarily concerned with risks from Loss of Control, i.e. risks coming from the model itself rather than e.g. humans misusing the AI. We’re particularly concerned with deceptive alignment / scheming, a phenomenon where a model appears to be aligned but is, in fact, misaligned and capable of evading human oversight. We work on the detection of scheming (e.g., building evaluations), the science of scheming (e.g., model organisms), and scheming mitigations (e.g., anti-scheming and control). We work closely with multiple frontier AI companies, e.g. to test their models before deployment or collaborate on scheming mitigations.
We’re now also developing tools and products that make it easier to prevent harms from widely deployed AI systems. We specifically target coding agent safety since coding agents are the most advanced agents and are tasked with high-stakes decisions.
At Apollo, we aim for a culture that emphasizes truth-seeking, being goal-oriented, giving and receiving constructive feedback, and being friendly and helpful. If you’re interested in more details about what it’s like working at Apollo, you can find more information here
Equality Statement: Apollo Research is an Equal Opportunity Employer. We value diversity and are committed to providing equal opportunities to all, regardless of age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, or sexual orientation.
HOW TO APPLY
Please complete the application form with your CV. The provision of a cover letter is neither required nor encouraged. Please also feel free to share links to relevant work samples.
About the interview process: Our multi-stage process includes a screening interview, a take-home test (approx. 3 hours), 3 technical interviews, and a final interview with Marius (CEO). The technical interviews will be closely related to tasks the candidate would do on the job. There are no leetcode-style general coding interviews. If you want to prepare for the interviews, we suggest getting familiar with the evaluations framework Inspect, or building simple monitors for coding agents and running them on your own Claude Code / Cursor / Codex / etc. traffic.
Your Privacy and Fairness in Our Recruitment Process: We are committed to protecting your data, ensuring fairness, and adhering to workplace fairness principles in our recruitment process. To enhance hiring efficiency, we use AI-powered tools to assist with tasks such as resume screening. These tools are designed and deployed in compliance with internationally recognized AI governance frameworks. Your personal data is handled securely and transparently. We adopt a human-centred approach: all resumes are screened by a human and final hiring decisions are made by our team. If you have questions about how your data is processed or wish to report concerns about fairness, please contact us at info@apolloresearch.ai.

About Apollo Research

Apollo Research is an AI safety organization. We specialize in auditing high-risk failure modes, particularly deceptive alignment, in large AI models. Our primary objective is to minimize catastrophic risks associated with advanced AI systems that may exhibit deceptive behavior, where misaligned models appear aligned in order to pursue their own objectives.

Our approach involves conducting fundamental research on interpretability and behavioral model evaluations, which we then use to audit real-world models. Ultimately, our goal is to leverage interpretability tools for model evaluations, as we believe that examining model internals in combination with behavioral evaluations offers stronger safety assurances compared to behavioral evaluations alone.

Industry

IT & Software

Company Size

11-50 employees

Headquarters

London, GB

Year Founded

2023

Website

apolloresearch.ai
