Tracer

Founding Lead Engineer, Agentic SRE

Tracer  •  £80k - £125k/yr  •  London, GB (Onsite)  •  4 months ago

Job Description

About the job

  • Do you get excited by tackling engineering challenges that others deem impossible?

  • Do you want to build AI agents that investigate alerts in production workflows?

  • Do you like working in a team with only the absolute best of the best?

If your answers are yes, then you should keep reading.

🚀 About The Role

Tracer is building agentic alert investigation for production data pipelines.

Teams already have alerts. Tracer investigates pipeline incidents before they page your team, filtering noise, correlating evidence across your stack, and producing an evidence-based RCA for the issues that actually matter.

We’re hiring a Founding Lead Engineer in London to own core architecture and ship an agent that produces grounded RCAs (and fix suggestions) for a set of high-value alerts. Humans stay in control of production decisions.

💻 You’ll love our tech stack

  • Python + LangGraph (for multi-agent alert investigation)

  • Rust (because we like systems that are fast and correct)

  • ClickHouse (high-volume event + investigation history at scale)

  • AWS + Terraform (infrastructure that builds itself)

  • Next.js + TypeScript (because front-end should be sexy too)

💼 Key Responsibilities

You’ll own the core systems that turn an alert into a defensible investigation and RCA. In practice, you will:

  • Architect and build the core pipeline in Python, from alert through investigation to root cause analysis (RCA)

  • Design and implement key systems including:

    • Alert ingestion + normalization

    • Context enrichment + correlation

    • Problem framing outputs

    • Hypothesis orchestration engine

    • Investigation execution runtime

    • Investigation artifacts + reporting

  • Drive core architecture decisions and ensure the system is observable, auditable, and reliable from day one

  • Partner with founders to ship a small set of high-value alert types that work extremely well, then expand coverage deliberately

  • Build customer-ready integrations across the pipeline stack

  • Educate and guide future engineers, setting a high bar for technical quality, speed, and pragmatism

🌟 What We Are Looking For

  • 5+ years (ideally 10+) professional software engineering experience.

  • Proven track record of shipping real products at high velocity

  • Strong backend and distributed-systems foundations, ideally with experience in data platforms, production pipeline stacks, and incident/observability tooling.

  • Experience working at an early-stage startup; bonus points the earlier you joined.

  • High ownership and sharp product instincts: you build what matters and cut what doesn’t.

💸 Compensation

We will be transparent and competitive.

  • Salary: £80,000 - £125,000

  • Equity: 0.3%-1%, determined case by case depending on skill and experience level

  • Visa sponsorship: Yes

  • Location: London

⚒️ Our Recruitment Process

  • Introductory Call (15-30 mins): Call with our hiring manager to discuss your background and motivations, and to learn more about Tracer

  • Role Fit Interview (45 mins): Meet with your manager or a similar-level team member to review your working style, skills, and fit for the role

  • Take-home & Competency Deep Dive (1 hour): Complete a practical exercise (e.g., case study, presentation, or technical problem-solving) to explore the role's responsibilities and expectations

  • On-site meetup (Half Day): On-site interviews and team lunch at our headquarters to ask any questions and experience our office and culture firsthand

  • Offer: Final decision and offer

About Tracer

Tracer is the first pipeline monitoring system that lives in the OS and is purpose-built for high-compute workloads. It is tailored towards biotech and pharma.

Traditional tools lack the depth, breadth, and clarity required to keep up with explosive workload growth, leaving teams with blind spots, runaway costs, and endless debugging.

Tracer closes this gap by:

  • Capturing OS-level signals across every node

  • Reconstructing complete workflows across machines

  • Pinpointing root causes instantly without endless log-hunting

We integrate seamlessly into your existing infrastructure with a single click and combine this data to optimise pipelines, debug faster, and attribute costs in real time.

Result: scientists spend less time stuck in logs and more time advancing discovery.

Industry: IT & Software
Company Size: 11-50 employees