Adaptyv

Site Reliability Engineer

Adaptyv  •  Lausanne, CH (Onsite)  •  2 hours ago

Job Description

Adaptyv is building an automated lab that lets AI agents run biology experiments.

We're entering the era of agentic science where AI models can now design novel proteins, propose hypotheses, and iterate on experimental results. But they can't run the experiments themselves - that's still a manual, months-long process. We're building the infrastructure that gives AI agents access to the physical world.

We are one of the fastest-growing biotech companies, trusted by leading biopharmas, frontier AI labs, and the techbio companies pushing the field forward. This is a rare chance to help advance some of the most important work happening in biotech today.

Our automated lab is powered by a deep software + hardware stack: lab instruments worth millions of USD reverse-engineered into API-controllable hardware, dozens of devices orchestrated through complex workflows, full observability on everything that happens in the lab, processing pipelines for messy physical-world data, and AI systems that troubleshoot production results and accelerate assay development.

We’re growing rapidly and are hiring talented people to scale and support the massive demand for AI-driven wet lab experimentation.

Adaptyv runs a physical lab through software. When our systems go down, it isn't a page that fails to load; it's a liquid handler that stops mid-run, an instrument that loses its booking, or a customer's experiment that stalls with their protein already in a plate. Reliability here has physical-world consequences, and we need someone who owns it.

You'll be responsible for the health of the entire stack that keeps LabOS and our customer-facing platform running: the APIs, edge functions, databases, processing pipelines, job queues, and the integrations that connect our software to millions of dollars of lab hardware. You'll build the observability, alerting, and automation that let a small team run a 24/7 automated lab without living in firefighting mode — and when something does break, you're the person who makes sure it gets caught early, fixed fast, and never happens the same way twice.

In a given week, that might mean:

  • Building observability across the stack — metrics, logs, traces, and dashboards (Grafana) that make the state of the lab and the platform legible at a glance

  • Defining SLOs for the services that matter, instrumenting them, and setting up alerting that pages on real problems and stays quiet otherwise

  • Hardening our data and processing pipelines so messy physical-world data doesn't silently corrupt results or stall experiments

  • Owning incident response: triage, mitigation, and blameless postmortems that turn every outage into a permanent fix

  • Improving deploy safety and rollback across our services (Vercel, Supabase, Modal, edge functions) so shipping fast doesn't mean shipping fragile

  • Automating away toil — the manual recovery steps, the babysitting, the "just restart it" runbooks — so the lab runs itself as much as possible

  • Partnering with the software and lab-automation teams to make reliability a property of the system rather than an afterthought

What we're looking for

  • Strong systems and software engineering. You write production code (Python and/or TypeScript) and you're comfortable owning infrastructure, not just configuring it.

  • Real SRE / production ownership experience. You've run services that people depend on, carried a pager, and built the observability and automation that made on-call survivable.

  • Observability fluency. Metrics, logging, tracing, dashboards, alerting — you know how to make a complex distributed system legible, and you've used tools like Grafana / Prometheus / Loki (or equivalents) in anger.

  • Incident instinct. You stay calm when things break, find root cause fast, and you're allergic to the same incident happening twice.

  • Automation-first mindset. You'd rather spend a day automating a recurring 10-minute task than do it manually forever — and you build with coding agents like Claude Code as a default.

  • Pragmatic about reliability. You know the difference between what needs five nines and what doesn't, and you spend effort where it actually matters.

  • Bonus: where software meets the physical world. Hardware / lab / IoT, queues and pipelines, or cloud infra at scale — anything that has to keep running when there's something real on the other end.

  • Curious about biology. No background required, but you should find it genuinely interesting that the thing you're keeping alive is a lab where AI runs real experiments.

Application deadline

We are reviewing applicants on a rolling basis.

About Adaptyv

Proteins are the most advanced nanotechnology we know of. At Adaptyv Bio, we’re building a next-gen protein foundry to allow you to synthesize and test any protein you design.

Industry: Biotech & Life Sciences
Company Size: 11-50 employees
Headquarters: Lausanne, CH
Year Founded: Unknown