
We are looking for a Senior SRE Engineer to drive the design, implementation, and evolution of our Kubernetes-based platform in a multi-cloud environment (GCP/AWS). At Finom, SREs are not just executors of tasks; they are the architects of reliability.
This role requires strong ownership of reliability, scalability, and platform architecture for high-load, mission-critical systems operating 24/7.
Lead the Platform Evolution: Design and operate our Kubernetes ecosystem (GKE, multi-cluster) with a focus on high availability and zero-downtime operations.
Build "Paved Roads": Own and evolve our PaaS strategy, using GitOps (ArgoCD) and CI/CD (GitLab) to empower domain teams to deploy independently.
Architect Reliability: Define and implement our observability strategy across metrics, logs, and tracing (Prometheus, VictoriaMetrics, OpenTelemetry).
Drive Infrastructure-as-Code: Lead the automation of our infrastructure using Terraform, ensuring all resources are standardized and version-controlled.
Own the Error Budget: Partner with engineering teams to establish and manage SLOs, SLAs, and incident management frameworks.
Disaster Recovery Mastery: Design and participate in regular DR drills, implementing blue/green and active/passive strategies across regions to ensure service continuity.
Innovate Operations: Proactively apply AI-driven approaches to improve operational efficiency and automate bottleneck detection.
Production K8s Mastery: Strong hands-on experience managing Kubernetes (GKE preferred) in high-load, multi-cluster production environments.
Cloud Infrastructure: Deep experience with GCP (AWS is a strong plus) and Terraform for large-scale infrastructure.
GitOps Expertise: Solid experience with ArgoCD, GitLab CI, and the "Infrastructure as Code" philosophy.
Observability Expert: Deep knowledge of the Prometheus/Grafana stack and experience implementing tracing and logging at scale.
System Design: Proven ability to design highly available 24/7 systems with automated failover and rollback capabilities.
English Fluency: English level B2+ for effective cross-functional communication.
Compliance Knowledge: Understanding of banking-grade standards and regulations such as PCI DSS, GDPR, or ISO 27001.
Distributed Systems: Experience with Kafka (Confluent), RabbitMQ, or managing high-load Redis and PostgreSQL clusters.
AI for Ops: Experience using AI tools to improve alerting, anomaly detection, or engineering efficiency.
Security-Minded: Experience with Vault for secret management and credential rotation.
Primary Cloud: GCP (~90%)
Orchestration & Deploy: GKE, ArgoCD, GitLab CI
Automation: Terraform
Data & Messaging: PostgreSQL, Kafka, Redis, RabbitMQ
Observability: Prometheus, Grafana, VictoriaMetrics, OpenTelemetry, Cloud Logging
Security: Vault

Finom is a European tech startup headquartered in Amsterdam, and we’re on a journey to revolutionize the financial landscape for entrepreneurs worldwide. Our mission is to develop an all-in-one financial B2B solution that integrates banking functions, accounting, financial management, and invoicing into a seamless, mobile-first platform.
Over the past two years, our team has fueled exponential growth, securing $50 million in investments and propelling us into hyper-growth mode. We’re on track to become a unicorn startup by 2025, backed by global funds like General Catalyst (known for supporting Airbnb, HubSpot, KAYAK, and Stripe). Finom has expanded its reach across 10+ EU countries, with a strong presence in key markets like Germany and France.
At Finom, we’re not just redefining the entrepreneurial experience; we’re empowering our employees to make a real difference. Your work matters, and your impact extends far beyond product metrics. We nurture innovation and an inspiring work environment where bold ideas thrive, prioritizing thorough research and swift implementation of solutions, and ensuring that every effort we make benefits our users, employees, partners, and our business as a whole.