BRAHMA AI

Senior DevOps Engineer

BRAHMA AI  •  United Kingdom of Great Britain and Northern Ireland (Hybrid)  •  1 month ago

Job Description


Own build, deploy, and runtime reliability across BRAHMA AI’s hybrid estate. Deliver secure, scalable infrastructure for GenAI-based workflows and products across hybrid environments. Partner with infrastructure and multidisciplinary product and research teams to help them innovate and ship fast. We are hiring remotely across the EMEA region.

Key Responsibilities

  • Design, implement, and operate Slurm and Kubernetes-based platforms across cloud and on-prem GPU nodes, including autoscaling, rollout strategies, and multi-cluster operations.
  • Build CI/CD pipelines for services, model training, and model serving; standardise artifact/version management and environment promotion.
  • Implement Infrastructure as Code with Terraform/Terragrunt and configuration management; enforce drift detection and repeatable environments.
  • Design and implement observability stacks (metrics, logs, tracing); drive incident response and postmortems.
  • Secure the stack with least privilege, secrets management, network policy, and hardened baselines; support ISO/MPA controls with the security team.
  • Operate model-serving infrastructure for real-time and batch workloads; optimise GPU utilisation, concurrency, and latency.
  • Drive cost visibility and efficiency across compute, storage, and egress; forecast capacity and plan the lifecycle of hardware and licenses.

Must Haves

  • 6+ years in DevOps/SRE/Platform roles running production systems.
  • Expert with Kubernetes and containers (runtime, scheduling, networking, autoscaling).
  • Strong with Terraform and at least one configuration management tool (Ansible preferred).
  • CI/CD (GitHub Actions [preferred] / GitLab), release strategies, and artifact registries.
  • Observability in production (Prometheus/Grafana preferred).
  • Linux mastery, shell scripting, and a high-level language (Python preferred).
  • Cloud proficiency (AWS/GCP/Azure) and security fundamentals: IAM, secrets management, network segmentation, image provenance.
  • Experience with data-/media-heavy workloads or ML pipelines in production.
  • Location: EU/UK time zones (±2h).

Nice to Have

  • Model-serving stacks and GPU telemetry/optimisation.
  • On-prem operations for GPU/CPU fleets.
  • HPC/VFX pipeline exposure; render farms; real-time engines.
  • Storage systems (S3/MinIO, Ceph/Lustre/NFS), CDN, and caching strategies.
  • Messaging/streaming (Kafka) and workflow/orchestration (Argo, Airflow).

About You

  • Pragmatic and systems-thinking oriented.
  • Bias to automate and simplify.
  • Clear communication during incidents and reviews.
  • Ownership across design, operations, and quality.

About BRAHMA AI

BRAHMA AI is a leading Enterprise AI Content Creation Platform. Through its technology, creativity, and Mind² Philosophy, BRAHMA AI is re-imagining the world of Content Creation, Management, and Distribution by multiplying the Human Mind and the AI one, always keeping the Human as the driver and the AI as the Amplifier.

Company Size
201-500 employees
Website
brahma.io