The Next Chapter

Solutions Architect - AI / ML - Training & GPU infra

The Next Chapter  •  Amsterdam, NL (Remote)  •  19 days ago

Job Description

AI/ML Solutions Architect – Distributed Training & GPU Infrastructure

Company

Join a fast-moving AI infrastructure team working on the cutting edge of large-scale ML workloads. This role is ideal for engineers who enjoy solving deep technical challenges in distributed training, multi-GPU systems, and scalable AI inference infrastructure. You will work directly with AI-focused clients, helping them get the most out of modern GPUs (H100, B200, etc.) and ML frameworks such as PyTorch (and JAX in some environments).

Team & Responsibilities

Work alongside senior AI and infrastructure engineers building large-scale GPU platforms. As part of the customer solutions team, you will:

  • Design and validate production-grade distributed training (primary) and large-scale inference architectures on large GPU clusters, typically tens to thousands of GPUs

  • Work hands-on with customers to debug, optimize, and scale ML workloads across multi-node GPU environments

  • Act as a technical authority on GPU performance, networking, and schedulers, making trade-offs at scale and translating customer needs into concrete platform requirements

  • Collaborate closely with engineering, product, and R&D to influence roadmap decisions based on real-world ML workloads

This is a hands-on, technical role: you are expected to work directly in customer environments, not only to advise at a high level.

Required skills and experience

  • Hands-on experience designing and operating enterprise-scale, production-grade, multi-node GPU workloads for training (7B+ model size) or inference

  • Strong background in distributed deep learning (PyTorch Distributed, DeepSpeed, ...) on GPU clusters

  • Deep understanding of GPU architecture and interconnects (H100/A100 class, NVLink, InfiniBand)

  • Experience with Kubernetes or Slurm

  • Experience with performance tuning using GPU profiling and monitoring tools

This role is not a fit if your experience is limited to single-node training, high-level AI strategy, or non-production research environments. We are looking for engineers and architects who thrive at the intersection of AI workloads and large-scale infrastructure.

What's offered

Location: Remote from anywhere in Europe

Total compensation up to EUR 250k (base + variable / OTE), depending on level and experience


About The Next Chapter

Netherlands-based recruitment agency with a focus on IT, (High) Tech, Science, and start-ups/scale-ups. We specialize in connecting with English-speaking technical talent at all levels with a BSc/MSc/PhD background.

We offer flexibility in pricing and services, tailored to your specific needs: contingency-based ("No Hire, No Pay") or a small retainer combined with a reduced success fee. Another option is our RaaS concept (Recruiter as a Service), in which we act as your dedicated in-house recruiter: sourcing, job marketing, selection, and process management, all with an optimal candidate experience in mind. Please don't hesitate to reach out to us for more details.

IT & technology recruitment for (English-speaking) professionals at bachelor's/master's (HBO/WO) level, both nationally and internationally. We offer our services on a recruitment and selection basis ("No hire, No pay"), and also offer several other service and pricing models tailored to your specific needs. One alternative is Recruiter as a Service, in which we recruit directly for you as a full-fledged corporate recruiter: searching, sourcing, and process management, all with an optimal "candidate experience" in mind. Contact us for more information.

On our website you will find current job openings as well as useful information, for example about work permit / visa rules for The Netherlands.

Industry: HR & Recruiting
Company Size: 1-10 employees
Headquarters: Den Bosch, NL
Year Founded: 2021