Modal

Member of Technical Staff - ML Performance

Modal  •  $150k - $270k/yr  •  New York City, NY (Onsite)  •  5 months ago

Job Description

About Us:

Modal provides the infrastructure foundation for AI teams. With instant GPU access, sub-second container startups, and native storage, Modal makes it simple to train models, run batch jobs, and serve low-latency inference. Companies like Suno, Lovable, and Substack rely on Modal to move from prototype to production without the burden of managing infrastructure.

We're a fast-growing team based out of NYC, SF, and Stockholm. We've hit high 8-figure ARR and recently raised a Series B at a $1.1B valuation. We have thousands of customers who rely on us for production AI workloads, including Lovable, Scale AI, Substack, and Suno.

Working at Modal means joining one of the fastest-growing AI infrastructure organizations at an early stage, with many opportunities to grow within the company. Our team includes creators of popular open-source projects (e.g. Seaborn, Luigi), academic researchers, international olympiad medalists, and engineering and product leaders with decades of experience.

The Role:

We are looking for strong engineers with experience in making ML systems performant at scale. If you are interested in contributing to open-source projects and Modal’s container runtime to push language and diffusion models towards higher throughput and lower latency, we’d love to hear from you!

Requirements:

  • 5+ years of experience writing high-quality, high-performance code.

  • Experience working with torch, high-level ML frameworks, and inference engines (vLLM or TensorRT).

  • Familiarity with NVIDIA GPU architecture and CUDA.

  • Experience with ML performance engineering (tell us a story about boosting GPU performance: debugging SM occupancy issues, rewriting an algorithm to be compute-bound, eliminating host overhead, etc.).

  • Nice-to-have: familiarity with low-level operating system foundations (Linux kernel, file systems, containers, etc.).

  • Ability to work in person in our NYC, San Francisco, or Stockholm office.

About Modal

Deploy generative AI models, large-scale batch jobs, job queues, and more on Modal's platform. We help data science and machine learning teams accelerate development, reduce costs, and effortlessly scale workloads across thousands of CPUs and GPUs.

Our pay-per-use model ensures you're billed only for actual compute time, down to the CPU cycle. No more wasted resources or idle costs—just efficient, scalable computing power when you need it.

Industry: IT & Software
Company Size: 51-200 employees
Headquarters: New York City, New York
Year Founded: Unknown
Website: modal.com