NVIDIA

Engineering Manager, LLM Performance

NVIDIA  •  $224k - $431k/yr  •  Santa Clara, CA (Hybrid)  •  1 hour ago

Job Description

At NVIDIA, we aren't just powering the AI revolution; we're accelerating it. We are accelerating LLM inference across the stack and across all open source LLM frameworks like TensorRT LLM, vLLM and SGLang. With demand for AI exploding, particularly in the realm of large language models (LLMs) and vision language models (VLMs, VLAs), we are significantly expanding our team.

We're seeking a highly skilled and driven Engineering Manager to take the lead in accelerating the next generation of LLM/VLM/VLA inference software technologies that will define the future of AI. This is a high-impact, hands-on leadership role at the intersection of deep technical expertise and world-class management. You won't just manage; you'll architect and guide a brilliant team of engineers who are pushing the performance of LLM inference. Your work will be highly collaborative, interfacing directly with NVIDIA Researchers, GPU Architects, and other teams across the company to ensure we ship production-grade, lightning-fast software that sets the global standard for AI performance.

What You’ll Be Doing:

  • Lead and grow a team responsible for pushing the performance of LLM inference across multiple LLM frameworks, including TensorRT LLM, vLLM, SGLang and Dynamo, on our datacenter products.

  • Drive the design, implementation and optimization of features that are key to performance in LLM inference.

  • Continuously improve the performance of LLM inference on current and upcoming NVIDIA datacenter architectures and GPUs.

  • Continuously improve the performance of LLM inference for important foundation models.

  • Work with inference benchmark teams to help tune performance for key workloads.

  • Integrate cutting-edge technologies developed at NVIDIA and offer an intuitive developer experience for LLM deployment.

  • Lead software development execution, with responsibility for project planning, milestone delivery, and cross-functional coordination.

What We Need to See:

  • MS, PhD, or equivalent experience in Computer Science, Computer Engineering, AI, or a related technical field.

  • 7+ years of overall software engineering experience, including 3+ years of technical leadership experience.

  • Proven ability to lead and scale high-performing engineering teams, especially across distributed and cross-functional groups.

  • Strong background in C++ or Python, with expertise in software design and delivering production-quality software libraries.

  • Demonstrated expertise in large language models (LLMs), vision language models (VLMs), and/or inference in general.

Ways to Stand Out from the Crowd:

  • Deep understanding of GPU architecture, CUDA programming, and system-level performance tuning.

  • Background in LLM inference or working with frameworks such as TensorRT-LLM, vLLM, or SGLang.

  • Passion for building scalable, user-friendly APIs and enabling developers in the AI ecosystem.

  • Proven track record of growing and managing a team that encourages idea sharing, empowers team members, and provides opportunities for professional growth.

#LI-Hybrid

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 224,000 USD - 356,500 USD for Level 3, and 272,000 USD - 431,250 USD for Level 4.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 27, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About NVIDIA

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.

Industry: Hardware & Semiconductors
Company Size: 10,000+ employees
Headquarters: Santa Clara, CA
Year Founded: 1993