Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today that can accelerate only a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.
Role
The AI Kernel Engineer at Quadric plays a key role in enabling a large number of AI kernels/operators to run efficiently on the Quadric platform. The AI Kernel Engineer will [1] develop a highly efficient Quadric kernel library for a variety of AI/LLM models and [2] analyze performance and optimize kernels for different hardware configurations. This senior technical role demands deep knowledge of hardware architecture, compiler toolchains, and optimization techniques.
Responsibilities
Requirements
Benefits
Founded in 2016 and based in downtown Burlingame, California, Quadric is building the world’s first supercomputer designed for the real-time needs of edge devices. Quadric aims to empower developers in every industry with superpowers to create tomorrow’s technology, today. The company was co-founded by technologists from MIT and Carnegie Mellon, who were previously the technical co-founders of the Bitcoin computing company 21.
Quadric is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, religion, sex, national origin, sexual orientation, age, citizenship, marital status, or disability.

Quadric licenses an AI processor architecture optimized for on-device inference. Only the Quadric Chimera GPNPU (general-purpose neural processing unit) delivers high AI/ML inference performance while also running C++ code without forcing the developer to artificially partition application code between two or three different processors. Quadric's Chimera GPNPU processor IP core scales from 1 to 864 TOPS and seamlessly intermixes scalar, vector, and matrix code.