About Us:
Positron.ai specializes in developing custom hardware systems to accelerate AI inference. These inference systems offer significant performance and efficiency gains over traditional GPU-based systems, delivering advantages in both performance per dollar and performance per watt. Positron exists to create the world's best AI inference systems.
Senior Software Engineer – Machine Learning Systems & High-Performance LLM Inference
We are seeking a Senior Software Engineer to contribute to the development of high-performance software that powers execution of open-source large language models (LLMs) on our custom appliance This appliance leverages a combination of FPGAs and x86 CPUs to accelerate transformer-based models The software stack is written primarily in modern C++ (C++17/20) and heavily relies on templates, SIMD optimizations, and efficient parallel computing techniques

Positron delivers vendor freedom and faster inference for both enterprises and research teams, by allowing them to use hardware and software explicitly designed from the ground up for generative and large language models (LLMs).
Through lower power usage and drastically lower total cost of ownership (TCO), Positron enables you to run popular open source LLMs to serve multiple users at high token rates and long context lengths. Positron is also designing its own ASIC to expand from inference and fine tuning to also support training and other parallel compute workloads.