Job Description

Software Engineer (AI Platforms)

About SingleStore

At SingleStore, we’re not just building a database company; we’re defining the future of data management. Going beyond multi-cloud, we offer customers flexible networking, storage, and compute options to meet their requirements. With a few clicks, our cloud service spins up production-grade infrastructure using the latest capabilities of major cloud providers and the industry-standard Kubernetes ecosystem.

As data systems evolve, the “database” is no longer just where queries run; it’s becoming the foundation for real-time AI applications: retrieval, reasoning, agent workflows, and intelligent automation over enterprise data. That’s the direction we’re building toward.

About the AI Platform Team

We build the software platform that powers AI-native experiences across SingleStore: AI/ML capabilities, agent runtimes, tool integration, and the operational layer required to run these systems reliably at scale. Our work sits at the intersection of distributed systems, cloud infrastructure, and practical applied AI.

This team is not “pure research”; it’s engineering-heavy. You’ll build product-grade systems that let customers safely and reliably use AI on their data.

We are looking for a Software Engineer to design and implement core platform capabilities for AI/ML and AI Agents in SingleStore Cloud. You’ll work on services that enable model/tool orchestration (e.g., MCP-style tool discovery and execution), agent workflows, retrieval pipelines (embeddings/vector search), evaluation/observability, and secure multi-tenant operations.

You will likely find yourself using Go and Python, Kubernetes, cloud primitives, and the right tools for the job, while applying solid AI/ML fundamentals to make correct engineering decisions.

Role and Responsibilities

Build and evolve backend services that power AI features: agent orchestration, tool execution, retrieval/RAG pipelines, and model serving integrations.
Design APIs and control plane workflows for AI platform components (tenant-aware, secure by default, observable).
Implement MCP-style tool discovery and integration patterns so agents can safely call tools, connectors, and internal services.
Work closely with product managers, designers, customers, and partner engineering teams to deliver high-quality AI experiences.
Engineer for reliability and scale: latency, cost controls, rate limiting, fallbacks, rollouts, and incident response readiness.
Establish best practices around evaluation: offline test sets, regression detection, prompt/model/version tracking, and quality gates.
Contribute to secure-by-design AI approaches: permissions, data access boundaries, prompt injection defenses, and auditability.
Mentor junior engineers and contribute to a welcoming, high-ownership team environment.

Required Skills and Experience

This is a software engineering role that requires strong fundamentals plus working knowledge of AI/ML concepts.

Strong software engineering skills with experience in distributed systems (Go, Python, or similar).
Experience building cloud-native services: Kubernetes, containers, service-to-service APIs, CI/CD.
4+ years of experience working on a SaaS product or production platform.
Solid understanding of AI/ML fundamentals (you don’t need to be a researcher, but you should understand concepts well enough to build correct systems):

Supervised learning basics (training vs. inference, overfitting, evaluation metrics, classification, regression, anomaly detection, forecasting, etc.)
LLM basics (tokens, context windows, prompting, tool/function calling concepts)
Embeddings + vector search fundamentals (similarity, indexing tradeoffs, retrieval pitfalls)

Strong debugging and problem-solving skills, including incident-style troubleshooting across services and infrastructure.
Intellectual curiosity about investigating issues that impact product quality, reliability, latency, and business metrics.
Passion for building robust, maintainable systems in a fast-paced, team-oriented environment.

Nice to Have (Preferred)

Hands-on experience with AI agents and orchestration frameworks (tool calling, workflows, planners/executors).
Practical experience with RAG systems, reranking, grounding, and evaluation strategies.
Experience with model serving patterns (batch/online inference, caching, streaming responses).
Knowledge of security considerations for AI systems (data isolation, RBAC, prompt injection threats, audit logs).
Familiarity with vector databases or vector capabilities in modern data platforms.
Experience with observability stacks (structured logging, metrics, tracing) and SLO-driven engineering.

Tech Stack

Go, Python, Kubernetes, cloud infrastructure, distributed systems, APIs, and modern AI and ML tooling (LLM providers, embeddings, retrieval systems, eval/observability pipelines).

About SingleStore

The core of all AI, business intelligence and applications is data — various bits and bytes that come in all different formats. Only when we sift through this data, reason with it and build on top of it in real time does it give way to vast amounts of information and knowledge.

Real-time insights are key to the way we live our lives today; the way we entertain ourselves; the way we listen to music; the way we order groceries. Real-time insights keep your BI tools fresh; they keep your ride-sharing app with the most current price; and they ensure you never miss a fraudulent payment.

SingleStoreDB is the world’s only database that empowers users to transact, analyze and search data in real time. It empowers the world’s makers to build, deploy and scale modern, intelligent applications — backed by streaming data ingestion, a unique table type that supports both transactional (OLTP) and analytical (OLAP) workloads, limitless point-in-time recovery and a distributed (shared-nothing), MySQL-compatible architecture.

Industry

IT & Software

Company Size

501-1,000 employees

Headquarters

San Francisco, California

Year Founded

2011

Website

singlestore.com
