Salvo Software

AI Developer

Salvo Software  •  Bengaluru, IN (Hybrid)  •  5 days ago

Job Description

About Salvo Software

Salvo Software is a global firm that provides cost-effective software solutions to guide enterprises and startups through digital transformation. With distributed teams across the US, LATAM, and India, we partner with clients to build high-performance, scalable systems that solve complex technical challenges. Our culture values innovation, ownership, and engineering excellence.

We are seeking a highly skilled AI Developer with a strong backend and machine learning engineering background to design, train, optimize, and deploy LLMs in on-prem and offline environments. This role is deeply technical and hands-on, requiring expertise across Python ML stacks, model optimization, local inference frameworks, RAG (Retrieval-Augmented Generation) architectures, MCP (Model Context Protocol) integrations, and DevOps workflows tailored for offline systems.

You will work closely with our engineering and product teams to build end-to-end LLM pipelines, including data preprocessing, supervised fine-tuning, model quantization, evaluation, RAG pipeline design, and deployment on local or air-gapped infrastructure. If you enjoy working with cutting-edge open-source LLMs, building context-aware AI systems, and designing reliable backend pipelines, this role is for you.

Key Responsibilities

Core LLM Development

  • Train and fine-tune LLMs using supervised fine-tuning (SFT).
  • Work with open-source models such as LLaMA, Mistral, Qwen, and similar architectures.
  • Build LoRA / QLoRA pipelines for efficient fine-tuning.
  • Implement and optimize data preprocessing workflows, including tokenization and long-context handling.
  • Use and extend Hugging Face Transformers & Datasets for training and inference.
  • Parse and process structured and semi-structured data, including XML/XSD files.
  • Implement document parsing solutions for Office formats (python-docx, OpenXML).

RAG & Context-Aware Systems

  • Design and implement end-to-end Retrieval-Augmented Generation (RAG) pipelines for document-grounded question answering and knowledge retrieval.
  • Build and maintain vector stores and embedding pipelines using tools such as FAISS, Chroma, Weaviate, or pgvector.
  • Optimize retrieval strategies including hybrid search, re-ranking, and chunking approaches tailored for domain-specific corpora.
  • Develop and maintain MCP (Model Context Protocol) server integrations to enable LLMs to interact dynamically with tools, APIs, and external data sources.
  • Design agentic workflows that leverage MCP to give models structured access to internal systems and context in a controlled, auditable manner.

Offline / On-Prem Model Expertise

  • Deploy, run, and maintain models fully offline and in air-gapped environments.
  • Perform model optimization and quantization (GGUF, GPTQ, AWQ, bitsandbytes).
  • Build and maintain inference systems using frameworks like vLLM, TGI, and Ollama.
  • Optimize GPU usage (CUDA, cuDNN, VRAM-aware batching).
  • Maintain local CI/CD pipelines for ML models without cloud dependencies.
  • Manage local model registries, versioning, and artifacts.
  • Ensure RAG and MCP components are fully operational in offline and restricted network environments.

Backend & DevOps

  • Build backend services in Python for ML training and inference workflows.
  • Work with relational databases (Postgres/MySQL) and vector databases for RAG storage layers.
  • Use Docker and Git for reliable development and deployment pipelines.
  • Use Azure DevOps for CI/CD, including local runners when applicable.

Requirements

Technical Skills

  • Strong experience in Python for backend and ML development.
  • Expertise with ML frameworks such as PyTorch or TensorFlow, scikit-learn, and pandas.
  • Solid knowledge of Postgres or MySQL for data storage.
  • Experience with Docker, Git, and DevOps best practices.
  • Hands-on expertise with LLM training, fine-tuning, and optimization.
  • Experience with Hugging Face Transformers & Datasets.
  • Familiarity with XML/XSD and Office document parsing tools.
  • Experience deploying models with vLLM, TGI, or Ollama.
  • Understanding of quantization techniques (GGUF/GPTQ/AWQ).
  • Experience working with GPU optimization and the CUDA stack.
  • Ability to build solutions for offline, on-prem, and air-gapped environments.
  • Hands-on experience designing and implementing RAG pipelines, including embedding models, vector stores (FAISS, Chroma, Weaviate, or pgvector), and retrieval optimization strategies.
  • Experience building or integrating MCP (Model Context Protocol) servers to connect LLMs with external tools, APIs, and structured data sources.

Nice to Have

  • Experience building agentic systems using MCP in production or near-production environments.
  • Familiarity with advanced RAG techniques such as HyDE, re-ranking, or multi-hop retrieval.
  • Experience managing ML model registries in offline environments.
  • Familiarity with AWS for hybrid deployments.
  • Experience with secure environments, restricted networks, or enterprise compliance requirements.

Soft Skills

  • Strong ownership mindset and problem-solving ability.
  • Ability to work effectively in distributed teams across time zones.
  • Clear communication when discussing complex technical topics with both technical and non-technical stakeholders.

About Salvo Software

We design custom-built solutions to help you transform, scale, and grow your business, backed by a team that cares about you.

Salvo Software is a global firm with near-shoring capabilities, headquartered in Vancouver, WA, that provides cost-effective software solutions to guide enterprises and startups through digital transformation.

We help our partners improve their clients' customer experience and optimize their business process times by providing hand-selected teams of experts who meet their needs and help them make smart decisions.

Industry: IT & Software
Company Size: 11-50 employees
Headquarters: Vancouver, WA
Year Founded: 2017