Rackner

MLOps Engineer — AI/ML Systems & Deployment (TS/SCI Preferred)

Rackner  •  Dayton, OH (Onsite)  •  10 days ago

Job Description

MLOps Engineer — AI/ML Systems & Deployment (TS/SCI Preferred)
Dayton, OH (On-site Preferred) | Remote Eligible (CAC-Ready Candidates)
Mission Environment | AI/ML Infrastructure | National Security Impact

About the Role

At Rackner, we are building the operational backbone that turns AI/ML capability into real-world mission outcomes. We are seeking an MLOps Engineer to own the lifecycle of AI/ML systems—from experimentation to deployment—within a mission-critical, classified environment supporting Air Force and NASIC-aligned programs.

This is not a research role; it is where models become reliable, deployable, auditable systems.

You will operate at the intersection of:

  • Machine learning
  • Distributed systems
  • Cloud-native infrastructure

…and ensure that AI/ML systems work in the environments where failure is not an option.

What You’ll Do

Own the ML Lifecycle (End-to-End)

  • Build and operate production-grade ML pipelines
  • Orchestrate workflows using Kubeflow, Airflow, or Argo
  • Implement model versioning, lineage, and reproducibility standards

Operationalize AI/ML Systems

  • Deploy models into mission environments (including constrained or classified systems)
  • Transition workflows from Jupyter experimentation → containerized pipelines → production systems
  • Enable both batch and real-time inference architectures

Engineer for Reliability, Not Just Performance

  • Design systems for reproducibility, auditability, and stability
  • Implement monitoring for:
    • model performance & drift
    • system health & latency
  • Use tools like Prometheus, Grafana, and OpenTelemetry

Build Cloud-Native ML Infrastructure

  • Deploy and manage Kubernetes-based ML workloads
  • Containerize pipelines using Docker / OCI standards
  • Scale compute for training and inference workloads

Establish Data Discipline

  • Enable data versioning and governance (lakeFS or similar)
  • Support feature engineering and dataset preparation pipelines
  • Apply metadata standards (e.g., STAC) where applicable

Create Repeatable Systems

  • Develop runbooks, playbooks, and deployment standards
  • Build systems that can be operated by others, not just understood by you

What You Bring

Core Experience

  • Experience deploying ML systems into production environments
  • Strong background in Python and ML frameworks (PyTorch, TensorFlow, etc.)
  • Hands-on experience with:
    • ML pipeline orchestration tools (Kubeflow, Airflow, Argo)
    • Experiment tracking (MLflow, ClearML)

Infrastructure & Systems

  • Experience with Kubernetes and containerized workloads
  • Familiarity with CI/CD for ML systems
  • Understanding of distributed systems and scalable architectures

ML Application Exposure

  • Experience working with:
    • LLMs or transformer-based models
    • computer vision systems (YOLO, Faster R-CNN)
  • Focus on deployment and integration, not pure research

Mindset

  • Systems thinker who values reliability over novelty
  • Comfortable operating in ambiguous, high-stakes environments
  • Able to translate experimental work into operational capability

Why This Role Matters (What You Get)

This role is a career accelerator for engineers who want to:

  • Move beyond experimentation
    • Own systems that actually get deployed and used
  • Operate at the systems level
    • Work across ML, infrastructure, and mission integration
  • Build in high-trust environments
    • Where correctness, auditability, and reliability matter
  • Develop rare, high-demand expertise
    • MLOps in constrained / classified environments is a differentiated skillset

Shape how AI is operationalized, not just built.

Who We Are

Rackner is a software consultancy that builds cloud-native solutions for startups, enterprises, and the public sector. We are an energetic, growing consultancy with a passion for solving big problems across industries.

We enable digital transformation through:

  • Distributed systems
  • DevSecOps
  • AI/ML
  • Cloud-native architecture

Our approach is cloud-first, cost-effective, and outcome-driven—focused on delivering real capability, not just code.

Benefits & Perks

  • 100% covered certifications & training aligned to your role
  • 401(k) with 100% match up to 6%
  • Highly competitive PTO
  • Comprehensive Medical, Dental, Vision coverage
  • Life Insurance + Short & Long-Term Disability
  • Home office & equipment plan
  • Industry-leading weekly pay schedule

Apply

If you’re an engineer who wants to move from building models → owning systems, we want to talk.

#MLOps #MachineLearning #Kubernetes #AIEngineering #CloudNative #DevSecOps #ArtificialIntelligence #DataEngineering #DefenseTech #NationalSecurity #AIInfrastructure #Hiring #TechCareers


About Rackner

Rackner builds cutting-edge solutions that apply DevSecOps and the power of AI in the datacenter, public and private clouds, and at the edge, leveraging the future of compute capability and technologies like Kubernetes (k8s) and WebAssembly (WASM). We are a member of the Cloud Native Computing Foundation and a Kubernetes Certified Service Provider, as well as a partner to the major public cloud companies.

Our customers include hypergrowth startups and federal agencies, both Civilian and Defense.

Core Competencies

- DevSecOps

- Edge Computing

- AI/ML

- Cloud-Native and Hybrid-Cloud development

- Web and Mobile Application Development (Microservices)

Industry: IT & Software
Company Size: 11-50 employees
Headquarters: Silver Spring, Maryland
Year Founded: 2015