Guardian Life

AI Platform Engineer

Guardian Life  •  Chennai, IN (Onsite)  •  4 days ago

Job Description

AI Platform Engineer (3–5 years)

We are looking for an AI Platform Engineer to build and scale the core platform that supports traditional ML models as well as modern LLM and generative AI workloads. This role focuses on production-grade MLOps, model lifecycle management, platform reliability, security, and self-service enablement for data scientists and engineering teams.

Key Responsibilities

1. Platform Engineering

· Design and operate a scalable AI/ML platform using Kubernetes, containers, and infra-as-code.

· Build reusable frameworks for model training, fine-tuning, batch and real-time inference, and RAG pipelines.

· Implement multi-tenant isolation, quotas, and cost-tracking.

2. MLOps & Model Lifecycle Automation

· Develop CI/CD/CT pipelines for models, prompts, and data.

· Manage model registry, feature store, lineage, and experiment tracking.

· Ensure reliable production rollout using blue-green, canary, and shadow deployments.

3. Data & Pipelines

· Build scalable data and model pipelines using orchestrators like Airflow, Prefect, Dagster, or Argo.

· Implement data validation and schema enforcement.

· Optimize storage, caching, indexes, embeddings, and vector search workflows.

4. Observability & Reliability

· Set up monitoring for data drift, model drift, prompt performance, latency, accuracy, and cost.

· Define SLOs, SLIs, and incident response patterns.

· Implement logging, tracing, and metrics using Prometheus, Grafana, OpenTelemetry, or similar tools.

5. Security & Governance

· Enforce secrets management, IAM controls, network security, and auditability.

· Implement model governance, model cards, prompt controls, and risk guardrails.

· Work with security to ensure PII and compliance adherence.

6. Performance & Cost Optimization

· Optimize compute, autoscaling, GPU usage, caching, and batching.

· Track cost per model, per workload, and per team for transparency.

· Implement model optimization (quantization, distillation, caching).

7. Enablement & Developer Experience

· Create templates, SDKs, CLI tools, documentation, and best practices.

· Help data scientists and developers move models to production quickly.

· Partner with architecture, cybersecurity, and product teams.

Must-Have Skills

· 3–5 years of experience, with 2+ years in ML platform/MLOps.

· Strong Python development skills.

· Experience with Kubernetes, Docker, Helm.

· Infra-as-code: Terraform, Pulumi, CloudFormation, or similar.

· CI/CD systems like GitHub Actions, GitLab CI, Azure DevOps, Jenkins.

· Experience with one or more ML platforms:

o MLflow, Kubeflow, Azure ML, Vertex AI, SageMaker, Ray, BentoML, W&B.

· Strong understanding of model lifecycle, deployment patterns, and monitoring.

· Experience with vector databases, feature stores, artifact registries.

· Familiarity with observability stacks (Prometheus, Grafana, Loki, OpenTelemetry).

· Strong understanding of security for data and ML workloads.

Good-to-Have Skills

· Experience with LLM serving frameworks: vLLM, Triton, Ray Serve, OpenAI/Anthropic APIs.

· Experience building RAG systems with vector DBs: FAISS, Milvus, Pinecone.

· Understanding of data engineering tools like Spark, Flink, Kafka.

· GPU optimization (CUDA, TensorRT, ONNX).

· Background in cost governance (FinOps for AI).

· Experience building internal SDKs, CLIs, or developer tools.

· Knowledge of privacy frameworks and governance models.

Education

B.E / B.Tech / M.E / M.Tech in Computer Science, IT, or equivalent hands-on experience.

Location:

This position can be based in any of the following locations:

Chennai

Current Guardian Colleagues: Please apply through the internal Jobs Hub in Workday

About Guardian Life

Who we are

Guardian makes a difference in the lives of people when they need us most. With over 160 years of stability and fiscal integrity, we are a trusted resource to generations of families and business owners, inspiring well-being and helping build financial confidence.

Today, we stand behind 29 million consumers, helping them prepare and plan for a bright future for themselves and their families. We help business owners care for their employees. And we help people recover and thrive in times of unexpected loss.

As a modern mutual insurance company, we believe in driving value beyond dividends. We invest in our colleagues, are building an inclusive and innovative culture, and are helping to uplift communities through thoughtful corporate impact programs.

What we stand for

In 1860, a community of immigrants joined together to insure and protect their businesses and families. They were guided by powerful ideals that we’ve continued to stand behind and evolve throughout the years: we do the right thing, we believe people count, we courageously shape the future together, and we go above and beyond for the people we serve.

Guardian employees embrace and live by these values every day. They remind us to put people at the heart of all we do so that we can help protect what matters most to you. Want to help bring these values to life? Join us for a rewarding career and the opportunity to shape the future.

Disclosures:

Financial information concerning Guardian as of December 31, 2022, on a statutory basis: Admitted assets = $76.0 billion; liabilities = $67.2 billion (including $55.0 billion of reserves); and surplus = $8.8 billion. Dividends are not guaranteed. They are declared annually by Guardian’s Board of Directors.

Guardian® is a registered trademark of The Guardian Life Insurance Company of America. © Copyright 2023 The Guardian Life Insurance Company of America 2023-156184 Exp. 5/25

Industry: Finance & Insurance
Company Size: 5,001-10,000 employees
Headquarters: New York, NY
Year Founded: 1860