Codvo.ai

Head of AI Evaluation & Reliability Engineering

Codvo.ai  •  Pune, IN (Hybrid)  •  1 month ago

Job Description

Head of AI Evaluation & Reliability Engineering
Location: Flexible / Hybrid
Reports To: Head of Engineering

Role Mission
Build and scale Codvo’s AI Evaluation & Reliability Engineering capability as a core engineering function supporting the design, validation, and continuous improvement of enterprise AI systems in production. You will architect the frameworks, tooling, benchmark assets, and operational processes required to ensure AI systems deployed by Codvo and its customers meet enterprise standards for reliability, safety, performance, and governance. This role is deeply embedded within engineering and serves as the quality and reliability backbone for Codvo’s AI platform and delivery organization.
Why This Role Matters
As AI systems move from pilots to business-critical workflows, reliability and evaluation become core engineering disciplines, not optional afterthoughts. Codvo is building the infrastructure and operational rigor required to ensure every AI deployment is measurable, governed, and production-ready.
Core Responsibilities
Engineering Ownership
- Build Codvo’s AI Evaluation & Reliability Engineering function as a core platform/engineering capability.
- Define engineering standards for AI evaluation, testing, release gating, and runtime monitoring.
- Integrate evaluation/reliability frameworks into Codvo’s engineering and delivery lifecycle.
Evaluation Architecture
- Design reusable evaluation frameworks for:
  - LLM / multimodal quality
  - RAG grounding / evidence fidelity
  - Agent reasoning / decision quality
  - Tool / workflow execution success
  - Safety / policy / compliance adherence
  - Cost / latency / production economics
Benchmark Infrastructure
- Build benchmark packs, golden datasets, and regression suites for priority enterprise workflows.
- Define benchmark coverage and versioning standards.
- Establish processes for edge-case capture and benchmark expansion.
Runtime Reliability Systems
- Design systems/processes for:
  - Runtime drift / degradation monitoring
  - Failure mode analysis / incident diagnostics
  - Human review / escalation pathways
  - Continuous evaluation and improvement loops
Technical Leadership
- Partner closely with platform, product, and solution engineering teams.
- Serve as internal SME on AI reliability, benchmark design, and evaluation methodology.
- Help shape architecture standards for AI-native product and workflow delivery.
Team Leadership
- Build and lead a team of:
  - Evaluation Engineers
  - Benchmark / QA Engineers
  - Reliability / Observability Engineers
  - Domain Review / Feedback Ops Specialists
Required Qualifications
- 10+ years in engineering / AI / ML leadership roles.
- 5+ years building or operating production AI / ML systems.
- Proven experience designing or operating:
  - AI/LLM evaluation frameworks
  - Benchmark / regression systems
  - AI QA / testing / validation infrastructure
  - Production ML / observability / monitoring systems
  - Reliability engineering / quality engineering organizations
Technical Expertise
- LLM / multimodal evaluation methodologies
- Benchmark / golden dataset design
- Agent / tool-use / workflow evaluation
- RAG evaluation / grounding analysis
- AI observability / telemetry / tracing
- Human-in-the-loop feedback systems
- AI safety / governance / policy testing
- Release gating / CI/CD / engineering quality systems
Preferred Backgrounds
- AI Infrastructure / Evaluation Platforms
- AI Observability / MLOps Companies
- Enterprise AI Platform Teams
- Applied AI Product / Platform Organizations
- Reliability / QA Engineering Leadership in Complex Systems
Success Metrics
- Establish Codvo-wide AI evaluation/reliability standards
- Integrate evaluation frameworks into engineering lifecycle
- Launch reusable benchmark packs for target workflows
- Reduce AI production failure / exception rates across deployments
- Improve release confidence and deployment velocity for AI systems
- Increase benchmark/evaluation asset reuse across customers
Ideal Candidate Profile
- Systems/reliability engineer mindset with strong AI depth
- Product-minded builder who can create reusable engineering frameworks
- Obsessed with operational excellence and measurable quality
- Comfortable driving standards across engineering organizations


Note: Please apply via our official careers portal only, as applications sent directly to executives may not be considered.

About Codvo.ai

At Codvo.ai, we specialize in leveraging artificial intelligence, cloud, and data to solve complex business problems and drive innovation. Our passion for innovation drives us to deliver solutions that not only meet but exceed your unique business needs, fostering smarter, more productive teams. Here’s why our approach has earned widespread acclaim from our clients:

67 Customer NPS: Our Net Promoter Score is a testament to the high level of satisfaction and loyalty among our clients. It underscores our ability to deliver quality and value through our specialized services, making us a preferred partner for businesses looking to leverage AI and data for competitive advantage.

78 Employee NPS: The satisfaction and engagement of our team directly influence the quality of service we provide. Our high employee NPS signifies a motivated, dedicated team that's committed to excellence. This positive work culture ensures that we can deliver exceptional AI-first engineering and enterprise data application services to you.

Our approach goes beyond traditional software development; we're dedicated to partnering with you to harness the power of AI and data. The combination of our high trial and engagement success rates, extensive experience, and positive feedback from both clients and employees positions us as more than just a service provider. We're your trusted ally in navigating the complexities of today's digital landscape, committed to transforming your vision into a reality with cutting-edge AI and data solutions.

Industry
IT & Software
Company Size
51-200 employees
Headquarters
Plano, Texas
Year Founded
2019
Website
codvo.ai