AGENTIC OPS (CREQ253453)
MODEL VALIDATOR
Eval set development - Ability to benchmark agent performance along its “reasoning paths”
Adversarial testing - Ability to break the agent, for example by giving it conflicting instructions
Stochastic regression testing - Measure variance in agent behavior across repeated runs
Tool call validation - Verify that the agent calls the correct external APIs and databases
Ability to review thought chains and identify where agent logic diverged from the business requirements document (BRD)
Must know how to apply judge LLMs to grade outputs
Python and frameworks - Proficiency in DeepEval, LangSmith, etc.
Ability to do semantic debugging - Inspect the agent’s “thought trace”
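As a rough illustration of the stochastic regression and tool call checks listed above, the sketch below runs a stand-in agent repeatedly and reports how often its tool calls match an expected sequence. Everything in it is hypothetical: stub_agent, the tool names, and the 20% simulated failure rate are assumptions for illustration only, and it does not use or represent the DeepEval or LangSmith APIs.

```python
import random
from collections import Counter

# Hypothetical expected tool sequence for one eval case (not from the posting).
EXPECTED_TOOLS = ["lookup_account", "fetch_balance"]


def stub_agent(prompt, rng):
    """Stand-in for a real agent: occasionally skips its last tool call."""
    tools = list(EXPECTED_TOOLS)
    if rng.random() < 0.2:  # assumed rate of simulated nondeterminism
        tools.pop()
    answer = "ok" if len(tools) == len(EXPECTED_TOOLS) else "unknown"
    return {"tool_calls": tools, "answer": answer}


def tool_calls_valid(trace, expected):
    """Tool call validation: exact match on tool names and order."""
    return trace["tool_calls"] == expected


def repeated_eval(prompt, runs, seed=0):
    """Run the same prompt many times and summarize behavioral variance."""
    rng = random.Random(seed)  # seeded so the eval itself is reproducible
    traces = [stub_agent(prompt, rng) for _ in range(runs)]
    passes = sum(tool_calls_valid(t, EXPECTED_TOOLS) for t in traces)
    answers = Counter(t["answer"] for t in traces)
    return {"pass_rate": passes / runs, "answers": dict(answers)}


if __name__ == "__main__":
    print(repeated_eval("What is my balance?", runs=50))
```

In practice the stub would be replaced by a call to the agent under test, and a drop in pass_rate between releases would flag a regression.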
Screening Criteria
SDETs preferred, as their coding background in testing lets them develop evals
Good knowledge of data and SQL-based testing
Domain background is an added advantage for such roles
IN-AP-Hyderabad
Full Time
Individual Contributor
Experienced
No
05/05/2026, 8:03:59 AM

Virtusa is a global product and platform engineering services company that makes experiences better with technology. We help organizations grow faster, more profitably, and more sustainably by reimagining enterprises through domain-driven solutions. We combine strategy, design, and engineering, backed by unmatched expertise at the intersection of industry, business, and technology to generate real-world business impact for clients.
Headquartered in Massachusetts with global delivery centers, Virtusa provides a broad range of services, solutions, and assets, including strategy and design, AI advisory and services, digital engineering, data and analytics, digital assurance, cloud and security, CX transformation, and managed services across industries such as financial services, healthcare, communications, media, entertainment, travel, manufacturing, and technology.