IDC

Senior AI Quality/Evaluation Engineer

IDC  •  Warsaw, PL (Hybrid)  •  14 days ago

Job Description

About the Role & Team

IDC is building the next generation of AI-powered intelligence platforms that transform how technology decisions get made. Our platform re-imagines the way decision-makers discover and interact with trusted research and data, and is foundational to IDC's future.

We are looking for a Senior AI Quality/Evaluation Engineer to establish the evaluation function for the platform's AI systems. As the first hire in this function, you will initially work solo: you will design and build the evaluation infrastructure that ensures the platform produces accurate, well-sourced, high-quality responses, operating independently, defining your own roadmap, and building from scratch.

The platform's credibility depends on the quality of its AI-generated intelligence. You will build the automated test suites, regression detection systems, and evaluation frameworks that catch quality issues before they reach users. You will work closely with the product team to translate quality criteria into measurable, automatable test scenarios, and with the AI engineering team to ensure that every pipeline change is evaluated against rigorous standards.

What You’ll Do

  • Design and build the evaluation infrastructure that ensures the platform's AI systems produce accurate, well-sourced, high-quality responses
  • Build automated test suites that validate answer quality across agent pipeline changes
  • Develop regression detection systems that catch quality degradation before it reaches users
  • Create evaluation frameworks that measure response accuracy, citation correctness, and source quality
  • Work closely with the product team to translate quality criteria into measurable, automatable test scenarios
  • Build cost and latency monitoring that tracks the operational efficiency of AI pipeline execution
  • Define evaluation standards and practices that scale as the platform and team grow

What You Bring

  • 6+ years of software engineering experience, with significant work in testing infrastructure, ML evaluation, or quality systems
  • Experience building evaluation or testing frameworks for LLM-based or ML-based systems
  • Understanding of how to measure response quality for generative AI: accuracy, groundedness, citation correctness, relevance
  • Proficiency in Python
  • Ability to operate independently and define your own roadmap as the first hire in this function
  • Experience working at the intersection of engineering and product, translating qualitative quality criteria into quantitative measurements
  • Experience with LLM evaluation frameworks (e.g., RAGAS, DeepEval, or custom)
  • Familiarity with LLM observability tools (e.g., Langfuse, LangSmith, Weights & Biases)
  • Background in statistical methods for quality measurement (significance testing, distribution analysis)
  • Experience building A/B testing or experimentation infrastructure
  • Background in search relevance evaluation or information retrieval metrics

Why This Role Stands Out

At IDC, your work helps shape how the world understands technology and where it goes next. You collaborate with curious, high-caliber colleagues who value rigor, integrity, and shared success. As the premier global provider of trusted technology intelligence, IDC equips business and technology leaders with the evidence they need to make confident decisions. Our insights inform strategy, investment, and innovation across industries and regions.

Recognized by IIAR as Analyst Firm of the Year for five consecutive years, IDC sets the standard for credibility and impact. With more than 1,000 analysts worldwide and a truly global perspective, we combine deep expertise with practical relevance. Here, your ideas matter, your voice is heard, and your contributions provide the insights leaders rely on every day. It is meaningful work, backed by a culture that supports growth, collaboration, and long-term career development with a globally respected brand.

What We Offer

  • Hybrid/remote work model (about 1-2 days in the office per month).
  • A position in a highly professional and globally respected market research and advisory firm, where initiative leading to results is rewarded.
  • Individualized Culture: An environment where you can explore new areas outside your specialty and stay engaged with work you enjoy.

Equal Opportunity Employer

IDC is committed to providing equal employment opportunities for all qualified persons. Employment eligibility verification required. We participate in E-Verify.

About IDC

IDC is the premier global provider of market intelligence, advisory services, and events for the information technology, telecommunications, and consumer technology markets. IDC helps IT professionals, business executives, and the investment community make fact-based decisions on technology purchases and business strategy. More than 1,300 IDC analysts provide global, regional, and local expertise on technology and industry opportunities and trends in over 110 countries worldwide. For more than 50 years IDC has provided strategic insights to help our clients achieve their key business objectives. IDC’s Insights businesses provide industry-focused advice for IT buyers in the Financial, Government, Health, Retail, Manufacturing and Energy verticals. To learn more about IDC, please visit www.idc.com. Follow IDC on Twitter at @IDC and LinkedIn. Subscribe to the IDC Blog for industry news and insights.

Industry: Research & Polling
Company Size: 5,001-10,000 employees
Headquarters: Needham, Massachusetts
Year Founded: Unknown
Website: idc.com