Job Description
In this position you will…
…own and drive quality for AI-powered applications and platforms. You will partner with Product, Engineering, Data/ML and Security to define test strategy, quality gates and acceptance criteria, then design and execute testing across APIs, UI, integrations and LLM components (prompting, RAG and agent workflows). You will validate functional correctness and GenAI-specific risks such as hallucinations, unsafe outputs, data leakage, latency and cost, ensuring releases are reliable, scalable and compliant for enterprise, multi-tenant use.
You will be responsible for…
- Define and maintain end-to-end QA strategy, test plans, test cases and quality metrics for AI applications (web, APIs, services and integrations).
- Partner with Product and Engineering to translate requirements into clear acceptance criteria, including GenAI behaviors (grounding, citations, refusal handling, tone and language).
- Execute functional testing (smoke, regression, UAT support) across UI, API and backend services; validate workflows, permissions, and multi-tenant behavior.
- Design and implement test automation for UI/API and service layers; integrate automated checks into CI/CD pipelines and enforce release quality gates.
- Perform risk-based testing of GenAI-specific risks (e.g., hallucination, toxicity, prompt injection, leakage of PII/PHI) and of mitigations such as guardrails and redaction.
- Test non-functional requirements: performance/latency, reliability, concurrency, and observability (logs/traces).
You will require the following qualifications and skills…
- 5+ years of QA experience (manual and automation) for web applications, APIs and distributed systems; experience in Agile delivery teams.
- Strong knowledge of test design techniques (risk-based testing, boundary value analysis, exploratory testing) and defect lifecycle management.
- Hands-on test automation experience with tools such as Playwright/Cypress/Selenium for UI and Postman or equivalent for API testing is a strong plus.
- Proficiency in at least one programming/scripting language (JavaScript/TypeScript, Python or Java) to build and maintain automation and test utilities is a strong plus.
- Experience testing GenAI/LLM applications, with a practical understanding of prompts, tokens, temperature, embeddings, RAG, tool/agent calling, and common failure modes (hallucination, prompt injection).
- Excellent communication skills and the ability to collaborate with cross-functional stakeholders (Product, Engineering, Data/ML, Ops).