
Gramian Consultancy is a boutique consultancy specializing in IT professional services and engineering talent solutions. With a strong background in software engineering and leadership, we help companies build high-performing teams by matching them with professionals who truly fit their needs.
We are looking for highly analytical engineers and technical domain experts to contribute to advanced AI evaluation and benchmarking projects focused on realistic terminal-based and infrastructure-heavy workflows. In this role, you will design technically challenging tasks that evaluate how AI systems reason through debugging, operational failures, complex workflows, and multi-step problem-solving scenarios.
The ideal candidate has strong experience working with production systems, debugging, automation, or large-scale engineering workflows, and can design realistic technical challenges that simulate real-world engineering environments.
This role is particularly well suited for professionals with backgrounds in backend engineering, infrastructure, DevOps, data systems, MLOps, cybersecurity, or platform engineering.
CONTRACT: Contractor assignment (5 weeks)
COMMITMENT: Full-time (40h/week) or part-time (20h/week), with a minimum of 4h overlap with PST
LOCATION: Remote — Bangladesh, Brazil, Colombia, Egypt, Ghana, India, Pakistan, Indonesia, Kenya, Nigeria, Turkey, Vietnam
PROCESS: One technical assessment/interview (~45 min)
Responsibilities:
Requirements:

Gramian Consulting is your partner for accessing the engineering capabilities you need, delivered in the model that fits your business, from staff augmentation and talent recruiting to Build-Operate-Transfer (BOT). We combine the perspective of a software engineer, the rigor of a technical recruiter, and the vision of a business builder, so you get experts who understand your challenges and deliver results the right way. This blend is our signature advantage in providing top-quality services quickly and reliably.