Job Description
Who Is Recruiting from Scratch
Recruiting from Scratch is a premier talent firm that focuses on placing top product management, software, and hardware talent at innovative companies. Our team is 100% remote, and we work with teams across the United States to help them hire.
Title of Role: Member of Technical Staff (AI Benchmarking)
Location: San Francisco, CA (On-site, 5 days/week)
Company Stage of Funding: Seed (~$3.5M raised, strong investor backing)
Office Type: On-site
Salary: $130,000–$220,000 base + equity ($60K–$120K/year in options)
Visa: Sponsorship available on a case-by-case basis (Australia priority; others considered)
Our client is a fast-growing AI benchmarking and intelligence company that has become one of the most important independent evaluators of frontier AI systems.
The company works directly with leading AI labs including OpenAI, Google, Anthropic, Meta, and NVIDIA, helping define how AI systems are measured, compared, and understood across the industry.
Their benchmarks and insights are widely used by enterprises, researchers, investors, and policymakers — and are actively shaping the direction of AI development itself.
Backed by industry leaders including Nat Friedman (GitHub), Andrew Ng, Daniel Gross, Adam D’Angelo, and Clem Delangue, the company is already trusted by hundreds of thousands of users and is on track to double its team.
This is a rare opportunity to join a highly influential AI company at the frontier of model evaluation, benchmarking, and AI systems analysis.
What You Will Do
- Design and execute AI benchmarking and evaluation projects
- Develop new methodologies for evaluating AI models and agentic systems
- Build datasets and analytical frameworks for frontier AI assessment
- Analyze AI system performance across models, tools, and hardware
- Produce strategic reports and insights for enterprises and AI labs
- Work directly with leading AI labs on model evaluation and benchmarking
- Identify gaps in current AI evaluation systems and design solutions
- Collaborate with engineers to improve benchmarking infrastructure
- Communicate complex AI concepts through clear analysis and visualization
- Contribute to company strategy and product direction
- Operate in an AI-native workflow using cutting-edge tools
- Help define what “state-of-the-art AI” actually means in practice
Ideal Candidate Background
- 2–10 years of experience in consulting (MBB: McKinsey, BCG, Bain) or technical roles (SWE, ML, TPM, data)
- Strong Python proficiency with recent hands-on coding experience
- Strong analytical and structured thinking ability
- Experience building or working with data analysis frameworks
- Comfortable working in ambiguous, research-heavy environments
- Strong written and verbal communication skills
- High intellectual curiosity and ability to learn quickly
- Comfortable working directly with AI labs and technical stakeholders
- Strong ownership mindset
Preferred
- MBB consulting background (especially AI/analytics practices such as BCG X or QuantumBlack)
- Experience at AI labs or AI-native companies
- Background in ML, data science, or applied research
- Experience with benchmarking, evaluation systems, or experimentation frameworks
- Strong GitHub or portfolio of coding projects
- Exposure to frontier AI systems (LLMs, agents, multimodal models)
- Experience at high-growth technical startups
- Ability to translate technical findings into strategic insights
Strong Signals
- Ex-MBB (especially AI/analytics teams)
- Experience at DeepMind, Meta AI, Google, Cohere, or Mistral
- Strong Python + analytical coding ability
- Experience building datasets or evaluation pipelines
- Exposure to AI product or research workflows
- Strong academic or technical pedigree
- Evidence of high intellectual output (writing, research, GitHub, projects)
Compensation and Benefits
- Base salary: $130,000 – $220,000
- Equity: $60K–$120K/year in options
- Visa sponsorship (case-by-case)
- Relocation support available
- Direct exposure to leading AI labs globally
- High-impact, externally visible work
- Significant upside as the company scales
Why Join
This is a rare opportunity to work at the center of the AI ecosystem — building the benchmarks that define how frontier models are evaluated and understood.
You’ll work directly with the world’s leading AI labs, shape evaluation methodologies used across the industry, and help define what “state-of-the-art AI” means in practice.
If you’re highly analytical, technically strong in Python, and excited by frontier AI systems, this role offers exceptional visibility, impact, and career acceleration.