
Location: New York, NY (On-site)
Company Stage of Funding: Early-stage / Seed-backed
Office Type: On-site
Salary: $210,000 – $250,000 + Equity (0–0.4%)
Visa: No sponsorship available
Our client is building the foundational legal data infrastructure powering the next generation of AI systems.
Their platform processes and structures millions of U.S. court records and legal filings, serving as the legal data layer for AI labs, legal AI startups, and enterprise legal workflows. The company already powers legal reasoning workflows for hundreds of law firms and multiple large-scale AI organizations.
This is an opportunity to join a highly technical early-stage team building RL environments, evaluation harnesses, and benchmark systems for long-horizon legal reasoning tasks. This infrastructure will directly shape how future legal AI models are trained and evaluated.
The role sits at the intersection of reinforcement learning infrastructure, evaluation systems, data pipelines, and large-scale document processing.
Build and maintain RL environment infrastructure for long-horizon legal reasoning tasks
Design scalable evaluation harnesses, task runners, scoring systems, and sandboxed execution environments
Build systems that convert raw legal filings and court records into benchmark and RL training tasks
Develop contamination-free evaluation pipelines for frontier AI model testing
Integrate with partner model APIs and evaluation harnesses
Collaborate with attorneys and domain experts to translate legal workflows into structured evaluation tasks
Work with messy, large-scale, real-world document datasets including PDFs and long-form legal filings
Build scalable data pipelines for legal reasoning environments
Develop tools for search, retrieval, reasoning, and drafting evaluations
Write production-quality Python systems with strong engineering rigor
Contribute to infrastructure supporting thousands of concurrent agent evaluations
Leverage AI coding tools such as Cursor, Claude Code, and Codex in daily workflows
Collaborate closely with engineers, attorneys, and AI partners
Operate with high ownership in a lean, fast-moving engineering environment
Take full ownership of systems from 0 → 1
3–8 years of software engineering experience
Strong Python engineering fundamentals
Experience building production systems with high ownership
Experience building systems from 0 → 1
Comfortable working with large-scale document or data processing systems
Strong backend and infrastructure engineering intuition
Experience working in high-signal startup or engineering environments
Strong product ownership mindset
Comfortable operating as a highly autonomous IC
Experience with AI coding tools in daily workflow
Strong debugging and systems thinking ability
Comfortable working with ambiguous and evolving requirements
Strong written and verbal communication skills
Ability to move quickly while maintaining engineering quality
Comfortable working on-site in NYC
Founding engineer or startup founder experience with demonstrated traction
Experience building evaluation systems or benchmarking infrastructure
Experience with RL environments or agent evaluation systems
Experience with LLM evaluations or AI evaluation frameworks
Experience with modern AI tooling and workflows
Experience building scalable Python backend systems
Experience with large-scale data or document pipelines
TypeScript experience
Experience working with messy real-world datasets
Strong engineering ownership and initiative
Experience collaborating with highly technical teams
High agency and tolerance for startup intensity
Evidence of rapid career growth or exceptional ownership
Strong systems design and infrastructure intuition
Base salary: $210,000 – $250,000
Equity package up to 0.4%
Direct ownership over core AI infrastructure systems
High-impact role at an early-stage AI infrastructure company
Work directly with frontier AI labs and legal domain experts
Lean, highly technical engineering team
Exposure to RL systems, evaluation harnesses, and large-scale AI infrastructure
Fast-growing company with strong commercial traction
Opportunity to shape the future of legal AI evaluation systems
This is an opportunity to build foundational RL infrastructure and evaluation systems for frontier legal AI.
You’ll work on difficult, high-leverage engineering problems involving long-horizon reasoning, document intelligence, scalable evaluation environments, and AI benchmarking systems.
If you want deep technical ownership, exposure to frontier AI infrastructure, and the opportunity to help define how future AI systems are evaluated and trained, this role offers exceptional scope and leverage.

Recruiting from Scratch provides recruiting services for companies that need to hire the best talent in software engineering, hardware engineering, product design, product management, marketing, GTM, and accounting & finance.