Job Description
Skills: Python, React, React Native, Data Analytics, Evals
The Role
We're hiring a product engineer who owns features end to end, from a loosely defined customer problem to shipped, measured, iterated product. You'll work across the stack on Terra, our AI property intelligence platform: conversational answers over land records, risk-graded title diligence, outcome-led verification products, and the surfaces where AI judgment meets real users making expensive, irreversible decisions.
This is not a ticket-execution role. You'll talk to users, decide what to build, ship it fast, and watch what happens.
What You'll Work On
- AI-native product surfaces. Chat and search experiences over property records: streaming responses, citations and provenance display, multilingual UX, and graceful handling of model uncertainty. The hard part isn't calling an API; it's designing how AI output earns user trust.
- Intelligence products people pay for. Verification packs, diligence reports, and the purchase flows behind them: packaging AI-derived judgment into outcomes users trust enough to buy, where clarity and reliability directly drive revenue.
- Full-stack feature ownership. Frontend (React/React Native), backend services (Python/Node), and the APIs that connect product to our ML and data pipelines.
- Evals as product infrastructure. Building evaluation suites that measure whether Terra's agents actually help users: reading transcripts to find capability gaps, defining target behaviors, turning production failures into test cases, and giving the ML team evidence-backed priorities instead of vibes.
- Speed with judgment. Prototyping with AI coding tools, instrumenting what you ship, and killing or doubling down based on usage data.
What We're Looking For
- 4–8 years shipping consumer or B2B product as a full-stack or frontend-leaning engineer, ideally at a startup or on a fast-moving product team.
- Strong product instincts — you've made calls about what to build, not just how, and you can defend them with user evidence.
- Solid TypeScript/JavaScript and React; comfort with at least one backend stack (Python or Node) and relational databases.
- Experience building on LLM APIs or shipping AI-powered features is a strong plus — especially streaming UX, structured outputs, or agentic flows.
- Hands-on experience evaluating AI systems: you've built eval sets or task suites, dug through model transcripts to diagnose failures, and can articulate specifically what model behaviors you'd change and why.
- A systems thinker — when you find a problem, you build the instrumentation or eval that catches its whole class, not just the one instance.
- Care for craft: you sweat copy, states, edge cases, and latency, because in a trust product the details are the product.
- Comfort with ambiguity, multilingual users, and the messiness of Indian government data.
Nice to Have
- Payments/checkout experience in the Indian market (UPI, gateways, refunds).
- Mobile experience (React Native or native).
- Familiarity with agentic eval frameworks or benchmark-style task suites (SWE-bench-style or similar). Daily use of AI coding agents, plus opinions of your own about where they fall short.
- Design sensibility — you can take a feature from rough idea to polished UI without waiting on a designer.
- Experience in fintech, legal tech, or other domains where users make high-stakes decisions.
Your First 90 Days
Days 1–30: Ship and absorb. Land your first production change in week one. Talk to users — buyers, brokers, lenders — and read agent transcripts until you can name the top three places Terra loses user trust. Own one product surface end to end.
Days 31–60: Own an outcome. Take a revenue- or trust-critical flow — a verification product, the conversational experience, a diligence report — and move its metric: conversion, completion, satisfaction, or eval pass rate. Stand up or extend the eval/instrumentation that proves it.
Days 61–90: Set direction. Propose the next product bet backed by what you've seen in usage data and transcripts, and start building it. By now you should be the person the team asks "what should we do here?" about your surface, and your answers should come with evidence, not opinions.
Why This Role
- You'll ship to real users and work on a problem that matters: property fraud and opaque titles cost Indian families dearly.
- Small, senior team with no layers between you and customers; your decisions reach production in days.
- A rare chance to define what AI-native product UX looks like for a category nobody has built before.
- Backed by Y Combinator and top investors, with real revenue and growing usage.