Job Description
Senior technical leader and architectural owner for ARIP Layers 2–4: Semantic Fabric (KGIP consumption from DEAA), Decisioning (ARIP-side D-01/D-04/D-11–13), and Agent Runtime (LangGraph / CrewAI, HITL gates, Trust Gate Framework, observability). Co-chairs the Architecture Review Board. Leads 5–6 ICs including 2 AI Engineers and 1 ML Engineer.
Remote candidates outside of Thailand are welcome to apply.
Key Responsibilities:
- Own Layer 2–4 reference architecture; co-chair Architecture Review Board (fortnightly) with Head of ARIP; author ADRs for D-004 (KG build/buy), D-005 (agent framework), D-006 (LLM provider)
- Build and own Agent Runtime (Layer 4): framework deployment, Trust Gate Framework (Shadow → Recommender → Executor), Agent Registry, HITL gates, OpenTelemetry + Langfuse observability
- Finalise KGIP consumption contract with DEAA: KG access, Semantic Layer API, Data Product Catalogue subscription, Event Stream ingestion
- Build Layer 3 ARIP-side decisioning: D-01 Decision Orchestrator, D-04 Trust Gate Service, D-11/D-12/D-13 agent-side helpers; consume DEAA's LLM Gateway (D-02) and Vector Search (D-03)
- Own the per-agent cost meter, Layer 4 SLOs (P95 invocation latency, success rate, HITL response time), and incident response for runtime failures
- Lead and mentor 5–6 ICs (platform-side Senior SWEs, 2 AI Engineers, 1 ML Engineer) on agent engineering discipline and eval-driven development
Requirements:
- 8+ years of software engineering experience; 3+ years as Tech Lead or Staff Engineer owning platform standards with architectural decision authority
- Production experience with multi-agent orchestration, HITL gates, and eval-driven CI, using LangGraph, CrewAI, AutoGen, or equivalent (not just RAG demos)
- Expert-level production experience with at least one major LLM provider (Azure OpenAI / Anthropic / Bedrock / Vertex), including cost and latency optimisation
- Strong background in distributed systems, observability (OpenTelemetry, Langfuse), and API-first service contract discipline
- Production experience with knowledge graphs or semantic layers preferred (KGIP consumption design requires it); senior-level ADR authorship and Architecture Review Board (ARB) facilitation
- Calibre: Staff/Principal Engineer from companies such as Agoda, Grab, Shopee, or AI-native startups (Anthropic-adjacent), with multi-agent production experience