Job Description
Senior AI Platform Engineer
Build and support a global AI platform in the insurance industry using Azure cloud infrastructure, AI tools and services, and DevOps technologies. This hybrid Toronto-based role focuses on platform engineering, automation, and operational support within a rapidly evolving AI environment supporting enterprise-scale systems.
What is in it for you:
• Salaried: $60-69 per hour.
• Incorporated Business Rate: $70-80 per hour.
• 9-month contract.
• Full-time position: 37.50 hours per week.
• Hybrid: 3 days/week in Toronto office.
Responsibilities:
• Build and operate AI platform services and abstractions that support diverse AI use cases with automation-first delivery.
• Develop reusable reference patterns and inner-source components that meet reliability, security, and compliance standards.
• Implement shared runtimes for multi-agent coordination, state management, memory persistence, and messaging.
• Design interoperable APIs and SDKs used by data scientists and developers to build agent-powered applications.
• Maintain and improve CI/CD pipelines and developer toolchains for AI services.
• Evaluate emerging AI and ML infrastructure capabilities and introduce tools to improve developer productivity and reliability.
• Develop and operate scalable backend services supporting high-traffic agent interactions, retrieval operations, and real-time execution flows.
• Use cloud-native technologies including containers, orchestration, infrastructure as code, and CI/CD to deliver reliable and cost-efficient services.
• Optimize runtime performance across CPU, GPU, and accelerator workloads.
• Develop standardized retrieval frameworks including search, embeddings, and knowledge connectors.
• Build and optimize short-term and long-term memory and episodic state abstractions for agent workflows.
• Integrate structured and unstructured data sources through unified connectors and retrieval bridges.
• Build tool interfaces enabling agents to interact with enterprise systems, APIs, databases, and automations.
• Create reusable patterns for tool definitions, schema validation, safe execution, rate limiting, and auditability.
• Collaborate with regional teams to onboard systems and workflows into the global ecosystem.
• Build and support the platform and service capabilities required for AI governance.
• Develop observability capabilities including traces, logs, action tracking, feedback loops, and performance metrics.
• Provide mechanisms for feedback, oversight, and evaluation of agent behavior.
• Build templates, scaffolding, and CLI tools to support development of AI-powered applications.
• Collaborate with global engineering, security, and governance teams to support regulatory and data residency needs.
• Mentor engineering and data science teams on platform capabilities and design patterns.
• Contribute to documentation, playbooks, and enablement resources.
What you will need to succeed:
• Bachelor’s degree in Computer Science, Computer Engineering, or a related technical field.
• 5–7 years of experience in backend, platform, or cloud systems engineering, including experience using Jenkins, GitHub, and Terraform.
• Proficiency with Python and at least one of Java, Scala, TypeScript, or a similar language for building backend services and automation, including an understanding of Java.
• Hands-on experience with Azure cloud infrastructure, including Azure Kubernetes Service (AKS), containers, and CI/CD.
• Understanding of AI tools and services, including LLM systems, retrieval architectures, embeddings, vector stores, prompt or tool orchestration fundamentals, and AI/ML operations (MLOps).
• Strong grasp of API design, asynchronous workflows, concurrency, and system reliability.
• Familiarity with security, governance, and compliance concepts related to AI or data systems.
• DevOps skills including GitHub, Jenkins, and Terraform.
• Ability to collaborate across global teams, translate business problems into platform capabilities, and manage stakeholders effectively.
• Strong communication skills and ability to support day-to-day AI platform operations.
• Ability to work in a fast-moving, evolving environment, take ownership, and help shape foundational processes, tooling, and standards.
• Eagerness to learn and grow with new technologies within the platform and AI ecosystem.
• Ability to support a global program, including after-hours coverage across time zones.
Why Recruit Action?
Recruit Action (agency permit: AP-2504511) provides recruitment services through quality support and a personalized approach. As part of the screening process, some applications may be reviewed using artificial intelligence tools. Only candidates who meet the hiring criteria will be contacted.