Razer Inc.

Senior Site Reliability Engineer

Razer Inc.  •  Singapore, SG (Onsite)  •  5 days ago

Job Description

Joining Razer will place you on a global mission to revolutionize the way the world games. Razer is a place to do great work, offering you the opportunity to make an impact globally while working with a team located across 5 continents. Razer is also a great place to work, providing you the unique, gamer-centric #LifeAtRazer experience that will put you on a path of accelerated growth, both personally and professionally.

Job Responsibilities:

We are looking for Senior Site Reliability Engineers (SRE) to join our AI Software team. In this role, you will ensure the reliability, performance, scalability, and operational excellence of AI products, model-serving infrastructure, and backend API systems. You’ll work closely with software engineers, AI teams and release teams to automate operations, enhance observability, and streamline deployments in a cloud-scale environment. This role is ideal for someone who enjoys building resilient systems, solving complex infrastructure problems, and supporting AI workloads in production.

Essential Duties and Responsibilities

  • Administer, monitor, and manage cloud-scale production environments for AI model APIs, backend services, and high-traffic web systems serving global users.

  • Design and implement fault-tolerant, autoscaling cloud architectures tailored for AI inference workloads, including GPU-based environments and software products.

  • Build automated self-recovery systems to ensure high availability, rapid failover, and cost-efficient resource usage for all software products.

  • Manage and monitor AI model-serving platforms, inference engines, vector databases, data pipelines, and software applications.

  • Ensure reliability and uptime for experimental and production AI software environments.

  • Implement and maintain comprehensive monitoring, logging, and alerting for all AI and backend services.

  • Reduce MTTR through actionable alerts, runbooks, and automated diagnostics.

  • Automate infrastructure using IaC (Terraform/CloudFormation) and configuration management.

  • Improve release workflows and integrate with QA for smooth handoff to Release Candidate testing.

  • Work closely with software engineering, ML engineering, and release management to enhance operational procedures, deployment processes, and incident response workflows.

  • Participate in on-call rotations, incident reviews, and continuous improvement initiatives.

Pre-Requisites:

Qualifications

  • 5+ years of relevant experience in SRE, DevOps, infrastructure engineering, or cloud operations

  • Experience operating production services with significant availability or scaling demands.

  • Strong knowledge of web technologies such as HTTP, REST, SSL, load balancers, and web proxies (NGINX)

  • Comfortable with Linux and Docker administration

  • Basic knowledge of AWS, CI/CD (Jenkins), IaC (Terraform), container orchestration (AWS ECS or K8s), version control (Git), and databases (MySQL, NoSQL)

  • Strong ability to code and script (preferably Bash and Python)

  • Ability to use or quickly pick up a wide variety of open-source technologies and automation tools

  • Understanding of GPU-based workloads and resource scheduling

  • Familiarity with vector databases, embeddings, and inference pipelines

  • Comfort with frequent, incremental code testing and deployment

  • Good analytical skills to debug deployment problems without relying on help from developers

  • Deep hands-on technical expertise and problem-solving skills

  • Ability to work in a collaborative, technically challenging environment with rapidly changing requirements

Education & Experience

  • Bachelor’s or Master’s degree in computer science, AI, or a similar discipline from an accredited institution

Travel Requirements

  • Role based in the Singapore office and may require up to 1 trip per year.

Razer is proud to be an Equal Opportunity Employer. We believe that diverse teams drive better ideas, better products, and a stronger culture. We are committed to providing an inclusive, respectful, and fair workplace for every employee across all the countries we operate in. We do not discriminate on the basis of race, ethnicity, colour, nationality, ancestry, religion, age, sex, sexual orientation, gender identity or expression, disability, marital status, or any other characteristic protected under local laws. Where needed, we provide reasonable accommodations - including for disability or religious practices - to ensure every team member can perform and contribute at their best.

Are you game?

About Razer Inc.

Razer™ is the world’s leading lifestyle brand for gamers.

The triple-headed snake trademark of Razer is one of the most recognized logos in the global gaming and esports communities.

With a fan base that spans every continent, the company has designed and built the world’s largest gamer-focused ecosystem of hardware, software and services.

Razer’s award-winning hardware includes high-performance gaming peripherals and Blade gaming laptops. Razer’s software platform, with over 70 million users, includes Razer Synapse (an Internet of Things platform), Razer Chroma™ (a proprietary RGB lighting technology system), and Razer Cortex (a game optimizer and launcher).

In services, Razer Gold is one of the world’s largest virtual credit services for gamers, and Razer Fintech is one of the largest online-to-offline digital payment networks in SE Asia.

Founded in 2005 and dual-headquartered in Irvine and Singapore, Razer has 18 offices worldwide and is recognized as the leading brand for gamers in the USA, Europe and China.

Industry: Hardware & Semiconductors
Company Size: 1,001-5,000 employees
Headquarters: Irvine, CA
Year Founded: 2005
Website: razer.com