AIA

System Reliability Engineer, Consultant

AIA  •  Kuala Lumpur, MY (Onsite)  •  3 months ago

Job Description

At AIA we’ve started an exciting movement to create a healthier, more sustainable future for everyone.

As pioneering innovators for over 100 years, we’re now transforming our organisation to be faster, simpler and more connected, so that we are even better equipped to develop digital solutions and experiences that help more people live Healthier, Longer, Better Lives.

To get there, we need people with tech/digital/analytics expertise and passion to help develop positive, sustainable change through digitally enhanced experiences that will impact the lives of millions of people and create a healthier future for everyone.

If you believe in developing a better tomorrow, read on.

About the Role

We are looking for a System / Site Reliability Engineer (SRE) to help ensure the reliability, scalability, and performance of our enterprise systems and services. In this role, you will apply software engineering principles to operations, partner closely with development and infrastructure teams, and build automation that strengthens system stability and efficiency. You will play a pivotal role in bridging the gap between software development and IT operations, driving a culture of resilience, observability, automation, and proactive problem‑solving.

Key Responsibilities

1. Ensure System Reliability & Availability

  • Monitor and report on application performance, and highlight any deviations or issues.

  • Collaborate with application engineers and developers to identify root causes and implement durable fixes.

2. Incident Management & Root Cause Analysis

  • Participate as a Subject Matter Advisor during production incidents and outages.

  • Provide insights backed by system monitoring, code review, and database analysis.

  • Support post‑mortem reviews and drive follow‑up actions.

3. Automation & Tooling

  • Automate operational tasks such as monitoring, alerts, and recovery processes.

  • Build scripts and internal tools to eliminate manual toil and improve operational efficiency.

4. Monitoring & Observability

  • Implement telemetry and observability practices to track system health, latency, and error rates.

  • Manage the Dynatrace platform and its integrations with application services.

  • Support teams in designing dashboards and visualization setups.

5. Security & Compliance

  • Work with Security teams to ensure systems comply with regulatory and industry standards (e.g., PCI‑DSS, GDPR).

  • Implement necessary access controls, encryption, and audit capabilities within SRE scope.

6. Capacity Planning & Performance Optimization

  • Analyze usage trends to forecast demand and support scaling decisions.

  • Contribute to cost‑performance optimization efforts across infrastructure and applications.

  • Collaborate closely with development, QA, and infrastructure teams to embed reliability into the SDLC.

7. Documentation & Knowledge Sharing

  • Maintain clear and up‑to‑date operational documentation, runbooks, and architecture diagrams.

  • Champion SRE principles across the organization to foster resilience and accountability.

Job Requirements

Education

  • Bachelor’s degree in Computer Science, Software Engineering, IT, or related fields.

Experience

  • 3–5 years of experience in SRE, DevOps, or Software Engineering roles.

  • Experience supporting front‑end applications in production environments, ideally within financial services or other regulated industries.

Technical Skills

  • Strong understanding of front‑end performance monitoring and instrumentation.

  • Hands‑on experience with Real User Monitoring (RUM), Synthetic Monitoring, and APM tools (e.g., Dynatrace, New Relic, Datadog).

  • Proficiency in building dashboards and alerts using Dynatrace, Grafana, Prometheus, Elastic Stack, or Splunk.

  • Familiarity with OpenTelemetry for distributed tracing.

  • Scripting skills in Python, Bash, or JavaScript.

  • Experience with CI/CD pipelines and branch‑based workflows (e.g., GitHub Flow).

  • Practical experience with cloud technologies (AWS or Azure).

  • Knowledge of Docker and Kubernetes.

  • Understanding of secure coding practices for front‑end applications.

  • Awareness of financial compliance standards such as PCI‑DSS.

Why Join Us?

  • Be part of a high‑impact team shaping system resilience across the enterprise.

  • Work with modern observability and automation technologies.

  • Influence engineering culture through SRE best practices.

  • Opportunities to innovate and drive real improvements in system reliability.

Build a career with us as we help our customers and the community live Healthier, Longer, Better Lives.

You must provide all requested information, including Personal Data, to be considered for this career opportunity. Failure to provide such information may influence the processing and outcome of your application. You are responsible for ensuring that the information you submit is accurate and up-to-date.

About AIA

AIA Group Limited and its subsidiaries (collectively “AIA” or the “Group”) comprise the largest independent publicly listed pan-Asian life insurance group. It has a presence in 18 markets – wholly-owned branches and subsidiaries in Mainland China, Hong Kong SAR(1), Thailand, Singapore, Malaysia, Australia, Cambodia, Indonesia, Myanmar, New Zealand, the Philippines, South Korea, Sri Lanka, Taiwan (China), Vietnam, Brunei and Macau SAR(2), and a 49 per cent joint venture in India. In addition, AIA has a 24.99 per cent shareholding in China Post Life Insurance Co., Ltd.

The business that is now AIA was first established in Shanghai more than a century ago in 1919. It is a market leader in Asia (ex-Japan) based on life insurance premiums and holds leading positions across the majority of its markets. It had total assets of US$328 billion as of 30 June 2025.

AIA meets the long-term savings and protection needs of individuals by offering a range of products and services including life insurance, accident and health insurance and savings plans. The Group also provides employee benefits, credit life and pension services to corporate clients. Through an extensive network of agents, partners and employees across Asia, AIA serves the holders of more than 43 million individual policies and over 16 million participating members of group insurance schemes.

AIA Group Limited is listed on the Main Board of The Stock Exchange of Hong Kong Limited under the stock codes “1299” for HKD counter and “81299” for RMB counter with American Depositary Receipts (Level 1) traded on the over-the-counter market under the ticker symbol “AAGIY”.

(1) Hong Kong SAR refers to the Hong Kong Special Administrative Region.

(2) Macau SAR refers to the Macau Special Administrative Region.

Industry: Finance & Insurance
Company Size: 10,000+ employees
Headquarters: Central, HK
Website: aia.com