Job Description
A. PROFILE
Role Title: Site Reliability Senior Engineer
Reporting to: Engineering Manager - DevOps
Division: Information & Communication Technology
Department / Section: Technology & Information
B. CONTEXT
Purpose: This role contributes to the ICT planning team. This includes strategic planning, solution roadmaps, capacity planning, and innovation.
Context: The Technology Unit within OM is the backbone of the organization, providing all technology services that enable OM to deliver its services to customers across all technology platforms, 24/7/365. The quality of the customer experience sits within this BU, and it therefore plays a significant role in the delivery of revenue and satisfaction targets.
ICT Planning plays a vital role in this context by ensuring that ICT systems meet demand and that ICT strategy is aligned with U9 business strategy and vision.
C. ROLE ACCOUNTABILITIES
- Lead the design, development, and maintenance of a robust and efficient DevOps pipeline to enable continuous integration and delivery of software products.
- Configure and manage automation tools such as Ansible to streamline deployment and configuration management processes.
- Containerize applications using Docker and orchestration tools to enable scalability and portability.
- Maintain and enhance version control systems, primarily Git, to ensure smooth code collaboration and reliable change tracking.
- Plan and implement integration with multiple third-party systems such as infrastructure, core, ICT, and public cloud systems.
- Develop and maintain microservices using Python, adhering to best practices and coding standards.
- Apply expertise in Oracle Linux and SQL to optimize database performance and troubleshoot issues.
- Collaborate closely with software developers, providing timely support and deploying microservice solutions in the IT environment.
- Plan capacity and scaling for multiple applications, ensuring efficient development, maintenance, and performance tuning.
- Monitor system performance, analyse metrics, and implement proactive measures to ensure high availability and scalability.
- Conduct application performance analysis and reporting for environment-related matters.
- Participate in incident management and root cause analysis, identifying and resolving issues to minimize downtime and improve system reliability.
- Work with industry collaborators or research institutes to explore potential new business streams in areas such as automation and process efficiency.
- Undertake any other related or ancillary duties and responsibilities assigned based on U9 business and operational needs.
D. KEY PERFORMANCE INDICATORS
- Time to market for IT application environments
- Scalability of IT applications, with elastic capacity to scale up and down
- Seamless runtime (availability) of IT applications >= 99% after go-live
- All systems updated with the latest security patches
E. WORKING RELATIONSHIPS & DECISION MAKING
Interacts with:
Internal:
- Infrastructure team, IT/Network team
- Software development team
- ICT demand team
- ICT Operations team
External:
- Infrastructure vendor
- Security vendor
Decision Making
- Impact analysis approval
- Solution design approval
- Security patch and assurance approval
F. EXPERIENCE AND QUALIFICATIONS
Minimum Experience & Essential Knowledge
- Proven ability to translate business requirements into operating technologies.
- 3 to 5 years of relevant experience in the telecom industry.
- Good experience in system administration.
Minimum Entry Qualifications