Job Description

Who we are

Our client operates in a highly technology-driven environment, where digital solutions play a key role in shaping internal processes and external interactions. As part of an ongoing transformation journey, there is a strong focus on enhancing engineering capabilities, adopting agile delivery models, and modernizing the existing IT landscape through strategic, long-term investments, including cloud technologies.

As a Site Reliability Engineer, you will contribute to the overarching implementation and operation of our client's Online Banking platform in the Google Cloud and become a central part of the feature squads, following the paradigm "you build it, you run it".

Location: Bucharest

What You’ll Be Doing

Define Service Level Objectives (SLOs) and enable an end-to-end view of customer satisfaction, based on best practices for setting up Service Level Indicators (SLIs), to create effective strategies for maintaining and improving system performance and availability
Collaborate with Business Functional Analysts and Solution Architects to identify solution design changes that improve the resilience of technical solutions early on
Consult and guide the squad on the prioritization of reliability improvements and actively deliver them as part of the sprint
Implement reliability and resilience patterns hands-on, such as auto-scaling, circuit breakers, bulkheads, rate limiters, retry mechanisms, etc.
Actively work on service request fulfilment, incident management, and problem management to identify and reduce toil and MTTR using engineering best practices
Align with the SRE chapter function on, and contribute to, state-of-the-art SRE best practices, e.g. Distributed Tracing, OpenTelemetry, and Chaos Engineering
Be a knowledge and skill multiplier for your profession as a Lead of the Site Reliability Engineer population
Increase the seniority of the overall Site Reliability Engineer chapter by establishing events and procedures, and foster a culture of high standards
Lead the engineers in your profession and help them improve every day

What We’re Looking For

Bachelor’s degree in Computer Science, Engineering, or related field
Minimum 5 years of proven work experience as a Reliability Engineer or in a similar role
Expert knowledge of and hands-on experience with applications hosted on cloud platforms such as Google Cloud Platform, as well as with Docker / Kubernetes in combination with Google Kubernetes Engine (GKE), Terraform, or similar technology
Experience in resilient software development in Python/Java and the use of modern CI/CD pipelines, e.g. GitHub, GitHub Actions, Bitbucket, Helm
Strong experience in setting up observability, monitoring, and self-healing solutions, for instance with New Relic, Splunk, Google Cloud Operations, Lightstep, and Ansible
Very good knowledge of security standards (e.g. TLS, OAuth2, KMS, Vault, Admission Controllers, Let's Encrypt) and microservice architectures, and experience with API management using Apigee or WSO2
Proactive attitude and collaborative team-player mindset paired with self-confidence
Ability to keep your cool and your eye for detail even in stressful situations where time matters
A creative approach to solving technical problems
Excellent communication skills in English

About NTT DATA Romania

Website

nttdata.ro