mthree

Lead Site Reliability Engineer

mthree  •  $140k/yr  •  Charlotte, NC / Plano, TX / Jersey City, NJ (Onsite)  •  4 hours ago

Job Description

**Looking for local candidates**

Want to work in technology in the financial industry?

Our client is seeking a highly motivated Site Reliability Engineer to join a dynamic team. Global Banking Technology is building and scaling Site Reliability Engineering (SRE) across a large, highly regulated banking environment. We are seeking a senior SRE practitioner to lead and accelerate the transformation from traditional L2 production support to an SRE operating model. This role will help define, implement, and embed SRE practices across infrastructure and banking services, enabling measurable reliability outcomes, reduced manual toil, stronger automation, and improved service visibility. The successful candidate will bring proven, hands-on experience implementing SRE in a large corporate bank and will be able to influence operations, engineering, and product partners to institutionalize SRE practices at scale.

About mthree:

Since 2010, mthree has been helping clients solve their business and technological challenges. We are a technology and business consultancy with a global workforce delivering significant business and IT projects in some of the largest financial services organizations worldwide.

Core Services:

  • Consulting and Advisory
  • Managed Services
  • Alumni Graduate Program
  • Alumni Pro Program

We have a global presence and are experts in delivering exceptional quality to our client base, providing consulting services across Risk, Regulation & Compliance; Vendor Products; Application Support; Application Development; Cyber & Information Security; Data Science and DevOps areas.

Our Expert program offers experienced professionals access to top roles in tech, finance, aviation and insurance. Join us to work on groundbreaking technology projects, from international trading platforms to critical applications for leading airlines. We recruit professionals who are eager to fast-track their careers in technology or operations within prestigious global organizations.

Key Responsibilities

SRE Operating Model and Transformation

  • Lead the design and execution of the SRE adoption approach across Global Banking, including the transition path from traditional L2 support to reliability engineering.
  • Establish practical engagement patterns between SRE, application teams, and platform teams and help teams adopt a consistent way of working.

Reliability Measurement and Decisioning

  • Drive adoption of Critical User Journeys, Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets for priority services, ensuring metrics reflect user experience and business outcomes.
  • Help teams implement error-budget-based decision-making that balances reliability, delivery velocity, and operational risk.

Toil Reduction, Automation, and Engineering Excellence

  • Identify operational toil and lead initiatives to eliminate it through automation, self-healing patterns, runbook automation, and operational tooling improvements.
  • Establish and implement a model for partnering with engineering teams to build reliability into services through design improvements, better instrumentation, and resilience patterns.

Incident and Problem Management Excellence

  • Improve production outcomes through strong incident response practices, including major incident triage support, root cause analysis, post incident reviews, and preventive engineering actions.
  • Strengthen problem management with a focus on reducing repeat incidents, technical debt risk, and manual intervention.

Observability and Tooling Enablement

  • Establish practical observability standards across logs, metrics, traces, dashboards, and alerting to reduce noise, improve signal quality, and shorten time to detect and restore service.
  • Partner across platform, tooling, and service management teams to align SRE needs with enterprise tooling and processes.
  • Work with tools such as Splunk, Dynatrace, and OpenTelemetry (OTel) to instrument end-to-end observability for services, ensuring teams are able to adopt and use the platforms.

Stakeholder Management and Change Leadership

  • Influence leaders across operations, engineering, and product to adopt SRE principles and measurable reliability goals.
  • Communicate clearly with senior stakeholders, including executive updates on progress, adoption, and outcomes.

Required Qualifications

  • ~10–15+ years in SRE, software engineering, or infrastructure engineering.
  • Significant hands-on experience in Site Reliability Engineering and in implementing SRE practices across large-scale, complex services is essential.
  • Demonstrated experience leading an SRE transformation in a corporate banking environment (or similarly regulated financial services enterprise).
  • Proven ability to implement and scale SLO/SLI and error budget approaches, and to operationalize them across multiple teams and services.
  • Strong engineering background with the ability to drive automation and reduce manual toil through code, tooling, and process redesign.
  • Deep knowledge of incident response, problem management, root cause analysis, and operational resilience practices in mission critical environments.
  • Strong stakeholder management skills, able to influence technology and business partners and communicate effectively at senior levels.

Key Competencies

  • Transformation leadership in complex, matrixed environments
  • Strong engineering judgment and pragmatic problem solving
  • Ability to simplify, standardize, and scale operating practices
  • Calm and effective leadership during production events
  • Excellent written and verbal communication

Preferred Qualifications:

  • Experience with high-availability banking platforms and 24x7 operational expectations.
  • Familiarity with observability tools and building SRE communities of practice.

Why Join?

  • Be a Pioneer: Lead the charge in transforming how reliability engineering is approached in the banking sector.
  • Collaborative Environment: Work with a diverse team that values innovation, teamwork, and excellence.
  • Professional Growth: Take on a pivotal role that will challenge and expand your skills in a dynamic and fast-paced industry.

At mthree, our values support courageous teammates, needle movers, and learning champions all while striving to support the health and well-being of all employees.  We take great pride in celebrating the diversity of each individual who contributes to making mthree the company it is today and will be in the future. We value diversity both within mthree and with our partner companies, and we're proud to provide an environment where all our colleagues can flourish. That means promoting a strong culture of equality but, most importantly, inclusion.

We are committed to fair, transparent pay, and we strive to provide competitive compensation in addition to a comprehensive benefits package. The base pay rate for this position is $140,000 to $170,000 USD.

This pay rate represents mthree's good faith and reasonable estimate of the base pay for this role at the time of posting, based on the locations listed in the job advertisement. It is anticipated that qualified candidates selected for a placement will receive this pay rate as a starting salary once onsite with the mthree client; however, the ultimate salary offered for this role may be higher or lower and will be set based on a variety of non-discriminatory factors, including, but not limited to, geographic location, skills, and competencies.

Applicants must be currently authorized to work in the United States on a full-time basis. The Company will not sponsor applicants for work visas.

About mthree

mthree helps organisations succeed by building job-ready teams with the most in-demand skills.

We bridge the skills gap at every level in technology, business and banking. Whether we’re deploying trained emerging talent and seasoned experts or reskilling existing employees, we provide the people and skills you need across the globe.

We offer new ways to create high performance teams – complementing traditional strategies like recruitment, internal graduate programmes, and the big consultancies.

Industry: IT & Software
Company Size: 201-500 employees
Headquarters: London, GB
Year Founded: 2010