Job Description

Our Purpose

Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we’re helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.

Title and Summary

Lead Site Reliability Engineer
The role of the Business Operations organization is to be the production readiness steward for Mastercard products. As a Business Operations team, we are responsible for ensuring that our platform is stable and healthy. We break down barriers to running our products by fostering developer run ownership and empowering developers to build resilient products. We support our developers during the application build phase in software run principles, including operational design, automation, capacity planning, and monitoring, that lead to fault-tolerant, scalable products. We see the big picture and help create and enforce operations standards while facilitating an agile and learning culture.
We accomplish this transformation by supporting daily operations with a hyper-focus on triage and then root cause, grounded in an understanding of the business impact of our products. The goal of every Biz Ops team is to shift left, becoming more proactive and involved earlier in the development process, and to manage production and change activities proactively to maximize customer experience and increase the overall value of supported applications. Biz Ops teams also focus on risk management by tying all our activities together with an overarching responsibility for compliance and risk mitigation across all our environments. Biz Ops also focuses on streamlining and standardizing traditional application-specific support activities and on centralizing points of interaction for both internal and external partners by communicating effectively with all key stakeholders.

Ultimately, the role of Biz Ops is to align product- and customer-focused priorities with operational needs. We regularly review our run state, not only from an internal perspective but also by providing a feedback loop to our development partners on how we can improve the customer experience of our applications.

Key Responsibilities
• Lead and own the full lifecycle of services—from architecture and design through deployment, operations, and continuous optimization—ensuring scalability, reliability, and alignment with business objectives.
• Analyze platform-level ITSM performance and proactively establish feedback loops with engineering teams, influencing roadmap prioritization to address systemic gaps and improve resiliency.
• Define and drive production readiness standards, including operational design reviews, capacity planning, and launch governance, ensuring services meet reliability and scalability benchmarks before go-live.
• Define and evolve monitoring frameworks for availability, latency, and system health, leveraging metrics and telemetry to proactively prevent incidents and improve service performance.
• Champion automation-first principles to scale systems efficiently, reducing manual toil while improving deployment velocity and overall system reliability.
• Lead the design and governance of CI/CD pipelines, implementing robust validation, operational gates, and best practices to drive consistency, quality, and speed across environments.
• Drive best-in-class incident response practices, including rapid mitigation, stakeholder communication, and blameless postmortems, ensuring continuous improvement and resilience.
• Take a holistic, system-wide approach during critical incidents, connect
• Collaborate effectively across distributed, global teams, ensuring alignment, continuity, and high performance across time zones and technology hubs.
• Act as a technical leader and mentor, developing junior engineers, promoting best practices, and raising the overall bar for engineering excellence within the organization.

All about you
• Bachelor’s degree in Computer Science, Engineering, or a related technical field (e.g., Physics, Mathematics), or equivalent practical experience.
• 8–15 years of relevant experience in Site Reliability Engineering, Infrastructure, or DevOps roles, with a combination of hands-on technical expertise and early leadership responsibilities.
• Strong technical foundation across enterprise platforms, Linux/UNIX systems, operating systems, and database environments (Oracle/SQL, DBA), with the ability to provide technical guidance and support to the team.
• Experience with observability and monitoring tools (e.g., Splunk, Dynatrace), driving improved system visibility, performance, and reliability.
• Solid experience in DevOps and CI/CD practices, with the ability to support and guide automation, deployment pipelines, and operational improvements.
• Proficiency in one or more programming or scripting languages such as Python, Java, Go, C/C++, Perl, or Ruby, with practical application in automation or system
• Strong foundation in Security and/or Enterprise Monitoring environments, with exposure to coding and system-level design.
• Experience designing, analyzing, and troubleshooting large-scale distributed systems, with a strong focus on reliability, scalability, and performance optimization.
• Strong program management capabilities, with a track record of successfully leading large-scale, cross-functional initiatives from concept through execution.
• Extensive experience working across development, operations, and product teams to prioritize initiatives, build strong partnerships, and deliver end-to-end solutions.
• Practical knowledge of cloud platforms, preferably AWS, with familiarity in cloud-native architectures and operational best practices.
• Ability to critically assess existing processes and challenge the status quo, identifying opportunities to improve efficiency, scalability, and overall business impact.

We are seeking site reliability engineers who have an appetite for change and can push the boundaries of what can be accomplished through automation, while managing service levels for some of Mastercard’s most critical security services.

Corporate Security Responsibility

All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization. It is therefore expected that every person working for, or on behalf of, Mastercard is responsible for information security and must:

Abide by Mastercard’s security policies and practices;
Ensure the confidentiality and integrity of the information being accessed;
Report any suspected information security violation or breach; and
Complete all periodic mandatory security trainings in accordance with Mastercard’s guidelines.

About Mastercard

Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we’re building a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.

Industry

IT & Software

Company Size

10,000+ employees

Headquarters

Purchase, NY

Website

mastercard.com
