Job Description
Meta is seeking a Staff Systems Engineer to design and build the foundational software infrastructure that powers products used by billions of people worldwide. In this role, you will architect and own large-scale distributed systems, low-level platform software, and critical infrastructure components that underpin Meta's family of applications and services. You will drive technical strategy across system reliability, performance, and scalability, partnering closely with product engineering, infrastructure, and operations teams to deliver systems that are resilient, efficient, and built to evolve. This is a high-impact opportunity to shape the systems engineering culture and technical direction at one of the world's most complex software organizations.
Responsibilities
* Architect and own large-scale distributed systems and platform infrastructure components, driving end-to-end technical design from requirements through production
* Lead the design and implementation of high-performance, low-latency systems with well-defined service level objectives, dashboards, and incident response runbooks
* Identify systemic reliability risks, reduce failure surface across service dependencies, and drive resiliency testing including overload and outage scenarios
* Establish and enforce systems engineering best practices across the team, including safe rollout strategies, feature flagging, staged releases, and automated deployment pipelines
* Use instrumentation and profiling to identify performance bottlenecks, establish visibility into key system metrics, and drive measurable improvements in throughput and latency
* Collaborate with cross-functional partners across product engineering, data infrastructure, and operations to align on technical requirements and deliver scalable platform solutions
* Proactively incorporate privacy, security, and integrity principles into system design at early engineering stages, partnering with relevant teams to apply appropriate safeguards
* Mentor other engineers on systems design patterns, debugging methodologies, and AI-accelerated development workflows, and contribute to onboarding and engineering programs
* Drive roadmapping and technical strategy for one or more platform areas, communicating trade-offs and architectural decisions clearly to both engineering and non-engineering stakeholders
* Leverage AI tools and automation to accelerate development velocity, reduce toil, and improve the reliability and observability of owned systems
Qualifications
* Bachelor's degree in Computer Science, Computer Engineering, a relevant technical field, or equivalent practical experience
* 8+ years of experience designing and implementing large-scale distributed systems or platform infrastructure software
* Experience owning system reliability end-to-end, including defining service level objectives, building observability tooling, and leading incident response and retrospectives
* Experience leading technical design of complex systems, including evaluating architectural trade-offs and driving cross-team alignment on implementation decisions
* Experience with performance profiling, capacity planning, and optimization of high-throughput or low-latency systems
* Track record of successfully delivering major infrastructure initiatives, including coordinating rollouts, migrations, and dependency management across multiple teams
* Demonstrated ongoing AI skill development (e.g., prompt/context engineering, agent orchestration) and staying current with emerging AI technologies
* Demonstrated ability to integrate AI tools to optimize/redesign workflows and drive measurable impact (e.g., efficiency gains, quality improvements)
* Experience adhering to and implementing responsible, ethical AI practices (e.g., risk assessment, bias mitigation, quality and accuracy reviews)
* Experience with systems programming in C, C++, Rust, or similar low-level languages in a production environment at scale
* Background in contributing to or defining organization-wide systems engineering standards, coding guidelines, or reliability frameworks
* Experience applying AI-assisted development workflows to systems engineering problems, including code generation, anomaly detection, or automated root cause analysis
* Demonstrated ability to build and improve developer tooling, automation frameworks, or internal platform abstractions that measurably improve engineering efficiency across teams