Job Description
Join our winning team as a Senior Systems Engineer
We are seeking a skilled Senior Systems Engineer to join our Enterprise Systems Engineering team. The team is highly skilled, collaborative, diverse, and geographically distributed. Our mission is to build, administer, secure, and continuously improve thousands of servers and systems across the entire enterprise while delivering new solutions that align with strategic business objectives.
The ideal candidate will demonstrate a proven track record in large-scale server administration and broad expertise in configuring and managing on-premises and multi-cloud environments. Experience migrating on-premises systems to the cloud and a strong hands-on background in containerization, Infrastructure as Code, automation, and networking are essential.
This is a hybrid role with in-office expectations determined by location and subject to change based on evolving business needs.
What you'll be doing:
- Build, install, and administer on-premises systems, servers, hardware, and storage in accordance with security standards and operational requirements.
- Architect, integrate, and administer enterprise cloud and hybrid infrastructure solutions, ensuring alignment with business requirements for reliability and disaster recovery.
- Establish and promote best practices, standards, training, and general support for both on-premises and cloud environments.
- Continuously analyze and develop solutions to improve platform performance, reliability, scalability, and cost efficiency across applications and processes.
- Lead design decisions, including researching and prototyping emerging technologies.
- Champion exceptional containerization and cloud platform design and quality.
- Build and maintain monitoring systems and dashboards; develop tools and scripts to automate and streamline server and container administration.
- Provide operational support through on-call rotations, ensuring high availability and timely incident resolution, including root cause analysis.
What we're looking for:
- 7+ years of experience building, hardening, and administering Linux in a production environment.
- 5+ years of hands-on experience with Kubernetes clusters, including setup, configuration, monitoring, and troubleshooting (AWS EKS experience is a plus).
- 5+ years of experience designing, supporting, and migrating on-premises resources to AWS cloud infrastructure, with hands-on experience across core services including EC2, VPC, API Gateway, and related CLI and SDK tooling.
- 3+ years of hands-on experience with configuration management and Infrastructure as Code tools such as Chef, Ansible, Puppet, Terraform, and AWS CDK.
- Proficiency in Python, YAML, Bash, PowerShell, or other scripting and object-oriented languages.
- Broad expertise in DevSecOps, encompassing security and performance best practices for on-premises, cloud, and hybrid architectures, with a demonstrated commitment to continuous learning, documentation, and mentoring.
- Operational ownership of system uptime and quality, including advanced troubleshooting and root cause analysis across diverse systems and technologies.
- Experience with observability tools such as Prometheus, Grafana, and New Relic.
- Experience with Identity and Access Management (IAM) and Privileged Access Management (PAM) solutions.
Preferred qualifications:
- Multi-cloud experience spanning AWS, GCP, and Azure.
- Familiarity with F5, Gloo, Lambda, VPC, load balancing, and diagnosing server- and network-related issues.
- Site Reliability Engineering (SRE) experience.
- Experience with Active Directory and file system management.
- Experience with TLS, digital certificates, and related troubleshooting.
- Familiarity with CloudWatch, Splunk, Logstash, or other log aggregation tools.