Spatial Front, Inc. (SFI)—a two-time USA Today Top Workplaces awardee and Washington Post Top Workplaces honoree—is seeking an IT Operations Lead to join our growing team. The ideal candidate will lead an IT operations team responsible for the administration, monitoring, scheduling, execution, and support of enterprise batch processing and automated workflows across a complex PeopleSoft platform in a federal environment.
This role oversees production and test operations, operational analysts, and day-to-day support activities involving Stonebranch, Ab Initio Control Center, Phire, GoAnywhere, Windows/RDP, UNIX, VDI, and Azure DevOps (ADO) Boards. The successful candidate will combine strong operational leadership with a hands-on understanding of workload automation, incident response, runbooks and SOPs, release support, and cross-team coordination to maintain reliable business processing and system interfaces.
Work Location: Hybrid, On-site - Arlington, VA
Key Responsibilities
- Lead and supervise Operational Analysts responsible for enterprise batch processing, workload automation, job scheduling, monitoring, execution support, and day-to-day operational activities across production and test environments.
- Oversee the configuration, scheduling, maintenance, and governance of batch jobs, workflows, calendars, triggers, dependencies, and execution windows in Stonebranch and related tools to ensure alignment with business processing requirements and service level agreements (SLAs).
- Direct 24x7 operational monitoring of workflows, jobs, alerts, file transfers, and system health as required; ensure timely investigation and resolution of job failures, delays, missed dependencies, interface issues, and performance degradation.
- Coordinate production releases and test support activities involving new workflows, interface changes, configuration changes, and operational requirements, working closely with multiple Agile Release Trains (ARTs), infrastructure teams, PeopleSoft administrators, developers, and trading partners.
- Lead incident response, outage communications, escalation management, and recovery activities, including execution oversight for approved rollback, restart, and recovery procedures to restore service and minimize business disruption.
- Develop, maintain, and enforce operational runbooks, SOPs, scheduling standards, naming conventions, workflow documentation, and knowledge articles related to batch execution, interfaces, and operational procedures.
- Track processing trends, recurring issues, SLA performance, and operational metrics; produce reports on job execution, failures, audit history, incident patterns, and service health, and drive continuous improvement based on findings.
- Ensure adherence to organizational security policies, operational controls, audit requirements, and change management processes; participate in major incident reviews, release readiness assessments, and post-incident lessons-learned sessions.