Job Description
We are seeking a Senior Operations & Maintenance Support Specialist to provide advanced technical support with a specialized focus on Kubernetes and Virtual Desktop Infrastructure (VDI) in a government environment. This role combines Tier 1 and Tier 2 Site Reliability Engineering (SRE) support responsibilities with deep technical expertise in Azure Kubernetes Service (AKS) and VDI technologies. The ideal candidate will serve as a technical escalation point, mentor junior team members, and provide expert-level troubleshooting for complex cloud infrastructure and application issues while ensuring system reliability and an optimal user experience. Candidates must be US citizens and hold an active TS/SCI security clearance.
Sign-on Bonus available
Responsibilities include:
• Serve as the first point of contact for end-user technical support requests related to cloud services, hosted applications, and VDI functionality, providing responsive customer service and in-depth troubleshooting
• Create, update, and maintain technical support tickets in Jira with accurate documentation, proper categorization, and tracking of resolution times to maintain SLA compliance
• Escalate complex issues to Tier 2/3 support teams, SREs, and Platform Engineers with comprehensive documentation, coordinating with Development, DevOps, and Architecture teams for resolution
• Manage VDI technologies including troubleshooting and repairing User Virtual Machines (UVMs), building and deploying base images for supported operating systems, and patching running UVMs
• Build, package, deploy, and manage Salt States to install applications in virtual machines, operating the Salt Master for application deployment across the VDI environment
• Monitor system performance, identify patterns indicating systemic issues, and support maintenance windows with user communication during planned outages
• Maintain and update knowledge base articles, contribute to support procedures and troubleshooting guides, and participate in continuous service improvement initiatives
• Work with cloud providers for Tier 1 and Tier 2 support issues, escalating Tier 3+ issues appropriately
• Other duties as assigned
Technical Experience & Qualifications
• Citizenship: Must be a US citizen
• Education: Bachelor's degree in Computer Science, Information Technology, or related technical field
• Experience: Minimum 6 years of professional experience in technical support, operations, SRE, or infrastructure engineering roles
• Security Clearance: Active TS/SCI clearance (Required)
• Kubernetes Expertise: Advanced hands-on experience with Kubernetes or Azure Kubernetes Service (AKS), including troubleshooting pods, services, deployments, networking, and cluster operations
• VDI Expertise: Deep experience with Virtual Desktop Infrastructure technologies, particularly Azure Virtual Desktop (AVD), including architecture, troubleshooting, and optimization
• Configuration Management: Advanced proficiency with Salt, Ansible, or similar configuration management tools with experience developing complex automation
• Containerization: Strong experience with Docker, container troubleshooting, and containerized application support
• SRE Practices: Understanding of Site Reliability Engineering principles including monitoring, incident response, and system reliability
• Cloud Platforms: Advanced knowledge of Azure cloud services with focus on compute, networking, and storage
• Operating Systems: Expert-level troubleshooting in Windows and Linux environments
• Monitoring & Observability: Experience with Azure Monitor, Prometheus, Grafana, or similar monitoring tools for Kubernetes and infrastructure
• Scripting & Automation: Proficiency in PowerShell, Python, or Bash for automation and troubleshooting
• Incident Management: Advanced experience with Jira or similar systems, including leading complex incident resolution
• Technical Leadership: Proven ability to mentor team members and provide technical guidance