
We are seeking a skilled AWS Cloud Infrastructure Support Engineer (L1/L2) to join our Managed Services team. The role involves providing 24x7 operational support for cloud-hosted environments, ensuring high availability, performance, and security of AWS infrastructure for multiple clients. The candidate will be responsible for incident resolution, monitoring, troubleshooting, and escalation support within defined SLAs.
Monitor AWS cloud infrastructure using monitoring tools (CloudWatch, Datadog, etc.)
Respond to alerts, incidents, and service requests in a 24x7 shift model
Perform initial triage and categorize incidents (severity assessment)
Execute predefined runbooks and operational playbooks
Handle basic troubleshooting of:
- EC2 instance health
- EBS volume status
- ELB / ALB connectivity issues
- RDS basic checks (CPU, storage, availability)
- IAM access issues (basic validation)
Escalate complex issues to L2/L3 teams with proper documentation
Maintain incident logs and update ticketing systems (ServiceNow/Jira)
Ensure SLA adherence and timely communication updates
Perform deep-dive troubleshooting of AWS infrastructure issues
Analyze root cause of incidents and provide resolution or workarounds
Manage and support AWS services, including EC2, S3, VPC, IAM, RDS, CloudFront, Route 53, and Lambda (basic to intermediate)
Handle performance issues, scaling concerns, and service degradation
Support deployment validation and post-deployment checks
Assist in automation of routine tasks using scripts (Shell, Python, AWS CLI)
Work closely with DevOps and Engineering teams for problem resolution
Participate in RCA (Root Cause Analysis) documentation
Improve operational runbooks and incident response processes
2–4 years of experience in AWS cloud infrastructure support (L1/L2 roles)
Strong understanding of AWS core services (EC2, S3, VPC, IAM, RDS)
Experience with monitoring tools (CloudWatch, Datadog, Nagios, etc.)
Hands-on experience with Linux/Unix systems administration
Basic scripting knowledge (Shell / Python preferred)
Experience with ITSM tools (ServiceNow, Jira, etc.)
Understanding of networking concepts (DNS, TCP/IP, Load Balancing)
Ability to work in 24x7 rotational shifts
AWS certifications (AWS Cloud Practitioner / Solutions Architect Associate)
Experience in MSP or multi-client environments
Exposure to Infrastructure as Code (Terraform, CloudFormation)
Knowledge of CI/CD pipelines
Basic security best practices in AWS environments
Strong problem-solving and analytical skills
Ability to work under pressure in production environments
Good communication skills for incident reporting and escalation
Team collaboration in a shift-based environment
Ownership mindset for incident resolution
24x7 rotational shifts (including nights, weekends, and holidays)
Production support for multiple enterprise AWS environments
High focus on SLA-driven service delivery

NorthBay is an AWS Premier Consulting Partner and also partners with AI21 Labs, VMware, CloudRail, and SAP in support of our customers’ AWS cloud journeys. NorthBay helps companies transform their business by unlocking the value of their data in the cloud so they can gain agility and speed in decision-making and innovation.
Our specialities include:
- Generative AI
- Managed Service Provider
- Cloud Migration and Modernization Services
- Cloud Application Development
- Data Lake/Data Warehouse
- Machine Learning & AI
- DevOps Enablement
- Staff Augmentation
- Performance & Optimization