iLink Digital

Senior Site Reliability Engineer

iLink Digital  •  Chennai, IN (Onsite)  •  26 days ago

Job Description


About the Company:


iLink Digital is a global software solution provider and systems integrator that delivers next-generation technology solutions to help clients solve complex business challenges, improve organizational effectiveness, increase business productivity, realize sustainable enterprise value, and transform their business inside-out. iLink integrates software systems and develops custom applications, components, and frameworks on the latest platforms for IT departments, commercial accounts, application service providers (ASPs), and independent software vendors (ISVs). iLink solutions are used in a broad range of industries and functions, including healthcare, telecom, government, oil and gas, education, and life sciences. iLink’s expertise includes Cloud Computing & Application Modernization, Data Management & Analytics, Enterprise Mobility, Portal, Collaboration & Social Employee Engagement, Embedded Systems, and User Experience design.


What makes iLink's offerings unique is the fact that we use pre-created frameworks, designed to accelerate software development and implementation of business processes for our clients. iLink has over 60 frameworks (solution accelerators), both industry-specific and horizontal, that can be easily customized and enhanced to meet your current business challenges.


Requirements


  • 6–10 years of experience in SRE, DevOps,
    infrastructure and production support engineering roles.

  • Proven experience managing multi-cloud
    environments (AWS + Azure).

  • Demonstrated experience handling P1/P2 production
    incidents in cloud environments.

  • Familiarity with Prometheus, Grafana, Datadog, or
    Splunk.

  • Design, deploy, and manage Kubernetes clusters
    for production workloads at scale.

  • Architect and maintain PostgreSQL databases —
    performance tuning, HA setup, backup/restore strategies.

  • Build and manage cloud infrastructure on AWS and
    Azure using Terraform and Ansible.

  • Lead vulnerability management programs —
    identify, prioritize, and remediate security risks across the stack.

  • Define and enforce SLOs, SLIs, and error budgets;
    drive reliability improvements across services.

  • Implement IaC best practices, automate
    provisioning pipelines, and reduce manual toil.

  • Collaborate with development teams on capacity
    planning, disaster recovery, and incident post-mortems.

  • Build and maintain monitoring, alerting, and
    observability frameworks (Prometheus, Grafana, ELK, etc.).

  • Lead end-to-end incident management — detection,
    triage, escalation, resolution, and communication.

  • Serve as an on-call engineer; manage and respond
    to alerts and production incidents effectively.

  • Conduct blameless post-mortems and implement
    action items to prevent recurrence.

  • Monitor system health using dashboards and
    alerting tools; proactively identify degradation risks.

  • Collaborate with Dev, QA, and infrastructure
    teams to identify and reduce toil and failure points.

  • Support Kubernetes workloads and assist in
    troubleshooting cluster-level issues.

  • Work across AWS and Azure environments for
    incident containment and recovery.

  • Maintain and improve runbooks, playbooks, and
    incident response documentation.

  • Strong understanding of networking, security, and
    distributed systems.

  • Excellent communication skills for cross-team
    collaboration and post-mortem documentation.

  • Experience with Helm, ArgoCD, or GitOps
    workflows.


Benefits


  • Competitive salaries

  • Medical Insurance

  • Employee Referral Bonuses

  • Performance-Based Bonuses

  • Flexible Work Options & Fun Culture

  • Robust Learning & Development Programs

  • In-House Technology Training

About iLink Digital

Established in 2002, iLink Digital, an ISO 9001 and CMMI L3 accredited company, is a global leader in digital transformation. We specialize in Data Engineering, Generative AI, Cloud Operations, Business Applications, and RPA consulting. With over two decades of IT excellence, our 2,500 experts across 18 offices serve Fortune 1000 clients worldwide. We proudly partner with industry leaders like Microsoft, Salesforce, AWS, UiPath, OutSystems, Databricks, and Confluent, earning multiple Microsoft Partner of the Year awards. iLink solutions are used in a broad range of industries and functions, including healthcare, telecom, government, oil and gas, education, and life sciences. iLink’s expertise includes Cloud Computing & Application Modernization, Data Management & Analytics, Enterprise Mobility, Portal, Collaboration & Social Employee Engagement, Embedded Systems, and User Experience design.


Industry
IT & Software
Company Size
1,001-5,000 employees
Headquarters
Bothell, Washington
Year Founded
2002