
Senior Software Engineer / Site Reliability Engineer (SRE) – Observability & Platform Engineering
Must-Have Skills (Required)
Core Engineering & Platform Skills
Strong proficiency in at least one of the following: Python, JavaScript (Node.js), or Java
Hands-on experience with API integrations (designing, consuming, and integrating APIs)
Strong experience working in Kubernetes environments, including deployment, operations, and monitoring
Observability & Monitoring
Experience with DataDog (preferred) or similar tools such as Prometheus or Grafana
Ability to configure dashboards, alerts, and APM (tracing, metrics, logging)
Experience monitoring containerized and microservices architectures
Cloud & Infrastructure
Hands-on experience with AWS
Experience integrating observability tools into cloud environments
SRE & Operations
Experience with CI/CD integrations for observability (e.g., DataDog in pipelines)
Ability to automate monitoring and operational tasks using scripting (Python preferred)
Strongly Preferred Skills
Experience owning and operating an internal engineering platform
Deep experience with observability platforms
Demonstrated ownership of reliability, scalability, and performance
Proven ability to proactively lead maintenance efforts and platform improvements
Experience installing and configuring DataDog agents and integrations
Experience managing API keys and secure configurations
Experience managing user roles and access controls within observability platforms
Nice-to-Have Skills (Preferred)
Familiarity with Go (Golang)
Experience with additional observability tools such as New Relic, Dynatrace, Elastic, or Splunk Observability
Project Overview
We are seeking a Senior Software Engineer / SRE with an Observability focus to support platform reliability, monitoring, and modernization initiatives. This role blends software engineering (60–70%) with site reliability engineering (30–40%), with a strong emphasis on Kubernetes and observability platforms.
Key Responsibilities
Support platform reliability, monitoring, and modernization initiatives
Provide operational and training support for DataDog, the Observability Platform for R&D
Enhance observability, reliability, and performance across engineering platforms
Drive automation and operational excellence for monitoring and alerting frameworks
Support Kubernetes-based platform operations and monitoring integrations
Timezone Coverage
PST Coverage Required

Jade Global is a premier consulting, integration, and managed services partner helping enterprises modernize, innovate, and scale. Founded in 2003, we bring over two decades of engineering excellence, with 2,000+ professionals and 11 global offices, and have served 500+ clients across North America, Europe, and APAC. We are a Great Place to Work–certified organization and have been recognized by Inc. 5000 as a high-growth company for 13 years in a row.
Known for delivering innovation and driving impact, we offer holistic cloud transformation, ERP and CRM modernization, data and analytics, integration, AI-powered automation, and AI-led managed services. Jade combines agile client-centricity with a rich ISV partner ecosystem, including Oracle, Salesforce, SAP, ServiceNow, Workday, Snowflake, Boomi, and many others.
With our AI-first approach, powered by 220+ enterprise-ready AI Agents and industry accelerators, we drive data readiness, autonomous workflows, intelligent operations, and faster transformation outcomes. The result is lower costs, greater efficiency, and measurable business value for enterprises. We enable organizations across high-tech, healthcare, life sciences, manufacturing, financial services, and retail to achieve resilient, future-ready operations.