
Solvd Inc. is a rapidly growing AI-native consulting and technology services firm delivering enterprise transformation across cloud, data, software engineering, and artificial intelligence. We work with industry-leading organizations to design, build, and operationalize technology solutions that drive measurable business outcomes.
Following the acquisition of Tooploox, a premier AI and product development company, Solvd now offers true end-to-end delivery—from strategic advisory and solution design to custom AI development and enterprise-scale implementation. Our capability centers combine deep technical expertise, proven delivery methodologies, and sector-specific knowledge to address complex business challenges quickly and effectively.
We are looking for a talented Infrastructure / Site Reliability Engineer (SRE) to join our engineering team. In this role, you will be the driving force behind our cloud infrastructure scalability, reliability, and deployment automation.
You are an engineer who views infrastructure as a software problem. Instead of manually configuring servers, you build automated pipelines, treat Infrastructure as Code (IaC) as a religion, and architect self-healing cloud deployments. You will collaborate closely with development teams to bridge the gap between code generation and production stability.
Cloud Architecture & Infrastructure as Code (IaC)
Cloud Management: Design, provision, and maintain secure, scalable, and highly available cloud infrastructure (primarily AWS, GCP, or Azure).
Immutable Infrastructure: Write and maintain clean, modular Terraform or OpenTofu configurations so that all infrastructure is fully auditable and reproducible.
Container Orchestration: Manage and optimize containerized environments using Docker and Kubernetes (EKS/GKE), focusing on resource allocation and scaling policies.
Automation & CI/CD Pipelines
Deployment Automation: Build, maintain, and secure robust CI/CD pipelines (e.g., GitHub Actions, GitLab CI, Jenkins) to support zero-downtime deployments.
GitOps & Tooling: Implement modern GitOps workflows (e.g., ArgoCD, Flux) to automate application delivery and configuration management.
Scripting: Develop custom internal tools and automation scripts using Python, Go, or Bash to eliminate toil and repetitive manual tasks.
Observability & Reliability Engineering
Monitoring & Alerting: Design and implement comprehensive observability stacks using tools like Prometheus, Grafana, Datadog, or New Relic.
Performance Tuning: Run chaos engineering experiments, load tests, and bottleneck analyses to ensure system resilience under heavy traffic.
On-Call & Incident Response: Participate in an engineering on-call rotation, driving root-cause analysis (Blameless Post-Mortems) to prevent incident recurrence.
Experience: 3+ years of experience in an SRE, DevOps, or Cloud Infrastructure role.
Cloud Proficiency: Deep production experience with at least one major cloud provider (AWS, GCP, or Azure).
IaC & Containers: Strong proficiency with Terraform and hands-on experience managing production Kubernetes clusters.
Linux Systems: Solid understanding of Linux networking, internals, storage, and security fundamentals.
Languages: Strong coding skills in Go or Python.
Networking: Good grasp of VPC architecture, DNS, load balancers (ALB/NLB), and Content Delivery Networks (CDNs).
Data Layer: Familiarity with managing cloud-native databases (PostgreSQL, RDS) and caching layers (Redis, Memcached).
When you join Solvd, you'll…
Shape real-world AI-driven projects across key industries, working with clients from startup innovation to enterprise transformation.
Be part of a global team with equal opportunities for collaboration across continents and cultures.
Thrive in an inclusive environment that prioritizes continuous learning, innovation, and ethical AI standards.
Ready to make an impact?
If you're excited to build things that matter, champion responsible AI, and grow with some of the industry’s sharpest minds, apply today and let’s innovate together.
Solvd is an equal opportunity employer.
I agree to the processing of my personal data given in the recruitment process by Solvd Inc., with its principal place of business at 1646 N California Blvd, Suite 515, Walnut Creek, CA 94596, United States, for the purpose of future recruitment processes.
You can withdraw your consent at any time; however, doing so will not affect the lawfulness of processing performed on this basis before the withdrawal.
The controller of your personal data is Solvd Inc., with its principal place of business at 1646 N California Blvd, Suite 515, Walnut Creek, CA 94596, United States. You can find more information on the processing of your personal data in the Privacy Policy.

Solvd is an AI-first advisory and digital engineering firm delivering measurable impact through strategic digital transformation. We help enterprises close the gap between experimentation and execution, delivering real ROI by embedding artificial intelligence into every process layer and moving ideas rapidly from research to production.
Our unique AI capabilities combine deep implementation experience with world-class academic research. The strength of our team lies in its blend of practitioners solving real-life problems and researchers, many of whom hold advanced degrees and have contributed to leading conferences such as NeurIPS, ICML, and ECCV. This dual expertise ensures technical rigor and real-world execution power.
As an AI-native company, we bring continuous innovation with global scale and enterprise-grade delivery. Our services span AI advisory, AI and data engineering, digital experience, application development, cloud engineering, and quality engineering & GRC. Across industries such as healthcare, life sciences, media, and retail, Solvd helps enterprises worldwide turn AI potential into real ROI.