WorldQuant

Senior Site Reliability Engineer

WorldQuant  •  Ho Chi Minh City, VN / Hanoi, VN (Onsite)  •  2 months ago

Job Description

WorldQuant develops and deploys systematic financial strategies across a broad range of asset classes and global markets. We seek to produce high-quality predictive signals (alphas) through our proprietary research platform to employ financial strategies focused on market inefficiencies. Our teams work collaboratively to drive the production of alphas and financial strategies – the foundation of a balanced, global investment platform.

WorldQuant is built on a culture that pairs academic sensibility with accountability for results. Employees are encouraged to think openly about problems, balancing intellectualism and practicality. Excellent ideas come from anyone, anywhere. Employees are encouraged to challenge conventional thinking and possess an attitude of continuous improvement.

Our goal is to hire the best and the brightest. We value intellectual horsepower first and foremost, and people who demonstrate outstanding talent. There is no roadmap to future success, so we need people who can help us build it.

Technologists at WorldQuant research, design, code, test and deploy firmwide platforms and tooling while working collaboratively with researchers. Our environment is relaxed yet intellectually driven. We seek people who think in code and are motivated by being around like-minded people.

The Role: We're seeking a Senior Site Reliability Engineer to join the team. You will build and operate the infrastructure and tooling behind WorldQuant's data ingestion pipelines — systems that onboard, validate, and deliver large-scale datasets to the firm's research platform. This is a 70% build / 30% operate role. You'll spend most of your time engineering automation, observability, and developer tooling, while also participating in on-call rotations and incident response for production data pipelines. You'll partner with engineering, analyst, and research teams to ensure reliability at scale — this requires excellent analytical skills, clear communication, and the ability to collaborate across teams.

What You'll Do:

Build (70%):

  • Design and develop automation, monitoring, CI/CD, and reliability features for the data onboarding pipeline
  • Develop and maintain internal infrastructure and services that reduce toil and improve pipeline reliability
  • Build observability solutions — dashboards, alerting, log aggregation — using Grafana, the ELK stack, and Vector
  • Design and implement CI/CD pipelines, test automation, and release management workflows
  • Write infrastructure-as-code for provisioning, scaling, and managing platform components: Kubernetes, bare metal hosts
  • Integrate and extend tools such as Redis, Celery, MySQL

Operate (30%): Keep production data pipelines healthy and respond to incidents

  • Participate in on-call rotation, respond to production incidents, and drive post-mortems
  • Define and track SLOs/SLIs for pipeline reliability, latency, and data freshness
  • Diagnose platform performance and reliability issues, driving them to root cause
  • Create and maintain runbooks for common operational scenarios
  • Plan capacity and optimize resource utilization

What You'll Bring:

  • 8+ years of experience in SRE, DevOps, or platform engineering roles
  • Linux expertise: Power-user proficiency in Linux, with the ability to manage infrastructure, deploy services, and troubleshoot production systems
  • Python proficiency: Strong scripting and automation skills; experience building CLI tools, API clients, monitoring integrations, and operational tooling in Python
  • Kubernetes & containers: Deep hands-on experience with Kubernetes (deploying, scaling, debugging, and managing production workloads). Familiarity with Helm and resource management. Solid experience with Docker
  • Observability: Hands-on experience with monitoring stacks such as Grafana, Prometheus, ELK (Elasticsearch, Logstash, Kibana), or similar. Experience designing dashboards, alerts, and SLO-based reliability tracking
  • CI/CD & infrastructure-as-code: Experience designing and maintaining CI/CD pipelines (GitLab CI or similar), including test automation and release management. Familiarity with Ansible or similar IaC tools
  • Databases: Working knowledge of relational databases (MySQL/PostgreSQL), query tuning, and operational database management
  • Message queues & streaming: Experience with Kafka, Redis pub/sub, or Celery for event-driven architectures
  • Networking & APIs: Understanding of network fundamentals, DNS, load balancing, and REST/gRPC APIs
  • Incident management: Experience with on-call rotations, incident response, post-mortems, and runbook-driven operations
  • Leadership & management: Proven track record of leading a team (mentoring engineers, driving technical roadmaps, coordinating cross-team initiatives, and managing priorities). Comfortable owning team delivery and representing the team to stakeholders
  • AI-agent readiness: Openness to working alongside AI coding agents and LLM-powered tools as part of the development and operations workflow, treating AI as a force multiplier for automation, incident analysis, and toil reduction
  • Nice to Have:
    • Cloud platforms: Exposure to GCP or AWS for compute, storage, and managed services
    • Data tools: Familiarity with Apache Arrow, gRPC, or columnar data formats
    • Big data platforms: Familiarity with Hadoop or Apache Spark for large-scale data processing
    • Programming languages: C/C++, Golang, Scala, JavaScript
    • Financial services or data-intensive industry background
    • SRE culture: Familiarity with Google's SRE book principles (error budgets, toil tracking, blameless post-mortems)

What We Offer:

  • Competitive and attractive compensation package with a clear career roadmap, where you feel challenged every day
  • We offer a strong culture of learning and development: training courses, library, speakers, share and learn events
  • Learn from the people sitting next to you! At WQ, you are surrounded by smart and talented people
  • Premium Health Insurance and Employee Assistance Program
  • Generous time-off policy, re-creation sabbatical leave (based on tenure), Trade Union benefits for staff and family
  • Monthly team-building activities: local engagement events and employee clubs (football, ping-pong, badminton, yoga, running, PS5, movies, etc.)
  • Annual company trip and occasional global conferences – opportunity to travel and connect with our global teams
  • Happy-hour with tea break, snacks and meals every day in the office!

By submitting this application, you acknowledge and consent to terms of the WorldQuant Privacy Policy. The privacy policy offers an explanation of how and why your data will be collected, how it will be used and disclosed, how it will be retained and secured, and what legal rights are associated with that data (including the rights of access, correction, and deletion). The policy also describes legal and contractual limitations on these rights. The specific rights and obligations of individuals living and working in different areas may vary by jurisdiction.

Copyright © 2025 WorldQuant, LLC. All Rights Reserved.
WorldQuant is an equal opportunity employer and does not discriminate in hiring on the basis of race, color, creed, religion, sex, sexual orientation or preference, age, marital status, citizenship, national origin, disability, military status, genetic predisposition or carrier status, or any other protected characteristic as established by applicable law.

About WorldQuant

WorldQuant is a global quantitative asset management firm with over $7 billion in assets under management. Founded in 2007 by Igor Tulchinsky with the belief that talent is global, but opportunity is not, WorldQuant has more than 1,000 employees spread among 27 global offices. WorldQuant seeks to get to the future faster, guided by the principle that there are an infinite number of insights to discover. The firm develops and deploys investment strategies across a variety of asset classes in global markets. For more information on WorldQuant’s philosophy and culture, please visit www.worldquant.com.

Industry: Finance & Insurance
Company Size: 1,001-5,000 employees
Headquarters: Old Greenwich, CT
Year Founded: 2007