Job Description
About the Team
The Datacenter Facility Operation team supports the company's fast growth by operating and maintaining hyperscale datacenters. The team ensures 24/7 uptime, reliability, and efficiency of the critical infrastructure (Power and Cooling) that keeps our server fleet running seamlessly. We focus on engineering excellence, hands-on troubleshooting, and executing high-standard operational procedures.
About the Role
We are seeking a hands-on Datacenter Facility Operation Engineer to execute the daily operations, maintenance, and emergency response for our critical infrastructure. In this role, you will be the first line of defense for datacenter uptime. You are highly technical, detail-oriented, and deeply familiar with the physical infrastructure of hyperscale environments. Instead of just managing vendors from a desk, you will work alongside them, leading troubleshooting efforts, performing technical inspections, and ensuring that all maintenance aligns perfectly with global safety and operational standards.
Responsibilities
- Daily Operations & Inspection: Perform routine rounds, inspections, and health checks of critical mechanical and electrical systems to identify and mitigate anomalies before they impact operations. Regularly audit and review our colocation partners' maintenance programs, and collaborate with them to improve their effectiveness.
- Hands-on Troubleshooting & Incident Response: Serve as the primary technical responder for facility incidents. Work with colocation partners to lead fault isolation, root-cause analysis (RCA), and emergency switching/restoration procedures for power and cooling systems during outages.
- Maintenance Execution & Oversight: Directly supervise and technically validate preventative/corrective maintenance performed by colocation partners and vendors on critical assets (Generators, UPS, Chillers, AHUs). Manage and drive the change management process for high-risk maintenance activities in our data centers.
- High-Risk Change Execution: Act as the on-site technical authority for Critical Environment Work Authorizations (CEWAs) and Methods of Procedure (MOPs). Ensure all switching procedures and high-risk isolation steps are executed flawlessly.
- Deployment & Commissioning Support: Support Data Hall Fit-Outs and rack energization. Participate actively in L1–L5 commissioning tests, verifying that newly delivered infrastructure meets strict design specifications and operational tolerances.
- Operational Readiness: Maintain accurate, up-to-date site documentation, single-line diagrams, SOPs, and EOPs (Emergency Operating Procedures). Participate in regular emergency drills to ensure maximum team readiness. Take responsibility for the operation of colocation data center infrastructure, ensuring stability, exploring optimization, and improving data center management.
- Cross-Functional Collaboration: Maintain positive, collaborative working relationships with partner teams, vendors, teammates, and internal customers.