The Role
The Data Center Site Manager owns end-to-end reliability, safety, capacity, and performance for one of our flagship U.S. sites. You'll lead a high-performing, multi-disciplinary operations team and partner tightly with Design, Build, Network, Security, Capacity Planning, and the DC orgs to deliver world-class availability and cost efficiency.
Your responsibilities will include:
Own the site 24/7: deliver continuous availability across power, cooling, structured cabling, network, security, and DCIM, meeting or beating global SLAs.
Build and lead the team: hire, mentor, and develop managers/technicians; run staffing models, shift coverage, and on-call rotations that scale.
Be the incident commander: lead major events end-to-end, from triage and communications through executive briefings, RCA, and durable corrective actions.
Drive reliability engineering: implement RCM, predictive maintenance, QA/QC, 5S, and Lean/continuous improvement to cut MTTR and raise MTBF.
Deliver capacity on time: plan and execute expansions/retrofits; commission MEP systems with Design/Construction; achieve flawless change control (MOP/SOP/EOP).
Scale tooling & automation: mature DCIM/BMS/EPMS, monitoring/alerting, work management (Jira/ServiceNow), knowledge base (Confluence), and light scripting/SQL for telemetry and workflow automation.
Run a metrics-first operation: publish dashboards and KPIs (availability, PUE, MTBF/MTTR, work compliance, safety) and use them to drive decisions.
Partner across functions: work with Cloud/Compute, Network, Security, and Capacity Planning to optimize performance, cost, and resiliency across the fleet.
Manage vendors & colos: own contracts, SLAs, and execution for rack deliveries, PDUs, fiber/copper, and lifecycle PMs; validate colo topology and compliance.
Raise the safety bar: enforce a zero-injury EHS culture; conduct drills/audits for life safety, physical security, and data protection.
Forecast and budget: build data-backed plans for power, spares, headcount, and projects; track OpEx/CapEx with rigor.
We expect you to have:
Associate's degree or trade certification in Electrical/Mechanical/Industrial Engineering (or equivalent experience).
10+ years in electrical/mechanical/HVAC/controls within industrial/commercial settings, including 5+ years specifically in data center or mission-critical facilities.
Team leadership experience in 24/7 sites (managing leads/techs, vendors, and on-call operations).
Deep, hands-on knowledge of UPS/generators/switchgear, chillers/CRAC/CRAH, fire detection/suppression, BMS/EPMS/DCIM, and structured cabling (copper & fiber).
Proven strength in incident management, RCA/Corrective Actions, change management, and vendor/contract oversight.
Data-driven mindset with the ability to forecast resources and make analytics-backed decisions (Excel; SQL/scripting a plus).
Excellent written/verbal communication with comfort presenting to executives and guiding field teams during live events.
Ability to travel up to ~30% and support after-hours escalations when needed.
It would be a bonus if you have:
Bachelor's degree in Electrical/Mechanical/Industrial Engineering, Engineering Management, or Reliability Engineering.
Hyperscale/colo experience with reliability-centered maintenance, predictive analytics, and Lean/Six Sigma practices.
Familiarity with Linux fundamentals, network equipment installation/troubleshooting, and fiber optics testing.
Experience with Jira, Confluence, ServiceNow (or similar); strong SOP/MOP/EOP authorship.
Certifications such as CDCP, DCM, PMP, OSHA-30, ITIL, or Uptime-aligned credentials.
Key Employee Benefits
Health insurance: 100% company-paid medical, dental, and vision coverage for employees and families.
401(k) plan: up to 4% company match with immediate vesting.
Parental leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers.
Remote work reimbursement: up to $85/month for mobile and internet.
Disability & life insurance: company-paid short-term disability, long-term disability, and life insurance coverage.
Compensation
We offer competitive salaries, ranging from $90k to $140k base, plus quarterly performance bonuses.
Join Nebius Today!