The Senior Data Center Operations Engineer plays a critical, hands-on role in the build-out and long-term operation of a high-performance, enterprise-scale data center environment that supports advanced compute and large-scale infrastructure deployments.
This position is designed for an experienced engineer with deep expertise in server hardware, Linux systems, and data center operations who is used to working in environments that demand high availability, precision, and performance. You will contribute during the initial deployment phase, supporting infrastructure bring-up, validation, and hardware readiness. As the environment transitions into steady-state operations, you will take ownership of ongoing reliability, advanced troubleshooting, and continuous improvement initiatives.
This role requires a strong operator mindset: someone who thrives in complex, production-critical environments and takes pride in resolving issues at their root. You will serve as a primary technical escalation point, working closely with engineering and infrastructure teams to maintain system stability and performance.
You will collaborate with cross-functional teams, so clear, professional communication in English (written and verbal) is essential for success in this role.
This role offers continuity across both deployment and operational phases and provides exposure to large-scale, modern infrastructure environments, with a clear path for progression into advanced technical or engineering roles.
Key Responsibilities
Advanced Hardware Troubleshooting & Repair
Diagnose and resolve complex hardware failures across server platforms (motherboards, CPUs, memory, storage)
Perform component-level repairs and replacements on servers and data center hardware
Execute break/fix processes with a focus on minimizing downtime and meeting SLAs
Conduct root cause analysis (RCA) of hardware failures and implement preventative improvements
Identify recurring failure trends and contribute to tooling, automation, and process enhancements
Linux Systems & Platform Support
Utilize Linux command-line tools for system monitoring, diagnostics, and troubleshooting
Support provisioning and deployment of servers across Linux distributions (RHEL, Ubuntu, etc.)
Troubleshoot boot-level and OS-level issues in production environments
Collaborate with engineering teams to resolve complex hardware/software interaction issues
Data Center Operations
Support hardware installation, structured cabling, and infrastructure validation
Maintain accurate inventory of spare parts, assets, and retired equipment
Document repairs, changes, and configurations in ITSM/DCIM systems
Ensure adherence to safety, security, and operational protocols
Serve as a primary escalation point for complex infrastructure issues
Participate in an on-call rotation supporting 24x7 operations
Collaboration & Mentorship
Provide guidance and mentorship to technicians on hardware troubleshooting and best practices
Collaborate with network, storage, and infrastructure teams to resolve cross-functional issues
Contribute to knowledge sharing, documentation, and operational excellence initiatives
Support continuous improvement efforts across processes, tooling, and operational workflows
Required Skills
Strong English communication skills (written and verbal) for coordination with cross-functional teams
Expert-level knowledge of server hardware architecture and component-level troubleshooting
Strong proficiency with Linux systems and command-line diagnostics
Solid understanding of networking fundamentals and infrastructure components
Experience working within structured operational environments (SOPs, SLAs, ticketing systems)
Familiarity with ITSM/DCIM tools (ServiceNow, Jira, or similar)
Experience with structured cabling and fiber optic connectivity
Strong analytical and problem-solving skills with attention to detail
Ability to operate effectively in high-pressure, high-availability environments
Strong organizational and documentation skills
Required Experience
5+ years of experience in data center operations or similar infrastructure environments
Significant hands-on experience with server hardware troubleshooting and repair
Minimum of 2 years of experience working with Linux operating systems in production environments
Experience supporting enterprise server platforms and infrastructure environments
Demonstrated experience performing root cause analysis and resolving complex hardware issues
Experience working within ticketing systems and operational workflows
Exposure to data center build-outs, deployments, or infrastructure upgrades (preferred)
Preferred Certifications
CompTIA A+, Server+, or Linux+
LPI certification or equivalent
Vendor-specific hardware certifications
Physical Requirements
Ability to lift and move equipment up to 50 lbs
Ability to work in a temperature-controlled environment with moderate noise levels
Ability to perform physical tasks such as standing, walking, bending, and kneeling for extended periods
At Milestone, we know IT, and we’re consistently driving innovation in infrastructure operations to improve the overall customer experience. As a Managed Services Provider (MSP), we use technology intelligently to make IT infrastructures smarter, more streamlined, and ultimately more successful.
Our seasoned professionals deliver services based on Milestone’s best practices and service delivery framework. By leveraging our vast knowledge base to execute initiatives, we deliver both short-term and long-term value to your company and apply continuous service improvement to deliver transformational benefits to IT. With Intelligent Automation, Milestone helps businesses further accelerate their IT transformation. The result is a sharper focus on business objectives and a dramatic improvement in employee productivity. Through our key technology partnerships and our people-first approach, Milestone continues to deliver industry-leading innovation to our clients.
Since our inception in 1997, our clients have benefited from more efficient operations and a renewed focus on employee development and business innovation. When founder Prem Chand started Milestone Technologies, Inc., he aimed to solve the growing problem of IT relocation for Silicon Valley businesses. Today, with more than 2,000 employees serving a substantial client base of over 200 companies worldwide, we continue to pursue our mission of revolutionizing the way IT is deployed around the globe.