Senior Data Center Operations Engineer
2/25/2026
The Senior Data Center Operations professional is responsible for ensuring the availability, reliability, and operational excellence of mission-critical data center infrastructure, acting as a technical lead for complex operational activities and incident response. Key duties include overseeing hardware maintenance, leading incident resolution across hardware/OS/networking/storage, driving RCAs, and owning reliability programs.
Working Hours: 40 hours/week
Company Size: 51-200 employees
Language: English
Visa Sponsorship: No
About the role
The Senior Data Center Operations professional is responsible for the availability, reliability, and operational excellence of mission-critical data center infrastructure. This role operates as a technical lead on site, owning complex operational activities, incident response, and advanced troubleshooting while mentoring junior technicians and supporting continuous improvement initiatives.
The role requires deep hands-on expertise, sound judgment in high-pressure situations, and the ability to operate independently within a 24/7 critical environment.
Technical Authority & Ownership
- Senior-level ownership of data center operations and infrastructure stability
- Authority to lead incident response and complex troubleshooting activities
- Recognition as a subject-matter expert within the operations team
- Direct influence on operational standards, procedures, and improvements
Professional Standing & Growth
- Positioning as a senior technical reference within the data center
- Opportunity to mentor technicians and shape operational best practices
Why Verda
Your responsibilities
Senior Operations & Infrastructure Management
- Oversee installation, configuration, testing, and maintenance of critical data center hardware and systems
- Ensure operational readiness and compliance with availability, security, and safety standards
- Act as escalation point for complex or high-impact operational issues
Monitoring, Incident Leadership & Troubleshooting
- Lead response to infrastructure incidents, outages, and performance degradation
- Perform advanced troubleshooting across hardware, OS, networking, and storage layers
- Coordinate with engineering, network, facilities, and vendor teams during incidents
- Drive root cause analysis (RCA) and corrective actions
Preventive Maintenance & Reliability
- Own and improve preventive and predictive maintenance programs
- Validate maintenance procedures and execution quality
- Identify risks, single points of failure, and reliability gaps
Project Execution & Change Management
- Lead or support complex operational projects such as:
  - Data center expansions
  - Hardware refresh programs
  - Infrastructure upgrades
- Execute changes in line with change management and risk controls
Documentation, Standards & Mentorship
- Own and maintain senior-level operational documentation and SOPs
- Contribute to audits, compliance reviews, and operational assessments
- Mentor and support junior and mid-level technicians
- Promote a strong culture of safety, discipline, and continuous improvement
Your key competencies
Education
- Degree in Computer Science, Information Technology, Engineering, or equivalent experience
Experience
- 8–12+ years of experience in data center operations or mission-critical IT environments
- Proven experience leading operational activities in 24/7 critical facilities
- Demonstrated ownership of incident management and reliability initiatives
Technical Expertise
- Deep hands-on expertise with:
  - Server, storage, and rack infrastructure
  - Networking fundamentals and connectivity troubleshooting
  - Linux
- Strong understanding of monitoring, DCIM, ticketing, and change management tools
Leadership & Personal Attributes
- Strong decision-making under pressure
- High sense of ownership and accountability
- Ability to mentor and guide less experienced technicians
- Excellent written and verbal communication skills in English
- Proactive, detail-oriented, and reliability-focused mindset