Datacentre Operations Engineer
Company Overview:
Ori Industries is at the forefront of AI infrastructure, revolutionising the connection between software and hardware for the AI era. Our mission is to empower AI teams with scalable, secure, and efficient infrastructure solutions that support seamless model training, deployment, and scaling.
Job Summary:
We’re looking for a qualified, experienced Datacentre/Hardware Engineer to run our multi-million dollar HPC infrastructure based in Dallas-Fort Worth, US. You’ll be well versed in managing and optimising datacentres, dealing promptly with hardware failures, optimising environmental performance, and deploying new hardware and services 24x7x365. You’ll be hands-on with high-performing HPC compute and will operate with the utmost diligence, professionalism and focus to ensure the equipment underpinning our services operates at peak performance.
What You’ll Do:
Troubleshooting and Support: Quickly diagnose and resolve hardware and network issues to maximise uptime.
Respond to critical hardware alerts via our monitoring and observability platform.
Contribute to ongoing service improvement to enhance our monitoring capability.
RMA and Support: Manage vendor relationships, handling RMAs and support requests within Ori’s Service Level Objectives (SLOs) to meet customer contract SLAs.
Data Center Management: Guide data center acquisition, setup, and ongoing maintenance, fostering compliance and leveraging strong vendor partnerships.
Fully own the acquisition of hardware assets from the point of purchase and delivery through lifecycle management and disposal, while owning asset management within Ori’s CMDB system.
Hardware Installation and Maintenance: Deploy and maintain HPC and AI hardware for uninterrupted operations, including performing low-level system maintenance such as hardware troubleshooting, firmware updates, and replacement of components as needed.
Datacenter Environment Technologies: Oversee cooling, power distribution, and other critical data center technologies to maintain high operational standards.
Capacity Planning and Resource Allocation: Support strategic planning to align infrastructure capabilities with current and projected demands.
Develop and maintain datacentre/hardware management SOPs, ensuring continual alignment with Ori’s governance and compliance requirements
Apply ITSM frameworks: Incident, Major Incident, Change Management, and service improvement.
Operate and support services 24x7x365 for production environments, including on-call rotation
Contribute to incident postmortems and root cause analysis, document learnings, and automate remediations
Mentor junior engineers and act as an Operational requirements consultant to other departments
Communicate technical decisions clearly to non-technical stakeholders and customers
Uphold a culture of: do, document, automate
Willing to cross-train and upskill in Infrastructure/Platform SRE practices.
Willing to travel across North America to support future datacentre onboarding and deployments.
What you bring:
Degree in Computer Science, or 10 years industry experience.
3+ years of experience in data center operations, HPC, or related roles.
Proven track record working with HPC Nvidia GPU or equivalent systems, high-performance storage, and networking.
Expertise in hardware installation, network configuration, and low-level system maintenance, including hardware troubleshooting and firmware management.
Knowledge of data center environment technologies, including cooling and power distribution.
Experience in data center design, greenfield deployments, and operations.
Strong understanding of hardware and spares management, with the ability to handle RMAs and support cases within defined SLOs to meet SLA requirements.
Solid understanding of HPC and AI workloads.
Strong problem-solving abilities and the resilience to thrive in a fast-paced environment.
Excellent communication skills and ability to collaborate with cross-functional, internationally dispersed teams.
Strong grasp of ITSM and service operation best practices
Excellent communication and mentorship skills
Comfortable interfacing with internal stakeholders and external customers
Bonus: Specific vendor-endorsed qualifications from Supermicro or Dell for HGX-based systems
Preferred Skills (Nice to Have)
Knowledge of large scale private cloud deployments and capacity planning.
Qualifications in HVAC management and deployments
Certifications in relevant areas - Hardware, Networking
ITIL Foundation level qualification or equivalent experience
How you work:
You approach problems with a systems mindset - balancing practical execution with long-term scalability
You elevate the team, setting high standards for technical quality and engineering excellence.
You hold yourself and others accountable - giving direct feedback and expecting the same
You take initiative, owning challenges end-to-end and proactively driving solutions.
You invest in others, mentoring to build both capability and confidence.
You communicate clearly - translating complexity into clarity across engineering and business audiences
Equal Opportunity Employer
Ori is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
Set the standard: Every single day, you spot opportunities to constructively shake things up
Inspire the change: There’s no blueprint for the future. You’ll embrace challenges and change
You’re real and you’re true to yourself: We cherish and celebrate diversity, so you’ll feel right at home whoever you are. Whoever you’re talking to, you treat everyone the same.
Department: Engineering
Locations: Dallas-Fort Worth Metro Area, TX