HPC Slurm Administrator

APN Consulting Group

Kalamazoo, MI

JOB DETAILS
JOB TYPE
Contractor
SKILLS
Administrative Skills, Ansible, Artificial Intelligence (AI), Bash Scripting, Broadband, Business Solutions, Cloud Computing, Configuration Management, Consulting, File Systems, High Availability, High Reliability, Identify Issues, Linux Administration, Operations Processes, Performance Analysis, Performance Management, Problem Solving Skills, Process Improvement, Puppet (Configuration Management), Python Programming/Scripting Language, Scripting (Scripting Languages), ServiceNow, Software Engineering, Software Patches, Systems Administration/Management, Technical Recruiting, User Account Administration
LOCATION
Kalamazoo, MI
POSTED
30+ days ago
APN Consulting, Inc. is a progressive IT staffing and services company offering innovative business solutions to improve client business outcomes. We focus on high-impact technology solutions in ServiceNow, Fullstack, Cloud & Data, and AI/ML. Due to our globally expanding service offerings, we are seeking top talent to join our teams and grow with us.

Job Title: HPC Slurm Administrator
Work Location & Reporting Address: Kalamazoo, MI 49007 (Hybrid)
Contract Duration: 12 months
Target Start Date: ASAP
Does this position require visa-independent candidates only? Yes
Interview Process (is face-to-face required?): Yes

Job Details:

Mandatory Skill Set:
• Experience in Linux system administration, preferably in HPC environments.
• Strong expertise with the Slurm workload manager.
• Proficiency in Bash, Python, or other scripting languages.
• Familiarity with parallel file systems and high-speed networking (e.g., InfiniBand).
• Experience with configuration management tools (e.g., Ansible, Puppet).

Detailed Job Description:
Zoetis is seeking a skilled HPC Slurm Administrator to manage and support high-performance computing (HPC) environments. The ideal candidate will have hands-on experience with the Slurm workload manager and Linux system administration, and will play a key role in maintaining, optimizing, and scaling HPC infrastructure.

Key Responsibilities:
• Administer and maintain HPC clusters using Slurm.
• Monitor system performance and ensure high availability and reliability.
• Troubleshoot and resolve issues related to job scheduling, compute nodes, and storage.
• Manage user accounts, permissions, and security policies.
• Automate administrative tasks using scripting languages (e.g., Bash, Python).
• Collaborate with engineering and research teams to support compute-intensive workloads.
• Document system configurations, procedures, and operational changes.
• Participate in upgrades, patching, and scaling of HPC infrastructure.

Minimum Years of Experience Needed: 3+ years
Certifications Needed: No

We are committed to fostering a diverse, inclusive, and equitable workplace where individuals from all backgrounds feel valued and empowered to contribute their unique perspectives. We strongly encourage applications from candidates of all genders, races, ethnicities, abilities, and experiences to join our team and help us build a culture of belonging.

About the Company

APN Consulting Group