Preferred Qualifications:
• Experience managing Linux operating systems in a large-scale system environment
• Solid understanding of networked computing environment concepts
• 8+ years of experience with Linux Cluster Administration
• Ability to develop and maintain programs and scripts that aid in the operation and automation of administrative tasks using various shell and scripting languages (bash, Python, Go)
• Experience with Lustre and GPFS file systems
• Experience with batch schedulers (particularly SLURM)
• Experience deploying and maintaining automated configuration management software such as Puppet
• Strong interpersonal and communication skills
• Ability to work as a team player
• Proactive and solution-oriented problem solver
• Prior project and/or team leadership experience
Major Duties/Responsibilities:
• Installing, integrating, and administering HPC Linux clusters and high-speed networks
• Diagnosing system operational problems quickly and effectively
• Coordinating with vendors to resolve hardware and software problems
• Recommending, planning, and coordinating hardware and software changes with customer participation using change management processes
• Porting and writing system management tools
• Documenting system administration procedures for routine and complex tasks
• Participating in a 24-hour, 7-day on-call support rotation and off-hours maintenance windows
• Implementing and integrating systems into the NCCS environment and analyzing system performance