Ansible, Authentication, Automation, Bash Scripting, CPU (Central Processing Unit), Capacity Management, Configuration Management, DHCP (Dynamic Host Configuration Protocol), DNS (Domain Name System), Data Sets, Disaster Recovery, Engineering Software, Establish Priorities, Forecasting, Hypervisors, Identify Issues, Identity Data Management, Input/Output, Kernel Programming, LDAP (Lightweight Directory Access Protocol), License Management, Linux Administration, Linux Operating System, MPI, Maintain Compliance, Microsoft Active Directory, Microsoft Hyper-V, Microsoft Windows Azure, Microsoft Windows NT Group Policy, Microsoft Windows Operating System, Microsoft Windows Server, NFS (Network File System), Parallel Computing, Performance Management, Performance Tuning/Optimization, Problem Solving Skills, Python Programming/Scripting Language, Red Hat Linux Operating System, Scripting (Scripting Languages), Software Administration, Software Engineering, Software Simulation, Systems Administration/Management, Systems Maintenance, Technical Support, VMWare vSphere, Virtual Machine (VM), Virtualization, Windows PowerShell, iSCSI
Overview
--------
Calspan is seeking a highly skilled Systems Administrator to manage our core infrastructure and High Performance Computing (HPC) environments. This role is responsible for the stability of our Windows Servers (including Active Directory, GPO, Domain Controllers, Azure, DHCP, DNS), Red Hat Enterprise Linux (RHEL) clusters, virtualization layer, and HPE Alletra/Nimble storage systems. You will act as the primary administrator for our engineering compute resources, supporting critical simulation software such as Ansys Fluent and Star-CCM+. The ideal candidate is an "automate-first" professional who combines strong Linux skills with the ability to troubleshoot complex engineering workloads.
Responsibilities
----------------
### High Performance Computing (HPC) & Engineering Support
#### HPC Cluster Administration
- Deploy, configure, and manage Linux-based HPC clusters.
- Monitor node health, job queues, and system performance to ensure maximum throughput for engineering simulations.
#### Application Support
- Troubleshoot and optimize engineering simulation software, specifically Ansys Fluent and Star-CCM+.
- Resolve issues related to MPI libraries, parallel processing, and solver convergence errors caused by infrastructure constraints.
#### License Management
- Manage floating license servers (e.g., FlexNet/FlexLM) for engineering applications to ensure availability and compliance.
#### Job Scheduling
- Administer workload managers/job schedulers (e.g., Slurm, PBS, or LSF) to prioritize and distribute engineering jobs effectively.
### System Administration (Red Hat Focus)
#### RHEL Administration
- Expert-level management of Red Hat Enterprise Linux.
- Handle Kickstart deployments, Red Hat Satellite management (if applicable), kernel tuning, and security hardening.
#### Virtualization
- Manage the hypervisor layer (e.g., VMware vSphere, Hyper-V).
- Handle VM provisioning, resource pooling (CPU/RAM), and performance tuning for virtualized engineering workloads.
### Identity & Access Management
- Administer user access in Active Directory/Entra ID, ensuring seamless authentication for both Windows and Linux/HPC environments (via LDAP/SSSD).
### Storage Management (HPE/Nimble Focus)
#### HPE Storage Administration
- Manage the complete lifecycle of HPE storage infrastructure, specifically HPE Nimble arrays.
#### Performance Tuning
- Optimize storage protocols (NFS/iSCSI) to handle the high I/O throughput required by HPC simulations.
#### Capacity Planning
- Utilize HPE InfoSight to monitor trends and forecast storage needs for large engineering datasets.
#### Backups & DR
- Maintain robust backup strategies and Disaster Recovery plans for critical engineering data.
### Infrastructure Automation
#### Infrastructure-as-Code
- Write and maintain scripts using Python, Bash, or PowerShell to automate cluster node provisioning and system maintenance.
#### Configuration Management
- Use tools like Ansible to enforce consistent configurations across the HPC nodes and general server estate.
Qualifications
--------------
### Experience
- 3-5+ years of experience in Linux Administration, with specific exposure to HPC environments.
### OS Expertise
- Deep proficiency in Red Hat Enterprise Linux (RHEL) is required.
### HPC & Software
- Experience administering HPC clusters and supporting engineering applications (Ansys Fluent, Star-CCM+).
### Storage
- Hands-on experience with HPE Nimble Storage.
### Scripting
- Proficiency in Bash or Python for system automation.
Preferred (Bonus) Skills
----------------------
- Experience with InfiniBand or high-speed low-latency networking.
- Familiarity with job schedulers like Slurm, PBS Pro, or LSF.
- Knowledge of containerized HPC workloads (Singularity/Apptainer).
- Experience with HPE InfoSight analytics.
Why Join Calspan?
-----------------
Calspan is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status. Calspan supports a safe and drug-free workplace through pre-employment background checks and drug testing. The salary range provided is a general guideline. Actual pay will depend on several factors, including, but not limited to, education, experience, training, and other applicable qualifications. Calspan is committed to pay transparency in compliance with applicable state and local laws. All candidates must be eligible to work in the United States.
Salary Range
------------
- Minimum: USD $80,000.00/Yr.
- Maximum: USD $110,000.00/Yr.