Sr. High Performance Computing (HPC) System Administrator - NIH - Bethesda, MD

General Dynamics Information Technology Bethesda, MD Full-Time

GDIT supports the world’s largest supercomputer dedicated to life sciences and biomedical research. In the past two years we’ve delivered 14X growth in compute and 5X in storage for NIH. We are looking to expand our NIH onsite HPC support with individuals who can also contribute to GDIT’s HPC Center of Excellence across programs in NASA, NOAA, Defense and Intelligence communities.

JOB DESCRIPTION

We are looking for an experienced individual with a strong background in Linux, configuration management, systems automation, and network monitoring to join our team administering the NIH Biowulf supercomputer. The position is full-time onsite at the NIH main campus in Bethesda, Maryland.

RESPONSIBILITIES & DUTIES

This position supports the HPC systems administration team in operating and maintaining the 3,000-node Linux cluster for ~2,000 biomedical researchers. Specific responsibilities include:

  • Work with systems staff to enhance configuration management infrastructure
  • Evaluate performance impacts of planned operating system changes
  • Update and expand existing systems monitoring capabilities
  • Develop automation tools for cluster administration
  • Participate in resource optimization and in shaping job scheduling software and policies
  • Provide technical support to researchers using HPC resources, troubleshoot problems and develop appropriate computational strategies
  • Consult and collaborate with scientist coworkers to determine best system configurations for applications

QUALIFICATIONS & SKILLS

Required:

  • BS degree and 8+ years of related experience, or an MS degree and 6+ years of related experience, or an equivalent combination of experience and education.
  • Minimum of eight years of Red Hat or CentOS Linux system administration experience in an HPC environment.
  • 9+ years of programming experience in at least two of the following languages: C/C++, Python, and Perl.
  • Prior experience with configuration management tools such as Ansible, Chef, Puppet, or Cobbler.
  • Demonstrated ability to configure, deploy and manage a major system area such as batch system, network, data storage, backup system, database system, or distributed computing

Preferred:

  • Experience with batch systems such as SLURM or PBS
  • Experience managing parallel and cluster file systems such as NFS, GPFS, or Lustre
  • Network management experience, especially with InfiniBand
  • Experience integrating applications with cloud provider software stacks
  • Experience presenting and/or teaching

ATTRIBUTES FOR SUCCESS

  • Provide leadership and technical expertise to improve HPC cluster performance and resiliency
  • Ability to work both independently and as part of the team; flexibility in dealing with assignments and in working on several projects simultaneously
  • Ability to effectively communicate with people of diverse backgrounds and computer knowledge

SUMMARY

  • The position is full-time onsite at the NIH main campus in Bethesda, Maryland.
  • Limited off-hour system maintenance activities will be planned in advance.
  • There are no travel requirements.
  • Applicants must be US citizens or permanent residents to meet the facility's moderate-level security requirement.
We are GDIT. The people supporting some of the most complex government, defense, and intelligence projects across the country. We deliver. Bringing the expertise needed to understand and advance critical missions. We transform. Shifting the ways clients invest in, integrate, and innovate technology solutions. We ensure today is safe and tomorrow is smarter. We are there. On the ground, beside our clients, in the lab, and everywhere in between. Offering the technology transformations, strategy, and mission services needed to get the job done.

GDIT is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status, or any other protected class.

 


Job ID: RQ33213
