AI Lab Technical Support Specialist

Blue Ribbon Global Technologies

Milpitas, CA

JOB DETAILS
SKILLS
Artificial Intelligence (AI), Background Investigation, Bash Scripting, CPU (Central Processing Unit), Command Line, Computer Architecture, Computer Hardware, Computer Networks, Computer Servers, Computer Systems, Electrical Engineering, GPU (Graphics Processing Unit), Gaming, Git, Hardware Installation, Identify Issues, Jenkins, Laboratory Management, Leading Edge Technology, Lift/Move 50 Pounds, Linux Operating System, Mechanical Assembly, Network Operations Center, Network Systems, Operating Systems, Purchasing/Procurement, Python Programming/Scripting Language, Research & Development (R&D), Server Clusters, Server Hardware, Software Development, System Validation, Systems Administration/Management, Technical Support, Writing Skills
LOCATION
Milpitas, CA
POSTED
3 days ago

Role Overview


We're looking for a hands-on engineer/technician to assist with the setup, maintenance, and operation of our high-performance computing cluster.

This role is ideal for someone with practical experience in Linux systems in the data center who enjoys working in a fast-paced technical environment.



Key Responsibilities


Rack, stack, cable, and maintain equipment in the AI data center and lab.

Perform routine maintenance and troubleshooting on Linux servers, storage and networking systems.

Use tools to monitor and troubleshoot hardware issues.

Work closely with engineers and developers to ensure smooth operation of the AI infrastructure.



Required Skills/Experience


Experience with assembly of mechanical or electrical systems, or performing component-level repairs and troubleshooting on technical equipment.

Ability to lift/move 50 lb (23 kg) of equipment and to exert yourself physically over extended periods, including frequent bending, kneeling, climbing, pushing/pulling, and lifting.

Experience working within a data center or network operations center environment.

Comfortable working in a Linux environment, with the ability to diagnose and troubleshoot issues in operating systems, computer/server hardware, or the networking stack.

Able to write and understand simple Bash or Python scripts.

Exposure to Git, Jenkins, or similar tools is a plus.



Role Overview

This role is a hands-on, hardware-focused technical support position supporting GPU/compute clusters in an AI lab/R&D environment. The emphasis is on hardware troubleshooting, Linux-based system support, and deep understanding of compute architecture, rather than software development.


Key Responsibilities


Troubleshoot GPU/CPU servers, compute clusters, and networking (InfiniBand)

Diagnose hardware issues (cabling, components, GPUs, servers)

Rack/stack work is initially limited (systems are already built) but may increase if the role is extended

Replace/install server components within racks

Use Linux command line extensively for diagnostics and system validation

Manage lab space and hardware inventory (re-procurement access provided)



Must-Have Skills (Non-Negotiable)


Strong hardware troubleshooting experience (servers, GPUs, compute systems)

Solid understanding of computer/compute architecture

Strong Linux skills for system bring-up and troubleshooting

Experience with GPUs and high-performance compute environments

Ability to independently diagnose and resolve hardware/system issues



Preferred / Nice-to-Have


Prior data center or HPC/compute cluster experience (plus, not mandatory)

Scripting experience (Bash, Python), expected if the candidate has held similar roles

Familiarity with GPU technologies (cutting-edge R&D GPUs; Tesla, etc.)

Candidates who've built systems themselves (gaming PCs, lab servers, small data centers)



Experience & Education


Minimum: 3 to 4 years of relevant experience (not pure sysadmin only)

Bachelor's degree preferred, but experience matters more than degree

No travel required

Background Check : No

Drug Screen : No

About the Company

Blue Ribbon Global Technologies