Onsite Role//Software Development Engineer, Release//San Jose, CA

TPI Global (formerly Tech Providers, Inc.)

San Jose, CA

JOB DETAILS
SKILLS
Automation, Computer Science, Continuous Deployment/Delivery, Continuous Improvement, Continuous Integration, Detail Oriented, Distributed Computing, Docker, GPU (Graphics Processing Unit), Open Source, Performance Analysis, Performance Metrics, Problem Solving Skills, Release Management/Engineering, Software Development, Software Engineering, Systems Reliability, Test Plan/Schedule, Testing
LOCATION
San Jose, CA
POSTED
1 day ago
Hi,
 
My name is Sudheer, and I work with TPI Global Solutions. We are a global talent solutions firm providing a wide range of staffing solutions to companies and candidates within the technology field.
 
I reviewed your profile on a job portal and believe your background could be a great fit for an exciting opportunity I am working on right now.
 
Title: Software Development Engineer, Release
Client: AMD
Req ID: 202-1
Location: San Jose, CA
 
THE ROLE:
We are seeking a skilled and motivated Software Development Engineer to join our Training at Scale team.
In this role, you will develop tools and automation to support large-scale model training on the latest AMD GPUs.
You’ll work closely with engineers across teams to optimize training workloads, manage CI/CD pipelines, and ensure reliable, high-performance releases. This is a hands-on engineering position with a strong focus on distributed systems, performance, and automation at scale.
 
THE PERSON:
The ideal candidate brings deep experience in open-source software (OSS) release cycles and container-based packaging (e.g., Docker), along with strong debugging skills, particularly around model training workloads. You thrive in fast-paced environments and are passionate about automation, system reliability, and continuous improvement.
 
KEY RESPONSIBILITIES:
•Manage and maintain nightly builds for multiple training frameworks
•Collaborate on integrating new training workloads and expanding test coverage
•Ensure the stability and releasability of the main branch at all times
•Update and maintain build processes to support biweekly release and performance goals
•Handle and deliver ad-hoc development test builds as requested
•Track build performance and reliability metrics over time
 
PREFERRED EXPERIENCE:
•Experience with open-source software contributions and release management
•Strong hands-on experience with Docker and container-based workflows
•Excellent problem-solving skills and attention to detail
•Ability to work independently and willingness to learn new technologies quickly
 
ACADEMIC CREDENTIALS:
•Bachelor’s degree in Computer Science, Engineering, or a related technical field

Sudheer Paswan
Sr Executive Resourcing | TPI Global Solutions
+1 4706329257
spaswan@tpiglobalsolutions.com
www.tpiglobalsolutions.com
 
 
