Overview
We are seeking an elite Solutions Architect to lead the end-to-end design, sizing, and deployment of NVIDIA AI Factory-aligned infrastructure. In this highly technical, customer-facing role you will translate complex AI and machine learning workload requirements into fully engineered infrastructure solutions spanning colocation facilities, GPU compute, high-performance networking, parallel storage, and the complete NVIDIA AI software stack.
You will serve as a trusted technical advisor to enterprise and hyperscale customers, partnering with sales, product, and engineering teams to win and deliver transformational AI infrastructure programs. Your expertise will directly shape how organizations build and operate production AI Factories capable of training frontier models, running large-scale inference fleets, and accelerating data science pipelines at scale.
Your Impact
Candidates must demonstrate deep, hands-on expertise across the following technology domains:
GPU Compute | DGX B200 / B300, DGX H100 / H200, HGX B200 / B300, HGX H100 / H200, MGX platforms, GB300 NVL72 / GB200 NVL72, RTX PRO 6000 Blackwell Server Edition, NVLink Switch System, NVLink-C2C |
Networking | NVIDIA Quantum InfiniBand (NDR 400G, HDR 200G), Spectrum-X Ethernet, ConnectX-8 / ConnectX-7 HCAs, BlueField-3 DPU, SHARP in-network computing, UFM Fabric Manager, RDMA / RoCEv2 / InfiniBand |
Storage | VAST Data Universal Storage (NFS/S3/POSIX), Hammerspace Global Data Environment, Pure Storage FlashBlade//E (Evergreen//One), NFS-over-RDMA, parallel file systems (Lustre, GPFS, WEKA), S3-compatible object storage |
AI Software | NVIDIA AI Enterprise (NVAIE), NIM Microservices, RAPIDS (cuDF, cuML, cuGraph), NVIDIA Dynamo, CUDA Toolkit, cuDNN, NCCL, TensorRT, Triton Inference Server |
Cluster Mgmt | Base Command Manager, DGX OS, NVIDIA Mission Control, DGX Cloud, UFM, IPMI / Redfish BMC management |
Orchestration | Kubernetes (K8s), NVIDIA GPU Operator, Run:ai GPU scheduling, Slurm, Open MPI, Helm, Argo Workflows, Kubeflow, KServe |
Colocation | Critical power design (kW – MW), UPS / generator, CRAC / CRAH / DLC / immersion cooling, hot-aisle containment, PUE optimization, carrier-neutral telecom, cross-connects, MMR design |
Frameworks | PyTorch, JAX, TensorFlow, Hugging Face Transformers, DeepSpeed, Megatron-LM, vLLM, LMDeploy |
Qualifications
Preferred Qualifications:
Technical Depth | End-to-end AI infrastructure expertise from silicon to software; ability to go deep on any layer of the stack. |
Systems Thinking | Ability to reason holistically about performance, reliability, power, cost, and operability trade-offs across complex integrated systems. |
Customer Obsession | Relentless focus on understanding customer AI objectives and delivering solutions that accelerate time-to-value. |
Executive Presence | Confidence and clarity when presenting complex technical architectures to senior business and technology leaders. |
Analytical Rigor | Data-driven approach to workload sizing, performance modeling, and TCO analysis with attention to detail. |
Collaborative Leadership | Ability to lead cross-functional pursuit teams, align internal stakeholders, and orchestrate complex delivery programs. |
Position Specifics
The initial base salary range for this position is expected to be between $170,000 and $190,000 annually. The final base salary offered will be determined by multiple factors, including, but not limited to, job-related knowledge, depth of experience, skills, certifications, and geographic location. In addition to the base salary, our compensation structure may include other components such as commissions and discretionary bonuses.
ePlus offers a full range of medical, financial, and/or other benefits (including 401(k) eligibility, employee stock purchase program and various paid time off benefits, such as vacation, sick time, and personal leave), dependent on the position offered. Details of participation in these benefit plans will be provided if an offer of employment is extended.
If hired, the employee will be in an “at-will position,” and the Company reserves the right to modify base salary (as well as any other discretionary payment or compensation program) at any time, including for reasons related to individual performance, Company or individual department/team performance, and market factors.
Who We Are
At ePlus, we believe technology is a people business. Our team is passionate, skilled, and driven to deliver solutions that make a real difference. Join us and be part of a culture that values collaboration, innovation, and extraordinary results.
Corporate Values
Commitment to Diversity, Inclusion and Belonging