Skills

Communication, Python, Kubernetes, Configuration Management, Scripting and Automation, Version Control, Ansible, VMware, Test, Problem-solving, Networking, Linux, Windows, Git, Shell, OpenStack, TCP/IP, Linux Administration

Job Specifications

Sustainable Talent is partnering with NVIDIA, a global leader that has been transforming computer graphics, PC gaming, and accelerated computing for over 25 years.

We are looking for a Systems Engineering Technician to support our client's on-premises private cloud infrastructure team. This is a W-2, full-time contract role based in Hillsboro, OR. We offer competitive pay of $85-100/hr, based on factors such as experience, education, and location, and provide full benefits, PTO, and an amazing company culture!

Do you thrive on cutting-edge technology and crave being challenged in a fast-paced R&D and hyperscale infrastructure environment? If so, this exciting opportunity with NVIDIA won’t disappoint. In this role, you will manage and maintain our state-of-the-art compute farm – composed of builders, packagers, testers and verification rigs – serving a global developer base working on next-generation GPU, AI/ML, and accelerated computing hardware and software. The environment is vast, the scale significant, and the expectations high. We need YOU to help us deliver world-class data centers and labs from our Hillsboro region, enabling deterministic results for our engineering teams and demanding users worldwide.

What You’ll Do

Partner closely with system architects, hardware engineers, firmware/software teams, QA/test, and platform engineers to craft, develop, deploy, debug and release next-generation NVIDIA products.
Manage and maintain a high-availability compute cluster comprising builders/packagers/testers and core support infrastructure (racks, GPU nodes, network interconnects, storage arrays).
Monitor and ensure availability targets, lead system recovery, root-cause failures in compute, network, storage and thermals, and drive rapid remediation.
Deploy, qualify, benchmark and scale new systems and hardware bring-ups in our on-prem environment (including high-density GPU clusters, rack-scale systems, liquid cooling environments).
Coordinate inventory, asset lifecycle, configuration management, decommissioning and refresh tasks across labs, racks and data-hall floors.
Maintain a world-class, safe, clean, organized lab and datacenter environment (cable management, ESD compliance, tool control, mechanical tasks).
Troubleshoot issues across hardware, firmware, OS (Windows, Linux, Mac) and platform-infrastructure with cross-functional platform/ops teams.
Plan, deploy and maintain on-premises infrastructure (power distribution, cooling/thermal management, UPS/battery systems, rack/PDU/power) in collaboration with data-center and network engineering teams.
Drive efficiency improvements in the availability, throughput, and accuracy of test systems while meeting internal SLAs and key operational metrics (e.g., PUE, mean-time-to-repair, throughput of test cycles).
Represent the infrastructure team in internal review meetings and collaborate globally with NVIDIA teams to align on build-out strategy, capacity planning and datacenter operations.

What We Need to See

Associate’s or Bachelor’s degree in Engineering or a Technical Major, or equivalent hands-on experience in infrastructure, hardware, or compute lab environments.
Proven experience operating in datacenter environments or large-scale engineering/test labs, especially with compute-dense/hyperscale hardware.
Familiarity with version control systems (e.g., Git, Perforce) for firmware/software and infrastructure configuration.
Proficiency with infrastructure tools such as DCIM (e.g., Nautobot), scripting and automation (shell, Python, Ansible, etc.).
Solid working knowledge of fundamental network and services protocols (TCP/IP, DNS, NFS, SSL/TLS, IPv6) and experience working with high-bandwidth, low-latency interconnects.
Experience supporting multiple OS platforms (Windows, Mac, Linux), BIOS/firmware updates, driver deployments and system imaging.
Hands-on physical experience with PCBs, GPUs, server/node deployments, rack integration, cooling/power structures, cable/fibre management.
Excellent written and verbal communication skills; ability to translate technical concepts clearly to both technical and non-technical stakeholders.
Strong analytic and problem-solving skills; ability to take ownership, collaborate effectively in fast-moving teams and drive results.

What Makes You Stand Out

Experience deploying or managing HPC or GPU-accelerated clusters, with tools such as Slurm, BCM, Kubernetes, or other orchestration frameworks.
Exposure to cloud and on-premise convergence stacks (OpenStack, VMware, Nutanix, or other private cloud infrastructure).
Certifications such as CCNA/CCNP, or equivalent networking/infrastructure credentials.
Deep background in Windows & Linux administration, dense datacenter design (compute/storage/networking), and hyperscale scale-out systems.
Familiarity with hypervisor/VM applications, container orchestration, virtualized infrastructure, bare-metal provisioning.
Understanding of advanced data-centre infrastructure design: liquid cooling, immersion cool

About the Company

Sustainable Talent delivers in-demand talent on demand, providing AI-driven, human-centric workforce solutions to help companies scale effectively. We specialize in talent acquisition, offering strategic consulting, recruitment, and specialized services that align with our clients' business objectives. Whether you're looking for contingent workers, executive talent, or tech specialists, we ensure the right people are in place at the right time to drive success. We partner with industry leaders like Amazon, Ford, and NVIDIA...