Senior Systems Software Engineer, Observability and Telemetry Platform

NVIDIA • Santa Clara, CA • Posted June 29, 2026

About the Role

Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline for designing, building and maintaining large-scale production systems with high efficiency and availability, combining software and systems engineering practices. It is a highly specialized discipline that demands knowledge across systems, networking, coding, databases, capacity management, continuous delivery and deployment, and open source cloud-enabling technologies such as Kubernetes and OpenStack. A Senior Systems Software Engineer (SRE) at NVIDIA ensures that our internal and external facing GPU cloud services run with maximum reliability and uptime, as promised to users, while enabling developers to change existing systems through careful preparation and planning, with an eye on capacity, latency and performance. SRE is also a mindset and a set of engineering approaches to running and optimizing production systems. ...