Principal Site Reliability Engineering Manager - CTJ - Secret (Cleared Environments)
Microsoft Corporation • Redmond, WA • Posted June 13, 2026
About the Role
**Overview**
Microsoft **Substrate** is the foundational cloud platform that powers many of Microsoft’s most critical services, including **Exchange Online** and **M365 Copilot**, providing shared infrastructure, identity, messaging, storage, and service-to-service capabilities used across Microsoft 365 and related cloud offerings. Substrate services operate at global scale and are designed to deliver high availability, reliability, and security for some of the world’s most demanding workloads.
We are seeking a **Principal Site Reliability Engineering Manager** to lead a team responsible for building and operating Substrate services in **highly regulated environments**. In these environments, success depends not only on operational excellence but also on **strong software engineering fundamentals**: clean, maintainable code; sound design decisions; robust telemetry; and disciplined engineering lifecycle practices that make systems reliable by design.
<...