Senior Site Reliability Engineer

Yellosa • Johannesburg, Gauteng • Posted June 13, 2026

About the Role

Responsibilities

  • Own reliability, availability, scalability, and security of production systems
  • Design and operate highly available, fault‑tolerant, multi‑region cloud architectures
  • Define and manage SLOs, SLIs, SLAs, and error budgets for critical services
  • Lead high‑severity incidents and drive effective post‑incident reviews
  • Improve MTTD and MTTR through automation, tooling, and runbooks
  • Operate and evolve Kubernetes (EKS) platforms and multi‑tenant deployments
  • Work with Infrastructure‑as‑Code (Terraform, CloudFormation, Pulumi) at scale
  • Build and improve CI/CD pipelines and deployment safeguards
  • Design and maintain observability (metrics, logs, traces, alerting)
  • Drive capacity planning, performance optimisation, and cloud cost efficiency
  • Partner with Security & Compliance on SOC 2, ISO 27001, GDPR, and DORA controls
  • Mentor SREs and influence reliability‑first engine...