Computer Atlas

Operations and Reliability

Running software in production: deployment, observability, SRE, incident response, and reliability.

Operations and reliability are what keep software running once it ships: deployment, monitoring, on-call, incident response, and the engineering discipline of building systems that stay up.

Core

The essentials. Start here.

Important

What you'll meet next.

Supplemental

Niche, historical, or specialized.