Distributed Systems and Cloud
Systems that span many machines — consensus, replication, sharding, cloud platforms, and microservices.
Distributed systems are what happens when one computer is not enough. This section covers consensus, replication, sharding, fault tolerance, and the cloud platforms built on top.
Core
The essentials. Start here.
Container
A lightweight, isolated unit of computing — a process running with its own filesystem and namespaces, packaged with its dependencies.
core intermediate technology
Distributed System
A system whose components run on multiple networked computers and coordinate by passing messages.
core intermediate concept
Microservices
An architectural style that builds an application as many small, independently deployable services, each owning its own data and talking over a network.
core intermediate concept
Replication
Keeping copies of the same data on multiple machines for availability, durability, and read scaling.
core intermediate concept
Sharding
Splitting data across multiple machines by some key, so the working set per machine — and the request load — stays manageable as the system grows.
core intermediate concept
Consensus
A protocol that lets a group of machines agree on a single value despite failures and network unreliability — the foundation under replicated databases and leader election.
core advanced concept
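The Sharding entry above describes splitting data across machines by some key. A minimal sketch of one way to do that, assuming simple hash-modulo placement, is shown here. The names `shard_for` and `group_by_shard` are hypothetical and chosen only for illustration; real systems often use consistent hashing or range partitioning instead, because plain modulo placement moves most keys whenever the shard count changes.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in [0, num_shards) using a stable hash.

    hashlib is used instead of the built-in hash() because Python salts
    string hashes per process, which would give different placements
    on different machines.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


def group_by_shard(keys, num_shards):
    """Group keys by the shard they would be stored on (illustrative helper)."""
    shards = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards
```

Calling `group_by_shard(["user:1", "user:2", "user:3"], 4)` returns a dictionary from shard index to the keys placed there; the same key always lands on the same shard as long as `num_shards` stays fixed.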
Important
What you'll meet next.
Cloud Provider
A company that rents computing — servers, storage, networking, and managed services — on demand over the internet, so you don't have to build and run your own data center.
beginner concept
Eventual Consistency
A consistency model where replicas of data are temporarily allowed to disagree, with the guarantee that they will converge once writes stop.
intermediate concept
Kubernetes
An open-source container orchestrator that manages where containers run, how they're scaled, how they're networked, and how they're restarted when things break.
intermediate technology
Load Balancer
A component that distributes incoming traffic across a pool of servers — hiding individual server failures, spreading work evenly, and making the fleet appear as a single endpoint to clients.
intermediate technology
Message Queue
A durable middleman that lets services send and receive messages asynchronously — decoupling producers from consumers, smoothing load, and surviving outages.
intermediate technology
Serverless
A deployment model where you write functions or small services and the platform handles provisioning them, scaling them up under load, and scaling them down to zero when idle; you pay per request, not per server.
intermediate concept
CAP Theorem
A foundational result of distributed systems: under a network partition, a system can be either Consistent or Available, not both.
advanced concept
Service Mesh
An infrastructure layer that handles service-to-service communication for microservices — routing, retries, encryption, and observability — usually via sidecar proxies, so application code doesn't have to.
advanced technology
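The Load Balancer entry above mentions spreading work evenly and hiding individual server failures. A minimal in-process sketch of that idea, assuming a round-robin policy and manually reported health, might look like the following. The class `RoundRobinBalancer` and its methods are hypothetical names for illustration; production load balancers run active health checks and support other policies such as least-connections.

```python
class RoundRobinBalancer:
    """Hand out servers in rotation, skipping any marked as down."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        """Record that a server failed; it will be skipped until marked up."""
        self._healthy.discard(server)

    def mark_up(self, server):
        """Record that a known server has recovered."""
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        """Return the next healthy server, or raise if none are healthy."""
        # Try each server at most once per call so the loop always terminates.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")
```

Clients only ever call `pick()`, so a server that has been marked down simply stops receiving traffic without the caller needing to know which machine failed.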
Supplemental
Niche, historical, or specialized.
Actor Model
A concurrency model where independent actors communicate only through message passing, with no shared memory and no locks, which makes it a natural fit for fault-tolerant, distributed, and highly concurrent systems.
supplemental intermediate concept
Circuit Breaker
A fault-tolerance pattern that stops calling a failing dependency and returns a fast error instead — giving the dependency time to recover and preventing cascading failures across services.
supplemental intermediate concept
Gossip Protocol
A decentralised information-spreading algorithm inspired by rumour propagation — each node periodically exchanges state with a few random neighbours until information reaches the whole cluster.
supplemental intermediate protocol
Idempotency
An operation is idempotent if performing it multiple times has the same effect as performing it once — a property that makes retries safe and distributed systems far easier to reason about.
supplemental intermediate concept
OpenMP
A set of compiler directives and library routines for shared-memory parallelism in C, C++, and Fortran: add a pragma above a loop and the compiler splits its iterations across threads running on the cores of a single machine.
supplemental intermediate technology
Rate Limiting
Controlling how many requests a client or service can make in a time window — protecting backends from overload, preventing abuse, and enforcing fair use across tenants.
supplemental intermediate concept
Saga Pattern
A pattern for managing long-lived distributed transactions by breaking them into a sequence of local transactions, each with a compensating action to undo it if a later step fails.
supplemental intermediate concept
CRDT
A data structure designed so that any two replicas can be merged in any order and still arrive at the same result — enabling eventual consistency without coordination.
supplemental advanced concept
MPI Basics
The standard API for parallel programs that run across many compute nodes, communicating by explicitly sending and receiving messages — the infrastructure behind most scientific supercomputing.
supplemental advanced technology
Paxos
The foundational consensus algorithm that showed distributed agreement can be reached safely despite message loss and crashes. It is notoriously hard to understand, but it is the theoretical backbone of much later consensus work.
supplemental advanced protocol
Raft
A consensus algorithm designed to be understandable — it achieves distributed agreement through leader election and log replication, and has become the go-to replacement for Paxos in modern systems.
supplemental advanced protocol
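The CRDT entry above says that any two replicas can be merged in any order and still arrive at the same result. A minimal sketch of one well-known CRDT, a grow-only counter (G-Counter), illustrates why. The class name `GCounter` and its methods are illustrative choices, not an API from any particular library.

```python
class GCounter:
    """Grow-only counter: each node increments only its own slot."""

    def __init__(self, node_id, counts=None):
        self.node_id = node_id
        # Maps node id -> number of increments observed from that node.
        self.counts = dict(counts or {})

    def increment(self, amount=1):
        """Add to this node's own slot; decrements are not supported."""
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        """The counter's value is the sum of every node's slot."""
        return sum(self.counts.values())

    def merge(self, other):
        """Return a new counter holding the per-node maximum of both replicas.

        Taking the maximum is commutative, associative, and idempotent,
        so replicas converge regardless of merge order or repetition.
        """
        merged = dict(self.counts)
        for node, count in other.counts.items():
            merged[node] = max(merged.get(node, 0), count)
        return GCounter(self.node_id, merged)
```

If replica `a` records two increments and replica `b` records three, then `a.merge(b)` and `b.merge(a)` both report a value of 5, and merging the same state again changes nothing.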