Data and Databases
Storing, querying, and modelling data — relational and non-relational databases, query languages, and data engineering.
Data and databases cover how information is stored, indexed, queried, and modelled — from SQL and the relational model to document stores, key-value stores, and analytics pipelines.
Core
The essentials. Start here.
Database
An organised collection of data with a system around it that lets you store, retrieve, update, and query it efficiently.
core beginner concept
ACID Transactions
A transaction model that guarantees Atomicity, Consistency, Isolation, and Durability — the contract that keeps databases reliable under failures and concurrency.
core intermediate concept
Indexing
Auxiliary data structures a database maintains so it can answer queries without scanning every row; usually the single most effective lever for query performance.
core intermediate concept
Normalization
A discipline for arranging tables to eliminate redundancy and update anomalies — the design counterpart to the relational model.
core intermediate concept
Relational Model
A model of data as a set of relations (tables of tuples) — the mathematical foundation under SQL databases.
core intermediate concept
SQL
A declarative language for querying and manipulating relational databases: you say what you want, and the database figures out how.
core intermediate language
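To make the ACID Transactions and SQL entries above concrete, here is a minimal sketch of atomicity using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the sample balances are hypothetical, chosen only for illustration. The credit is applied before the debit on purpose, so that the failing debit has an earlier change to undo.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one atomic transaction."""
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if an exception escapes it.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )

transfer(conn, "alice", "bob", 30)        # succeeds and commits

try:
    transfer(conn, "alice", "bob", 500)   # the debit violates the CHECK constraint
except sqlite3.IntegrityError:
    pass                                  # bob's credit was rolled back as well

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer, `balances` still reflects only the first, committed transfer: the credit to `bob` did not survive on its own.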
Important
What you'll meet next.
Key-Value Store
A database that maps keys to values, typically with near-constant-time lookup by key; the simplest kind of data store and one of the most widely deployed.
beginner technology
Data Warehouse
A database optimised for analytics (large-scale querying and aggregation over historical data from across an organisation) rather than for the fast, small transactions of an operational database.
intermediate concept
Document Store
A NoSQL database that stores data as self-contained documents (usually JSON), each holding nested fields and arrays, instead of rows split across relational tables.
intermediate concept
ETL
Extract, Transform, Load — the process of pulling data out of source systems, reshaping and cleaning it, and loading it into a destination like a data warehouse for analysis.
intermediate concept
NoSQL
An umbrella term for databases that don't follow the relational model — key-value stores, document stores, wide-column stores, graph databases.
intermediate concept
ORM
A library that maps rows in a relational database to objects in your programming language, so you can query with code instead of SQL strings.
intermediate tool
Query Plan
The step-by-step program a database derives from your SQL query before executing it — the lever you use to diagnose and fix slow queries.
intermediate concept
Schema Migration
The disciplined, versioned process of changing a database's structure (adding columns, tables, or indexes) over time, in step with the application, without losing data or incurring downtime.
intermediate concept
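The Query Plan entry above can be tried out directly. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3` module to compare the plan for the same query before and after an index is added. The `users` table, the `plan` helper, and the index name `idx_users_email` are hypothetical; the exact wording of the plan text varies between SQLite versions, so the comments describe only what is typically seen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users (email, city) VALUES (?, ?)",
    [(f"user{i}@example.com", "Oslo") for i in range(1000)],
)

def plan(sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the detail text is the last column

query = "SELECT id FROM users WHERE email = ?"
args = ("user42@example.com",)

before = plan(query, args)   # typically reports a full scan of users
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(query, args)    # typically reports a search using idx_users_email

print(before)
print(after)
```

Reading the plan before changing anything is the useful habit here: it shows whether the database intends to scan the table or use an index, which is usually the first question when a query is slow.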
Supplemental
Niche, historical, or specialized.
Columnar Store
A database that stores each column's values contiguously rather than storing each row's values together, enabling far faster analytical queries that aggregate millions of rows but touch only a few columns.
supplemental intermediate concept
Graph Database
A database that stores data as nodes and edges, optimised for traversal queries that follow relationships; multi-hop lookups that would need a long chain of joins in a relational database become direct edge traversals.
supplemental intermediate technology
MVCC
A concurrency technique where writes create new versions of rows rather than overwriting them — readers see a consistent snapshot of the past while writers proceed concurrently, with no read/write blocking.
supplemental intermediate concept
Time-Series Database
A database optimised for storing and querying sequences of timestamped values — metrics, sensor readings, financial ticks — with compression and downsampling built in for the patterns time-series data exhibits.
supplemental intermediate technology
Vector Database
A database optimised for storing and searching high-dimensional embedding vectors by similarity — the storage layer behind semantic search and retrieval-augmented generation.
supplemental intermediate technology
Write-Ahead Log
A durability mechanism where every change is written to an append-only log before it is applied to the main data files — if the system crashes, the log is replayed to recover to the last committed state.
supplemental intermediate concept
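The Write-Ahead Log entry above describes a log-then-apply ordering, which the following toy sketch illustrates. `TinyStore` is a hypothetical class, and in-memory Python lists stand in for files on disk. Real systems add commit records, flushing to stable storage, and checkpoints, none of which are modelled here.

```python
import json

class TinyStore:
    """Toy key-value store with a write-ahead log (lists stand in for files)."""

    def __init__(self, log=None):
        self.log = log if log is not None else []   # append-only log of JSON records
        self.data = {}                               # stands in for the main data files

    def put(self, key, value):
        # 1. Append the change to the log first ...
        self.log.append(json.dumps({"op": "put", "key": key, "value": value}))
        # 2. ... and only then apply it to the main data.
        self.data[key] = value

    @classmethod
    def recover(cls, log):
        """Rebuild state after a crash by replaying the surviving log in order."""
        store = cls(log)
        for line in store.log:
            record = json.loads(line)
            if record["op"] == "put":
                store.data[record["key"]] = record["value"]
        return store

store = TinyStore()
store.put("a", 1)
store.put("b", 2)
store.put("a", 3)

# Simulate a crash: the in-memory data is lost, but the log survives.
surviving_log = list(store.log)
recovered = TinyStore.recover(surviving_log)
```

Because every change reached the log before it reached `data`, replaying the log in order is enough to rebuild the same final state.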