Supervised Learning
Learning a function from labelled examples: the most widely deployed flavour of machine learning.
- Primary domain: Machine Learning
- Sub-category: Supervised & Unsupervised Learning
In simple terms
In supervised learning, you show a model many examples, each paired with its correct answer (“this picture is a cat”, “this email is spam”), and it learns to predict the answer for new examples it hasn’t seen. The word “supervised” refers to the labels: a teacher has told the model what each example is.
More detail
The training loop:
- Collect a dataset of input/output pairs.
- Choose a model architecture (linear, tree, neural net, …).
- Define a loss function measuring how wrong the model is.
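- Iteratively adjust the model’s parameters to reduce the loss, usually with gradient descent for neural networks and other differentiable models (tree-based models are fitted with other procedures).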
- Evaluate on a held-out test set the model never saw during training.
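The five steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production recipe: the synthetic dataset (y = 3x + 2 plus noise), the linear model, the mean-squared-error loss, the learning rate and the 80/20 split are all assumptions chosen to keep the example small.

```python
import random

def make_dataset(n, seed=0):
    """Step 1: collect input/output pairs (synthetic here: y = 3x + 2 + noise)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 3.0 * x + 2.0 + rng.gauss(0.0, 0.1)
        data.append((x, y))
    return data

def mse(w, b, data):
    """Step 3: loss function, the mean squared error of the model w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(data, lr=0.1, epochs=500):
    """Step 4: batch gradient descent on the MSE loss."""
    w, b = 0.0, 0.0  # Step 2: the "architecture" is a line with two parameters
    n = len(data)
    for _ in range(epochs):
        # Gradients of the MSE with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

data = make_dataset(200)
train_set, test_set = data[:160], data[160:]  # hold out 20% the model never sees

w, b = train(train_set)
test_loss = mse(w, b, test_set)  # Step 5: evaluate on the held-out set
print(f"w={w:.2f}, b={b:.2f}, test MSE={test_loss:.4f}")
```

The learned `w` and `b` should land close to the values used to generate the data, and the test loss should be close to the noise level, because the held-out examples come from the same distribution as the training examples.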
Two main flavours:
- Classification — predict one of a fixed set of categories. Cat vs. dog. Spam vs. not. Digit 0–9.
- Regression — predict a real number. Tomorrow’s temperature. The selling price of a house.
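The difference between the two flavours shows up in the type of the prediction and in the loss used to score it. The sketch below uses made-up numbers; cross-entropy and squared error are common choices for each flavour, assumed here for illustration rather than taken from the text above.

```python
import math

def cross_entropy(probs, true_class):
    """Classification loss: negative log-probability assigned to the true class."""
    return -math.log(probs[true_class])

def squared_error(pred, target):
    """Regression loss: squared distance between prediction and target."""
    return (pred - target) ** 2

# Classification: the model outputs a probability per category,
# and the prediction is the most probable category.
class_probs = {"cat": 0.7, "dog": 0.3}  # hypothetical model output
predicted_label = max(class_probs, key=class_probs.get)
clf_loss = cross_entropy(class_probs, "cat")  # about 0.357

# Regression: the model outputs a single real number.
predicted_price = 310_000.0  # hypothetical house-price prediction
true_price = 300_000.0
reg_loss = squared_error(predicted_price, true_price)

print(predicted_label, round(clf_loss, 3), reg_loss)
```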
Common pitfalls:
- Overfitting — the model memorises the training set and performs poorly on new data.
- Data leakage — information from the test set sneaks into training (e.g. fitting preprocessing statistics on the combined training and test data).
- Imbalanced classes — accuracy is misleading when 99% of examples are one class.
- Distribution shift — production data differs from training data; performance silently degrades.
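Two of the pitfalls above are easy to demonstrate with toy numbers. The sketch below first shows why accuracy misleads on imbalanced classes, then shows one way to avoid leakage when standardising a feature. All data is made up, and recall is used as one possible alternative metric, not the only one.

```python
import statistics

# Imbalanced classes: 1% positives, and a "model" that always predicts negative.
labels = [1] * 10 + [0] * 990
preds = [0] * len(labels)

accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
true_pos = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
recall = true_pos / sum(labels)  # fraction of real positives that were found

print(accuracy, recall)  # 0.99 accuracy, yet 0.0 recall: no positive is ever caught

# Data leakage: fit preprocessing statistics on the training split only,
# then apply those same statistics to the test split.
train_x = [1.0, 2.0, 3.0, 4.0]
test_x = [10.0, 12.0]

mean = statistics.mean(train_x)   # computed from training data only
std = statistics.pstdev(train_x)
train_scaled = [(x - mean) / std for x in train_x]
test_scaled = [(x - mean) / std for x in test_x]

# The leaky version would let test values influence the statistics:
leaky_mean = statistics.mean(train_x + test_x)
print(mean, leaky_mean)  # the two means differ, so the test set has leaked in
```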
Modern industrial ML is overwhelmingly supervised: search ranking, recommendations, ad targeting, image moderation, machine translation, code completion. Even unsupervised or self-supervised techniques (such as pre-training a language model on raw text) are usually followed by supervised fine-tuning.
Why it matters
Most “AI that works” is supervised learning at scale: huge labelled datasets, well-understood architectures, careful evaluation. The gap between an interesting prototype and a useful product is almost always a problem of labels and evaluation.
Real-world examples
- An email spam filter is a binary classifier trained on millions of labelled emails.
- A medical imaging model learns “tumour vs. healthy tissue” from radiologist annotations.
- A coding assistant is fine-tuned on accepted vs. rejected completions.
- Labeling is now an industry: companies like Scale AI and Surge employ tens of thousands of human raters to produce the labels for everything from autonomous driving to LLM fine-tuning.
Common misconceptions
- “Supervised learning needs perfect labels.” It tolerates noisy labels surprisingly well, especially at scale. It is much less tolerant of biased labels.
- “Bigger model is the answer.” Often more or better-labelled data is. Architecture matters less than people assume.
Learn next
How a trained model is actually used at run time: training and inference. The dominant model family for supervised learning: neural networks.