Computer Science

Supervised vs Unsupervised vs Reinforcement Learning

Three Learning Paradigms Compared with Examples

Machine learning is often grouped into three major paradigms: supervised learning, unsupervised learning, and reinforcement learning. Each one solves a different kind of problem based on the type of data and feedback available. Understanding the differences helps students choose the right approach for tasks like prediction, pattern discovery, and decision making. These ideas are central to modern computer science because they power search engines, recommendation systems, robotics, and scientific analysis.

In supervised learning, a model learns from labeled examples so it can map inputs to known outputs. In unsupervised learning, the model receives unlabeled data and tries to find structure such as clusters, patterns, or compressed representations. In reinforcement learning, an agent interacts with an environment and improves by receiving rewards or penalties for its actions over time. The key distinction is the learning signal: labeled correct answers, structure in the data itself, or reward signals.
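The reward-driven loop is the least familiar of the three, so here is a minimal sketch of it. It assumes a hypothetical toy environment with no states (a "bandit") in which each action always pays a fixed reward. The function name `run_greedy_agent` and the reward values are illustrative only. The agent tries every action once, then keeps choosing the action with the highest estimated value.

```python
def run_greedy_agent(arm_rewards, steps):
    """Toy agent: try each action once, then always pick the best estimate.

    arm_rewards is a hypothetical environment: action i always pays
    arm_rewards[i]. Real environments are usually noisy and have states.
    """
    n_actions = len(arm_rewards)
    value_estimates = [0.0] * n_actions  # running average reward per action
    counts = [0] * n_actions             # how often each action was taken
    history = []                         # (action, reward) pairs, in order

    for t in range(steps):
        if t < n_actions:
            action = t  # explore: try every action once
        else:
            # exploit: choose the action with the highest estimate so far
            action = max(range(n_actions), key=lambda a: value_estimates[a])

        reward = arm_rewards[action]  # the environment returns a reward

        # Update the running average for the chosen action.
        counts[action] += 1
        value_estimates[action] += (reward - value_estimates[action]) / counts[action]
        history.append((action, reward))

    return value_estimates, history


estimates, history = run_greedy_agent([1.0, 3.0, 2.0], steps=6)
chosen_actions = [action for action, _ in history]
```

With these assumed rewards, the agent samples actions 0, 1, and 2 once each and then settles on action 1, the one with the highest reward. No labels are involved: the agent only ever sees the reward for the action it actually took.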

Key Facts

  • Supervised learning uses labeled data pairs (x, y) to learn a function y = f(x).
  • A common supervised objective is to minimize prediction error, such as MSE = (1/n) * sum((y - y_hat)^2).
  • Unsupervised learning uses inputs x without labels and often groups data by similarity, such as minimizing within-cluster distance in k-means.
  • In k-means clustering, each point is assigned to the nearest centroid and centroids are updated by the mean of assigned points.
  • Reinforcement learning models decision making with states, actions, rewards, and a policy pi(a|s).
  • A basic reinforcement learning return is the discounted sum G = r1 + gamma*r2 + gamma^2*r3 + ..., where gamma is the discount factor and 0 <= gamma < 1.
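The formulas in the list above can be sketched in a few lines of plain Python. This is an illustrative sketch, not a reference implementation: the function names are hypothetical, the k-means step handles only one-dimensional points, and the sample numbers are made up for the example.

```python
def mse(y_true, y_pred):
    """Mean squared error: (1/n) * sum((y - y_hat)^2)."""
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n


def kmeans_step(points, centroids):
    """One k-means iteration on 1-D points: assign, then update."""
    # Assignment: each point gets the index of its nearest centroid.
    assignments = [
        min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
        for p in points
    ]
    # Update: each centroid moves to the mean of its assigned points.
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, a in zip(points, assignments) if a == k]
        # Assumption: an empty cluster keeps its previous centroid.
        new_centroids.append(sum(members) / len(members) if members else old)
    return assignments, new_centroids


def discounted_return(rewards, gamma):
    """G = r1 + gamma*r2 + gamma^2*r3 + ... for a finite list of rewards."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


# Supervised: compare known labels with predictions.
error = mse([3.0, 5.0], [2.0, 7.0])  # (1 + 4) / 2 = 2.5

# Unsupervised: no labels, only points and initial centroid guesses.
assignments, centroids = kmeans_step([1.0, 2.0, 9.0, 11.0], [0.0, 10.0])
# assignments == [0, 0, 1, 1], centroids == [1.5, 10.0]

# Reinforcement: later rewards count for less.
g = discounted_return([4.0, 0.0, 8.0], gamma=0.5)  # 4 + 0 + 0.25*8 = 6.0
```

In practice `kmeans_step` would be repeated until the assignments stop changing. A single step is shown here to keep the assignment and update phases visible.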

Vocabulary

Label
A label is the correct output or target value attached to a training example in supervised learning.
Cluster
A cluster is a group of data points that are more similar to each other than to points in other groups.
Feature
A feature is a measurable property or input variable used by a model to describe each example.
Agent
An agent is the decision maker in reinforcement learning that chooses actions in an environment.
Reward
A reward is a numerical signal that tells the agent how good or bad the outcome of an action was.

Common Mistakes to Avoid

  • Treating supervised learning as if it works without labels, which is wrong because supervised models need known target outputs during training.
  • Assuming unsupervised learning predicts correct answers directly, which is wrong because it usually finds patterns or structure rather than labeled outcomes.
  • Thinking reinforcement learning gives immediate feedback on every action, like a graded quiz, which is wrong because rewards can be delayed and can depend on sequences of actions.
  • Choosing a learning type based only on algorithm popularity, which is wrong because the correct paradigm depends on whether the problem has labels, hidden structure, or reward-based interaction.

Practice Questions

  1. A student builds a model to predict house price from floor area, number of rooms, and age of the house using past sales with known prices. Which learning paradigm is this, and what are the inputs and labels?
  2. An agent receives rewards 5, 2, and 1 over three time steps with discount factor gamma = 0.5. Compute the return G = r1 + gamma*r2 + gamma^2*r3.
  3. A dataset contains 120 customer records with purchase histories but no category labels. The goal is to group similar customers for marketing. Explain why unsupervised learning is more appropriate than supervised learning.