Practice: What machine learning actually is
Self-check
Seven short questions. Try to answer each one in your head (or on paper) before reading the answer beneath it. Active retrieval is where the learning sticks; rereading is comfortable but does much less.
1. In one sentence, what is the core flip that separates machine learning from traditional programming?
In traditional programming you write the rules and the computer runs them. In machine learning you provide labeled examples and the machine infers the rules itself. You supply the data and the answers; it discovers the logic.
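To make the flip concrete, here is a minimal sketch that puts a hand-written rule next to one inferred from labeled examples. The toy data (message length paired with a spam label) and the function names are assumptions made up for illustration. Real learners search far richer rule spaces than a single threshold.

```python
# Hypothetical toy data: (message length in characters, label); 1 = spam, 0 = not spam.
examples = [(12, 0), (20, 0), (35, 0), (80, 1), (95, 1), (120, 1)]


def handwritten_rule(length):
    # Traditional programming: a person chooses the threshold.
    return 1 if length > 50 else 0


def learn_threshold(labeled):
    # Machine learning in miniature: try each observed length as a threshold
    # and keep the first one that makes the fewest mistakes on the examples.
    candidates = sorted(x for x, _ in labeled)
    best_t, best_errors = None, len(labeled) + 1
    for t in candidates:
        errors = sum((1 if x > t else 0) != y for x, y in labeled)
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


threshold = learn_threshold(examples)


def learned_rule(length):
    # The rule's constant came from the data, not from the programmer.
    return 1 if length > threshold else 0


print(threshold)          # 35
print(learned_rule(100))  # 1
```

Both functions have the same shape. The only difference is where the number inside them came from.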
2. When is learning from data the better choice than writing a rule?
When the rules are too many, too fuzzy, or unknown to you, and you have plenty of examples to learn from. “Is there a cat in this photo” cannot be written as a rule, but it can be learned from labeled photos. When the rule is simple and known, write the rule.
3. What single question splits supervised from unsupervised learning?
Do your examples come with the right answer (a label) attached? If yes, it is supervised. If no, it is unsupervised. Labels are the dividing line.
4. Name the two flavors of supervised learning and what tells them apart.
Regression, when the answer is a number (a price, a temperature), and classification, when the answer is a category (spam or not-spam, which digit). What separates them is the kind of answer being predicted, not the method.
5. Give two things unsupervised learning does, since it has no labels to predict.
Clustering (grouping similar items together when nobody defined the groups in advance) and dimensionality reduction (compressing many features down to a few that still capture most of what matters). Both find structure rather than predict a known answer.
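As a rough illustration of clustering, the sketch below groups one-dimensional values into two clusters with a stripped-down k-means loop. The data, the function name, and the choice of two clusters are assumptions for illustration. Library implementations handle many dimensions, smarter initialization, and convergence checks.

```python
def two_means_1d(values, iterations=10):
    # Start the two centers at the smallest and largest value.
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to the nearer center (ties go to the first).
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical unlabeled data: nobody said which values belong together.
heights_cm = [150, 152, 155, 180, 183, 185]
centers, groups = two_means_1d(heights_cm)
print(groups)  # [[150, 152, 155], [180, 183, 185]]
```

No label appears anywhere in the input; the two groups emerge from the distances between the values alone.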
6. Give one situation where machine learning is the wrong tool.
Any of: a simple known rule already works (computing sales tax), you have no data to learn from, or an unexplainable mistake is unacceptable and a statistical black box cannot be trusted. Reaching for machine learning reflexively is a real mistake.
7. What is the one rule that decides whether a model is any good?
How it performs on new data it has never seen, not how well it fits the data it trained on. A model can “succeed” by memorizing its training examples and then fail completely on anything new. Generalization is the whole game.
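The gap between fitting and generalizing can be sketched with a deliberately exaggerated "memorizer". The data (points on y = 2x), the lookup-table model, and the helper names are assumptions for illustration. Real overfitting is subtler than a dictionary lookup.

```python
# Hypothetical data following y = 2x, split into training and unseen test examples.
train = [(1, 2), (2, 4), (3, 6), (4, 8)]
test = [(5, 10), (6, 12)]

# "Memorizer": a lookup table of the training examples.
lookup = dict(train)


def memorizer(x):
    # Has no answer for inputs it never saw, so it falls back to 0.
    return lookup.get(x, 0)


# Simple learner: fit y = w * x by least squares (closed form for one weight).
w = sum(x * y for x, y in train) / sum(x * x for x, _ in train)


def linear_model(x):
    return w * x


def mean_abs_error(model, data):
    return sum(abs(model(x) - y) for x, y in data) / len(data)


print(mean_abs_error(memorizer, train))     # 0.0
print(mean_abs_error(memorizer, test))      # 11.0
print(mean_abs_error(linear_model, train))  # 0.0
print(mean_abs_error(linear_model, test))   # 0.0
```

Both models score perfectly on the training data. Only the error on the held-out examples tells them apart.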
Try it yourself: name the problem
For each task below, decide: is it supervised (and if so, regression or classification), unsupervised, or neither (just a rule)? Use the two-question test: do I have labeled answers, and if so, is the answer a number or a category?
A. Estimate how many minutes a delivery will take, from distance and traffic.
B. Sort a photo library into groups of similar-looking images, no categories given.
C. Flag an email as phishing or safe.
D. Convert a temperature from Celsius to Fahrenheit.
E. Reduce a survey with 200 questions down to a handful of underlying themes.
- A: supervised, regression. The answer is a number (minutes), and you can learn from past deliveries with known times.
- B: unsupervised, clustering. No labels; the goal is to find groups that were not defined in advance.
- C: supervised, classification. The answer is a category (phishing / safe), learned from past labeled emails.
- D: neither. It is an exact formula (F = C × 9/5 + 32). Write the rule; do not train a model.
- E: unsupervised, dimensionality reduction. Many features compressed into a few underlying themes, with no answer to predict.
The test that carries you through all five: first ask whether you have labeled answers, then ask what kind of answer it is. That single habit decides the whole shape of a machine learning project.
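The two-question test is simple enough to write down as a function. The sketch below is one hypothetical encoding: the function name, the argument names, and the extra `exact_rule_known` flag (covering the "neither" option) are assumptions, not part of any library.

```python
def name_the_problem(has_labels, answer_kind=None, exact_rule_known=False):
    # Implied by the "neither" option: if an exact rule exists, write it.
    if exact_rule_known:
        return "neither: write the rule"
    # Question 1: do the examples come with labeled answers?
    if not has_labels:
        return "unsupervised"
    # Question 2: is the answer a number or a category?
    if answer_kind == "number":
        return "supervised, regression"
    if answer_kind == "category":
        return "supervised, classification"
    raise ValueError("answer_kind must be 'number' or 'category' when labels exist")


def celsius_to_fahrenheit(c):
    # Task D: an exact formula, so no model is needed.
    return c * 9 / 5 + 32


print(name_the_problem(True, "number"))                  # task A: supervised, regression
print(name_the_problem(False))                           # tasks B and E: unsupervised
print(name_the_problem(True, "category"))                # task C: supervised, classification
print(name_the_problem(False, exact_rule_known=True))    # task D: neither: write the rule
print(celsius_to_fahrenheit(100))                        # 212.0
```

The two questions do not separate clustering from dimensionality reduction. That last step depends on whether the goal is to group items or to compress features.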
Flashcards
Ten cards. Each question is followed by its answer.
Q. What is the core flip behind machine learning?
You stop writing the rules. Instead you provide labeled examples and the machine infers the rules itself. The logic is discovered from data rather than written by you.
Q. When should you prefer learning from data over writing a rule?
When the rules are too many, too fuzzy, or unknown, and you have lots of examples. When the rule is simple and known, just write the rule.
Q. What single question separates supervised from unsupervised learning?
Do the examples come with the right answer (a label) attached? Yes means supervised; no means unsupervised.
Q. What are the two flavors of supervised learning?
Regression (the answer is a number, like a price) and classification (the answer is a category, like spam or not-spam).
Q. What does unsupervised learning do, given it has no labels?
It finds structure in unlabeled data: clustering (grouping similar items) and dimensionality reduction (compressing many features into a few).
Q. Name a third paradigm beyond supervised and unsupervised.
Reinforcement learning: an agent learns by trial and error against a reward signal. It sits outside this track’s classical canon.
Q. Give one case where machine learning is the wrong tool.
A simple rule already works, or you have no data to learn from, or an unexplainable mistake is unacceptable. A learned model is a pattern-finder, not a guarantee.
Q. What is the one rule that judges every model?
How it performs on data it has never seen, not how well it fits the data it trained on. Generalization is the whole game.
Q. What does it mean when a model 'memorizes' instead of learning?
It fits every quirk of the training data, looks perfect on those examples, and then fails on new data. It learned the noise, not the pattern.
Q. Does a machine learning model 'understand' the problem?
No. It finds statistical patterns, not meaning. That is enough to be useful, and enough to be occasionally and confidently wrong.