Summary: From a line to a probability: logistic regression
Logistic regression turns the line you already know into a classifier: it computes the same weighted sum of inputs, then squashes it through an S-shaped curve into a probability between 0 and 1. It is the simplest classifier and the opener of Phase 2, and it is the first model we actually fit with gradient descent. This summary is the scan version of the full lesson.
Core ideas
- A straight line fails for yes/no. It is unbounded (it predicts above 1 and below 0) and the wrong shape. A probability must live between 0 and 1.
- The fix is a squash. Compute `z = intercept + coefficient * feature` (the linear part), then pass `z` through the sigmoid, an S-curve mapping any number to 0..1. Very negative goes near 0, very positive near 1, zero to exactly 0.5.
- The decision boundary is where probability = 0.5, which is exactly where `z = 0`. Underneath the curve, logistic regression still draws a straight boundary between the classes.
- A threshold turns probability into a decision. Default 0.5: predict yes if the probability is at least 0.5. The threshold is movable when error costs are unequal.
- It is fit by gradient descent, not a closed-form formula, minimizing a loss that punishes confident-wrong predictions hard.
- Coefficients read like before: a positive coefficient pushes the probability of “yes” up; size is strength.
- The name is misleading: logistic “regression” is a classifier.
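
The core ideas above can be sketched in a few lines of plain Python. This is a minimal illustration, not the lesson's own code: the toy data (hours studied versus pass/fail) is made up, the function names `sigmoid`, `predict_proba`, `log_loss`, and `fit` are hypothetical, and the loss is assumed to be the usual log loss (cross-entropy), which is the standard choice for logistic regression.

```python
import math

def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    # Two branches avoid overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(x, intercept, coefficient):
    """Linear part first, then the squash."""
    z = intercept + coefficient * x
    return sigmoid(z)

def log_loss(xs, ys, intercept, coefficient):
    """Average log loss: confident-wrong predictions cost the most."""
    eps = 1e-12  # keeps log() away from exactly 0
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(predict_proba(x, intercept, coefficient), eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)

def fit(xs, ys, learning_rate=0.5, steps=5000):
    """Plain batch gradient descent on the average log loss."""
    intercept, coefficient = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_intercept = 0.0
        grad_coefficient = 0.0
        for x, y in zip(xs, ys):
            error = predict_proba(x, intercept, coefficient) - y
            grad_intercept += error        # d(loss)/d(intercept)
            grad_coefficient += error * x  # d(loss)/d(coefficient)
        intercept -= learning_rate * grad_intercept / n
        coefficient -= learning_rate * grad_coefficient / n
    return intercept, coefficient

# Made-up toy data: hours studied -> passed (1) or failed (0).
# The classes overlap slightly, so the fitted weights stay finite.
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

intercept, coefficient = fit(hours, passed)

# The decision boundary is where z = 0, i.e. probability = 0.5.
boundary = -intercept / coefficient

print("intercept:", round(intercept, 3), "coefficient:", round(coefficient, 3))
print("boundary (hours):", round(boundary, 3))
print("P(pass | 1 hour): ", round(predict_proba(1.0, intercept, coefficient), 3))
print("P(pass | 3 hours):", round(predict_proba(3.0, intercept, coefficient), 3))
```

The coefficient comes out positive here, which matches the reading rule above: more hours push the probability of "yes" up. The learning rate and step count are arbitrary choices that happen to work for this tiny dataset; real data usually needs feature scaling or a library optimizer.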
What changes for you
When a model reports it is “90 percent confident,” you now know where that number likely comes from: a score squashed through a sigmoid, exactly as in this lesson, just at the end of a much bigger network. You also gain a practical lever: the 0.5 threshold is a default, not a law. A fraud detector or a medical screen can dial it down to catch more true cases at the price of more false alarms, and that dial is the same idea the evaluation phase formalizes with precision and recall. The deeper takeaway is continuity: classification did not require throwing out regression; it reused the line and added a squash. The next lesson breaks from that lineage entirely, classifying by asking a sequence of questions instead of drawing a boundary.
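
To make the threshold lever concrete, here is a small sketch of what happens when the cutoff moves from 0.5 down to 0.3. The probabilities and labels are invented for illustration, and the helper names `confusion_counts` and `precision_recall` are hypothetical, not part of any library.

```python
def confusion_counts(probabilities, labels, threshold):
    """Count true/false positives and negatives at a given threshold."""
    tp = fp = fn = tn = 0
    for p, y in zip(probabilities, labels):
        predicted_yes = p >= threshold  # "at least the threshold" means yes
        if predicted_yes and y == 1:
            tp += 1
        elif predicted_yes and y == 0:
            fp += 1  # false alarm
        elif not predicted_yes and y == 1:
            fn += 1  # missed true case
        else:
            tn += 1
    return tp, fp, fn, tn

def precision_recall(tp, fp, fn):
    """Precision: how many alarms were real. Recall: how many real cases were caught."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Made-up model outputs and true labels (1 = true case, 0 = not).
probabilities = [0.05, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.95]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

for threshold in (0.5, 0.3):
    tp, fp, fn, tn = confusion_counts(probabilities, labels, threshold)
    precision, recall = precision_recall(tp, fp, fn)
    print(f"threshold={threshold}: caught={tp} missed={fn} false_alarms={fp} "
          f"precision={precision:.2f} recall={recall:.2f}")
```

On this toy data, lowering the threshold catches the one true case that the default missed, at the cost of one extra false alarm. Which side of that trade is worth paying depends on the error costs of the application.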