Cheatsheet: From a line to a probability: logistic regression
The idea in two steps
| Step | Formula | Role |
|---|---|---|
| 1. Linear part | z = intercept + coefficient * feature | same weighted sum as linear regression |
| 2. Squash | probability = sigmoid(z) | bend z into the range 0 to 1 |
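
The two steps in the table can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the function names `sigmoid` and `predict_probability` are made up for this example, and the sigmoid is assumed to be the standard logistic function 1 / (1 + e^(-z)).

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number into the open interval (0, 1).

    Plain textbook form; fine for modest z, but math.exp can overflow
    for extremely negative z.
    """
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(feature: float, intercept: float, coefficient: float) -> float:
    # Step 1: linear part, the same weighted sum as linear regression.
    z = intercept + coefficient * feature
    # Step 2: squash z into a probability between 0 and 1.
    return sigmoid(z)


# Illustrative values only: intercept -4 and coefficient 1 give z = 0 at feature = 4.
p = predict_probability(feature=4.0, intercept=-4.0, coefficient=1.0)
print(p)
```

With several features, step 1 becomes a sum of one `coefficient * feature` term per feature; step 2 stays the same.
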
The sigmoid (S-curve)
| Input z | Output (probability) |
|---|---|
| very negative | near 0 |
| -2 | about 0.12 |
| 0 | exactly 0.50 |
| 2 | about 0.88 |
| very positive | near 1 |
Decision and boundary
| Concept | Rule |
|---|---|
| Decision (threshold 0.5) | probability >= 0.5 -> yes, else no |
| Equivalent test | z >= 0 -> yes (same thing) |
| Decision boundary | where probability = 0.5, i.e. where z = 0 |
| Boundary shape | a straight line or flat surface |
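
The claim that testing `probability >= 0.5` is the same as testing `z >= 0` can be checked directly. The sketch below assumes the standard logistic sigmoid; the helper names `decide_by_probability` and `decide_by_score` are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def decide_by_probability(z: float, threshold: float = 0.5) -> bool:
    # Decision made on the squashed output.
    return sigmoid(z) >= threshold


def decide_by_score(z: float) -> bool:
    # Decision made on the linear score, skipping the sigmoid entirely.
    return z >= 0.0


# A few scores on both sides of the boundary, plus the boundary itself.
scores = [-3.0, -0.5, 0.0, 0.5, 3.0]
agree = all(decide_by_probability(z) == decide_by_score(z) for z in scores)
print(agree)
```

The two tests agree because the sigmoid is strictly increasing and equals 0.5 at z = 0. The shortcut holds only for the 0.5 threshold; any other threshold corresponds to a different cutoff on z.
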
Worked example (pass an exam from hours studied)
| hours | z = -4 + hours | probability | decision |
|---|---|---|---|
| 2 | -2 | ~0.12 | FAIL |
| 4 | 0 | 0.50 | on the boundary (PASS under the >= 0.5 rule) |
| 6 | 2 | ~0.88 | PASS |
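
The rows above can be reproduced with a short script. It uses the intercept (-4) and coefficient (1) implied by the column header `z = -4 + hours`; the function name `pass_probability` is made up for this sketch. Note that hours = 4 sits exactly on the boundary, which the `>= 0.5` rule resolves to PASS.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def pass_probability(hours: float) -> float:
    # Linear part from the table: z = -4 + 1 * hours.
    z = -4.0 + 1.0 * hours
    return sigmoid(z)


for hours in (2, 4, 6):
    p = pass_probability(hours)
    # Threshold rule: probability >= 0.5 means "yes" (PASS).
    decision = "PASS" if p >= 0.5 else "FAIL"
    print(f"hours={hours}  z={-4 + hours}  probability={p:.2f}  decision={decision}")
```
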
Fitting and coefficients
| Item | Note |
|---|---|
| How it is fit | gradient descent (no closed-form formula) |
| Loss | log loss (cross-entropy): rewards confident-correct, punishes confident-wrong |
| Positive coefficient | pushes probability of “yes” up |
| Negative coefficient | pushes probability of “yes” down |
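
To make the fitting step concrete, here is a minimal sketch of batch gradient descent on log loss for a single feature. The dataset is invented for illustration, and the learning rate and iteration count are arbitrary choices, not recommended settings. Real libraries typically use more refined optimizers.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy data: hours studied and whether the student passed (1) or failed (0).
# The classes overlap slightly, so the best-fit coefficients stay finite.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
passed = [0, 0, 0, 1, 0, 1, 1, 1]
n = len(hours)


def log_loss(intercept: float, coefficient: float) -> float:
    """Average log loss: small when confident and correct, large when confident and wrong."""
    total = 0.0
    for x, y in zip(hours, passed):
        p = sigmoid(intercept + coefficient * x)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / n


intercept, coefficient = 0.0, 0.0
learning_rate = 0.1  # assumed value, chosen small enough to be stable on this data
initial_loss = log_loss(intercept, coefficient)

for _ in range(5000):
    grad_intercept = 0.0
    grad_coefficient = 0.0
    for x, y in zip(hours, passed):
        # For log loss, the gradient of each example is (prediction - label) times the input.
        error = sigmoid(intercept + coefficient * x) - y
        grad_intercept += error
        grad_coefficient += error * x
    intercept -= learning_rate * grad_intercept / n
    coefficient -= learning_rate * grad_coefficient / n

final_loss = log_loss(intercept, coefficient)
print(intercept, coefficient, initial_loss, final_loss)
```

On this data the fitted coefficient comes out positive, matching the table: more hours push the probability of "yes" up.
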
Pitfalls
| Pitfall | Reality |
|---|---|
| “Regression” means it predicts a number | despite the name, it is used as a classifier; its output is a probability |
| Probability is exact truth | it is a model estimate; can be confidently wrong |
| 0.5 is the only threshold | move it when error costs are unequal |
| Boundary can curve | it is straight; curved data needs engineered features |
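
The last two pitfalls lend themselves to a quick sketch. The first part moves the threshold; the second shows how an engineered feature (here `x * x`) yields a boundary that is curved in terms of the original `x` while the model stays linear in its features. All numbers, including the 0.8 threshold and the coefficients in `curved_z`, are assumptions chosen for illustration.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def decide(probability: float, threshold: float = 0.5) -> bool:
    return probability >= threshold


# Moving the threshold: the same probability can flip from "yes" to "no".
p = sigmoid(0.4)  # roughly 0.60
default_decision = decide(p)                # uses the 0.5 threshold
strict_decision = decide(p, threshold=0.8)  # demands more confidence before saying "yes"


def curved_z(x: float) -> float:
    # Still a weighted sum, but of the engineered feature x * x.
    # z = 0 at x = -1 and x = +1, so "yes" lies outside that interval.
    return -1.0 + 1.0 * (x * x)


inside = decide(sigmoid(curved_z(0.0)))
left = decide(sigmoid(curved_z(-2.0)))
right = decide(sigmoid(curved_z(2.0)))
print(default_decision, strict_decision, inside, left, right)
```

A stricter threshold trades missed "yes" cases for fewer false alarms, which is the point of moving it when error costs are unequal.
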