
Cheatsheet: The human-centered view


| In scope | Out of scope |
| --- | --- |
| Engineering side: failure modes, bias as training-data property, measurement and mitigation as design choices | Policy debates: what to permit, regulate, restrict; ethical-theory disputes |
| Treats bias and trustworthiness as measurable design problems | Belongs in different forums with legal / ethics / regulatory stakeholders |
| Sharpens policy debates with measurable inputs | Does not replace policy decisions |


| Failure mode | Mechanism | Engineering response |
| --- | --- | --- |
| Distribution shift | Deployment data differs from training | Broader curation; domain adaptation; domain randomization; monitoring |
| Adversarial examples | Tiny crafted perturbations flip predictions | Adversarial training; certified robustness; input validation |
| Out-of-distribution (OOD) inputs | Input genuinely outside training; model confidently wrong | Calibrated confidence; OOD detection heads; ensemble disagreement |
| Shortcut learning | Model latches onto spurious correlation (wolf-vs-husky-snow) | Dataset curation; augmentation; eval splits that test shortcut-reliance |
| Calibration / overconfidence | Confidence scores misaligned with actual accuracy | Temperature scaling; isotonic regression; deep ensembles |

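
As a rough illustration of the temperature-scaling response listed in the table, the sketch below fits a single temperature on held-out logits by minimizing negative log-likelihood. The grid search stands in for the gradient-based fit that is more usual in practice, and the validation logits, labels, and function names are hypothetical, chosen only to show an overconfident model being softened.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    losses = [-math.log(softmax(row, temperature)[y])
              for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)

def fit_temperature(logit_rows, labels, candidates=None):
    """Pick the temperature from a grid that minimizes validation NLL."""
    if candidates is None:
        candidates = [0.5 + 0.05 * i for i in range(91)]  # 0.5 .. 5.0
    return min(candidates, key=lambda t: nll(logit_rows, labels, t))

# Hypothetical validation set: the model is about 95% confident on every
# example but wrong on one in four, so it is overconfident.
val_logits = [[3.0, 0.0], [3.0, 0.0], [3.0, 0.0], [3.0, 0.0]]
val_labels = [0, 0, 0, 1]

T = fit_temperature(val_logits, val_labels)
calibrated = softmax(val_logits[0], T)
```

Under these assumed numbers the fitted temperature comes out above 1, which pulls the reported confidence down toward the observed 75% accuracy without changing which class is predicted.
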

| Origin | Mechanism |
| --- | --- |
| Training data composition | Model fits data’s statistical structure; uneven data → uneven model |
| Web-scraped datasets | Reflect demographics / geographies / contexts of the web, which are not uniform |
| Different data, different bias profiles | This is an engineering claim, not a permanent property |
| Gender Shades 2018 | Disaggregated the accuracy of commercial gender-classification systems by skin tone + gender; error rates several times higher for darker-skinned women than lighter-skinned men; attributed to training-set skew |


| Practice | What it does |
| --- | --- |
| Per-group accuracy reporting | Reveals sub-group disparities aggregate hides |
| Multi-attribute breakdown | Per demographic, geographic, contextual sub-population simultaneously |
| Audits (Gender Shades style) | Pre-deployment systematic measurement |
| Datasheets for datasets (Gebru et al. 2018) | Standardize documenting composition + intended use |

Aggregate 920/1,000 = 92.0%. Per-group:

| Group | Accuracy |
| --- | --- |
| A | 98.8% |
| B | 98.0% |
| C | 72.0% |
| D | 99.2% |

The aggregate hides a 26-point gap between group C and the next-lowest group (B); group C is the engineering concern.
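
The disaggregation above can be reproduced with a few lines of code. This is a minimal sketch that assumes four equal groups of 250 examples each; that split is not stated in the cheatsheet but is consistent with the 920/1,000 aggregate and the per-group percentages. The function and variable names are illustrative.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """records: iterable of (group, is_correct). Returns {group: accuracy}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, is_correct in records:
        total[group] += 1
        correct[group] += int(is_correct)
    return {g: correct[g] / total[g] for g in total}

# Assumed split: 250 examples per group, with these counts of correct
# predictions (247 + 245 + 180 + 248 = 920).
correct_counts = {"A": 247, "B": 245, "C": 180, "D": 248}
records = [(g, i < n_ok)
           for g, n_ok in correct_counts.items()
           for i in range(250)]

aggregate = sum(ok for _, ok in records) / len(records)
per_group = accuracy_by_group(records)
worst_group = min(per_group, key=per_group.get)

print(f"aggregate: {aggregate:.1%}")
for g, acc in sorted(per_group.items()):
    print(f"group {g}: {acc:.1%}")
print(f"worst group: {worst_group}")
```

Reporting the minimum alongside the aggregate is one simple way to keep a group like C from being averaged away.
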

| Category | Examples |
| --- | --- |
| Data-side | Balanced curation (Inclusive Images, FairFace); targeted collection for underrepresented sub-groups; datasheets for datasets |
| Model-side | Adversarial debiasing; loss reweighting by sub-group; multi-task with fairness-aware auxiliaries |
| Evaluation-side | Disaggregated reporting; stress-test sets probing weak sub-groups; pre-deployment audits |

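
One simple form of the model-side "loss reweighting by sub-group" is inverse-frequency weighting, sketched below. This is only one of several possible reweighting schemes, and the group labels and per-example losses are made up for illustration.

```python
from collections import Counter

def inverse_frequency_weights(groups):
    """Per-example weights so that every group carries equal total weight."""
    counts = Counter(groups)
    n_examples = len(groups)
    n_groups = len(counts)
    return [n_examples / (n_groups * counts[g]) for g in groups]

def weighted_mean_loss(losses, weights):
    """Weighted average of per-example losses."""
    return sum(l * w for l, w in zip(losses, weights)) / sum(weights)

# Hypothetical batch: group A dominates and is easy; group B is rare and hard.
groups = ["A"] * 8 + ["B"] * 2
losses = [0.1] * 8 + [1.0] * 2

weights = inverse_frequency_weights(groups)
plain = sum(losses) / len(losses)              # dominated by group A
balanced = weighted_mean_loss(losses, weights)  # mean of the group means
```

With these weights the training signal from the rare group is no longer swamped by the majority group, at the cost of higher variance when a group has very few examples.
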

| Gap | Detail |
| --- | --- |
| Benchmark vs reality | Test set ≠ production; drift, edge cases, sub-group disparities surface in deployment |
| Closed by | Monitoring; calibration; human-in-the-loop; explicit failure-mode plans |
| What to monitor | Per-group accuracy over time; OOD-input rate; confidence distribution; downstream metrics |
| Calibration matters | Lets system know when to defer (human escalation, refuse to act) |

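
Two of the monitoring signals in the table, per-group accuracy over time and the confidence distribution, can be checked with very little code. The sketch below is illustrative only: the thresholds, baseline figures, and function names are assumptions, not values from this lesson.

```python
def low_confidence_rate(confidences, threshold=0.6):
    """Fraction of predictions whose top confidence falls below threshold."""
    return sum(c < threshold for c in confidences) / len(confidences)

def groups_with_regression(baseline, current, max_drop=0.05):
    """Groups whose accuracy fell more than max_drop below their baseline."""
    return sorted(
        g for g, acc in current.items()
        if g in baseline and baseline[g] - acc > max_drop
    )

# Hypothetical pre-deployment baseline and one production window.
baseline_acc = {"A": 0.98, "B": 0.97, "C": 0.90}
window_acc = {"A": 0.975, "B": 0.90, "C": 0.89}
window_conf = [0.90, 0.55, 0.40, 0.95]

alerts = groups_with_regression(baseline_acc, window_acc)
low_conf = low_confidence_rate(window_conf)
```

A rising low-confidence rate can act as an early warning when production labels arrive too late to compute accuracy directly.
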

| Layer | Purpose |
| --- | --- |
| Trained model | The base classifier / detector / generator |
| Data pipeline | Production input flow; pre-processing |
| Evaluation suite | Pre-deployment + ongoing |
| Monitoring dashboard | Drift detection in production |
| Calibration | Make confidence usable for deferral |
| Human-review queue | Route uncertain predictions to people |
| Failure-mode plan | Rollback procedure; graceful degradation; escalation chain |

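
The calibration and human-review layers meet at a deferral rule: act automatically when confidence is high, otherwise route to a person. The sketch below assumes the probabilities are already calibrated; the 0.8 threshold, the `Decision` record, and the label names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Decision:
    label: str
    confidence: float
    route: str  # "auto" or "human_review"

def route_prediction(probs, labels, defer_below=0.8):
    """Take the top class; defer to human review if its confidence is too low."""
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    route = "auto" if confidence >= defer_below else "human_review"
    return Decision(labels[best], confidence, route)

labels = ["husky", "wolf"]
confident = route_prediction([0.95, 0.05], labels)
uncertain = route_prediction([0.55, 0.45], labels)
```

The threshold is a design choice that trades reviewer workload against the cost of acting on a wrong prediction, so it is usually set from validation data rather than fixed in advance.
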

| Pitfall | Reality |
| --- | --- |
| Bias is one-and-done | Bias is a property of data + model + eval and shifts whenever any of them shifts; it is multi-dimensional; continuous measurement is the posture |
| High test accuracy = guarantee | Distribution shift, shortcut learning, calibration, tail events live in the gap |
| “Trustworthy AI” is one problem | Several problems with different mitigations: distribution shift, robustness, fairness, OOD, calibration, interpretability |
| Engineering = policy | Engineering treats measurable design; policy decides acceptability. Both needed; this lesson is only the engineering view |


| Phase | Lessons | Theme |
| --- | --- | --- |
| 1: Foundations | L1-L4 | General-purpose image classifier |
| 2: How machines see | L5-L9 | Vision-specific architecture (conv, CNNs, sequence, detection, video) |
| 3: Generating and grounding vision | L10-L15 | Harder tasks (self-supervised, GAN/VAE, diffusion, 3D, vision+language, world modeling) |
| Close | L16 | Engineering-side human-centered view |


| To go deeper | Track |
| --- | --- |
| Foundations review | T11 (Neural Network Intuition) |
| Attention + transformers (in depth) | T5 (AI Foundations) |
| Transformer mechanics | T14 (planned, Practical Transformers) |
| Model-based RL | T18 (planned, Reinforcement Learning) |
| ELBO + score-based derivations | T19 (planned, Generative Modeling) |
| Production text-to-image + multimodal + video | T24 (planned, Image Generation and Multimodal) |

Vision systems work mechanically AND fail mechanically; the engineering view treats failure modes, bias, and trustworthiness as measurable design problems with engineering responses (data curation, calibration, monitoring, sub-group reporting, human-in-the-loop, failure plans), while leaving policy debates to the right stakeholders.