References: safety engineering for AI systems

Dan Hendrycks. Introduction to AI Safety, Ethics, and Society. Taylor & Francis, 2024. Center for AI Safety, free to read at aisafetybook.com. L5 draws from Chapter 4 (Safety Engineering), primarily sections 4.3, 4.4, and 4.7; the broader Chapter 4 also covers Risk Decomposition (4.2), Component Failure Accident Models and Methods (4.5), and Systemic Factors (4.6), which inform the lesson framing.

| Chapter section | Topic | URL |
|---|---|---|
| Ch 4.3 | Nines of Reliability | aisafetybook.com/textbook/nines-of-reliability |
| Ch 4.4 | Safe Design Principles | aisafetybook.com/textbook/safe-design-principles |
| Ch 4.7 | Tail Events and Black Swans | aisafetybook.com/textbook/tail-events-and-black-swans |

A1 discipline preserved: each quote below is verbatim from the cited section, with no paraphrasing inside quotation marks.

  • §4.3 Nines of Reliability, definition: “a system’s nines of reliability indicate the number of consecutive nines at the beginning of its percentage or decimal reliability.”
  • §4.3 Nines of Reliability, scaling property: “an additional nine of reliability means a tenfold increase in expected lifespan.”
  • §4.4 Safe Design Principles, anchor: “There are multiple features we can build into a system from the design stage to make it safer.”
  • §4.7 Tail Events, core framing: “rare and highly extreme events…can dominate the overall expected impact from risks.”
  • §4.7 Tail Events, on prediction difficulty: “we do lack evidence to predict when they will happen or what precise form they will take.”
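The two §4.3 quotes can be made concrete with a minimal Python sketch. The function names are hypothetical, and the lifespan calculation assumes that each use fails independently with the same probability (a geometric model). That is an illustrative assumption, not something the quoted passages specify.

```python
from decimal import Decimal


def nines_of_reliability(reliability: str) -> int:
    """Count the consecutive 9s at the start of a decimal reliability, e.g. "0.9995"."""
    value = Decimal(reliability)
    if not (0 <= value < 1):
        raise ValueError("reliability must be a decimal in [0, 1)")
    text = format(value, "f")  # plain notation, never an exponent
    fractional = text.split(".")[1] if "." in text else ""
    count = 0
    for digit in fractional:
        if digit != "9":
            break
        count += 1
    return count


def expected_uses_until_failure(reliability: str) -> Decimal:
    """Mean number of uses up to the first failure (assumed geometric model)."""
    failure_probability = 1 - Decimal(reliability)
    return 1 / failure_probability


if __name__ == "__main__":
    for r in ("0.9", "0.99", "0.999"):
        print(r, nines_of_reliability(r), int(expected_uses_until_failure(r)))
```

Under that assumption, moving from 0.99 to 0.999 raises the expected number of uses before a failure from 100 to 1,000, which matches the tenfold scaling in the second quote.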

The eight safe-design principles in the lesson body and cheatsheet are paraphrased from the chapter’s structure; the chapter-example pairings (suspension-bridge cables for redundancy, cockpit crew protocols for separation of duties, etc.) follow Hendrycks’ own pairings in §4.4.

Same posture as L1 through L4: the CAIS textbook is © 2026 Center for AI Safety, published by Taylor & Francis, and free to read online, but it carries no explicit Creative Commons or reuse license. This lesson is therefore a structural mirror: verbatim quotes are anchored to specific chapter sections and kept within fair-use limits, the textbook is linked rather than embedded, and no derivative material goes beyond fair-use snippets.

The readings below are not required for L5; they are the cross-disciplinary literature Hendrycks reaches into in Chapter 4.

  • Normal Accident Theory: Charles Perrow, Normal Accidents: Living with High-Risk Technologies (1984; revised 1999). The foundational text on why complex, tightly-coupled systems produce accidents that are statistically inevitable rather than preventable through better engineering alone. Perrow’s framing is the explicit ancestor of L6 (complex systems); reading him before L6 makes the chapter feel less like new vocabulary.
  • High Reliability Organizations: Karl Weick and Kathleen Sutcliffe, Managing the Unexpected: Sustained Performance in a Complex World (3rd ed., 2015; first published 2001). The canonical treatment of HROs (nuclear plants, aircraft carriers, air traffic control) and the operational practices they share. Hendrycks references HROs implicitly in §4.6 (Systemic Factors); Weick and Sutcliffe are the source.
  • Tail Risk and Black Swans: Nassim Nicholas Taleb, The Black Swan (2007). The popular-press anchor for the long-tailed-distribution framing the chapter uses in §4.7. Taleb's prescriptive recommendations are contested, but the descriptive contrast between long-tailed and thin-tailed distributions is widely accepted and is what the chapter borrows.
  • Defense in Depth in Computer Security: the OWASP and NIST treatments of defense in depth as a layered-security principle are the source for the principle’s translation into software systems; the same logic transfers to AI deployment stacks. NIST SP 800-53 is the bureaucratic-but-comprehensive reference at csrc.nist.gov/publications.
  • FMEA (Failure Mode and Effects Analysis): the SAE J1739 standard is the canonical industrial reference for FMEA practice; for a more accessible introduction, the Iowa State Center for Excellence in Logistics and Distribution has an open-access primer. FMEA is mentioned briefly in §4.5 (Component Failure Accident Models and Methods) and is the right next tool to learn after the eight design principles.
  • Swiss Cheese Model: James Reason, Human Error (1990), and Managing the Risks of Organizational Accidents (1997). Reason is the source of the Swiss-cheese model the lesson and cheatsheet rely on. His framing is healthcare-flavored but generalizes; medical-error literature has decades of worked applications.
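The long-tailed framing in the Taleb entry, and the §4.7 claim that rare extreme events can dominate expected impact, can be illustrated with a small piece of arithmetic. The scenario names, probabilities, and loss figures below are entirely hypothetical and chosen only to make the effect visible.

```python
# Hypothetical risk profile: (scenario, probability, loss in arbitrary units).
# All numbers are illustrative assumptions, not data from the textbook.
SCENARIOS = [
    ("minor incident", 0.900, 1.0),
    ("serious incident", 0.099, 10.0),
    ("catastrophe", 0.001, 100_000.0),
]


def expected_loss(scenarios):
    """Probability-weighted loss summed over all scenarios."""
    return sum(prob * loss for _, prob, loss in scenarios)


def share_of_expected_loss(scenarios, name):
    """Fraction of the total expected loss contributed by one named scenario."""
    total = expected_loss(scenarios)
    part = sum(prob * loss for label, prob, loss in scenarios if label == name)
    return part / total


if __name__ == "__main__":
    print(expected_loss(SCENARIOS))
    print(share_of_expected_loss(SCENARIOS, "catastrophe"))
```

In this made-up profile the catastrophe occurs once in a thousand cases yet accounts for more than 98% of the expected loss, so an estimate built only from the common cases would miss almost all of the risk.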

L6 enters Chapter 5 (Complex Systems) and addresses the failure mode that L5 acknowledges but does not work through in detail: systems built from correct components can still fail at the system level because of interactions that component-level reasoning does not capture. The Perrow reading is the on-ramp; L6 inverts the L5 Swiss-cheese composition by asking what happens when the layers stop being independent. Phase 2 closes at L6; Phase 3 (ethics and governance) opens at L7.
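The question of what happens when the layers stop being independent can be sketched numerically. The model below is an assumption made for illustration, not one taken from the textbook or from Reason: each layer misses a hazard with some probability, and a hypothetical shared cause occasionally defeats every layer at once. All function names and numbers are illustrative.

```python
import math


def breach_probability_independent(miss_probs):
    """Chance a hazard passes every layer when the layers fail independently."""
    return math.prod(miss_probs)


def breach_probability_common_cause(miss_probs, common_cause_prob):
    """Chance of a breach when a shared cause can defeat all layers together.

    Assumed model: with probability common_cause_prob every layer fails at
    once; otherwise the layers fail independently as above.
    """
    independent = math.prod(miss_probs)
    return common_cause_prob + (1 - common_cause_prob) * independent


if __name__ == "__main__":
    layers = [0.1, 0.1, 0.1]  # three layers, each missing 10% of hazards
    print(breach_probability_independent(layers))
    print(breach_probability_common_cause(layers, 0.01))
```

With three independent layers that each miss 10% of hazards, the breach probability is about 0.001. Adding a 1% chance of a shared failure raises it to roughly 0.011, so the shared cause, rather than the individual layers, sets the floor on overall risk.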