References: complex systems and emergent risk

Primary source

Dan Hendrycks. Introduction to AI Safety, Ethics, and Society. Taylor & Francis, 2024. Center for AI Safety, free to read at aisafetybook.com. L6 draws from Chapter 5 (Complex Systems), specifically Section 5.2 (Introduction to Complex Systems) and Section 5.3 (Complex Systems for AI Safety).

Chapter section	Topic	URL
Ch 5.2	Introduction to Complex Systems	aisafetybook.com/textbook/introduction-to-complex-systems
Ch 5.3	Complex Systems for AI Safety	aisafetybook.com/textbook/complex-systems-for-ai-safety

Attribution posture for L6

This lesson uses framework-level attribution to Hendrycks Chapter 5 throughout. No verbatim quotes from Chapter 5 appear in the lesson body: at draft time the chapter text could not be reliably retrieved for verbatim verification under the structural-mirror posture and the A1 verbatim-discipline standard. The lesson’s claims about the chapter’s framing are instead paraphrased from the chapter’s structure and anchored against the Perrow normal-accident-theory lineage that the chapter explicitly draws on. Those claims cover the four complex-systems properties, the argument that AI systems are tightly coupled to their environment, the multi-agent-emergence and emergent-capabilities arguments, and the chapter’s recommendation that component-level engineering is necessary but not sufficient.

If a future revision of this lesson can retrieve and verify verbatim text from Chapter 5, those quotes should be inserted with section anchors, and the framework-level paraphrases should be tightened to match.

Posture and license

Same posture as L1 through L5: the CAIS textbook is © 2026 Center for AI Safety, published by Taylor & Francis, free to read online with no explicit Creative Commons or reuse license. This lesson is a structural mirror with framework-level attribution to the chapter; no embedded text from the source, no derivative-quote runs beyond fair-use snippets.

The literature Hendrycks Chapter 5 draws on

These are the foundational works that the chapter’s framing inherits from. Reading any of them deepens the lesson; none are required for L6.

Charles Perrow, Normal Accidents: Living with High-Risk Technologies (Basic Books, 1984; revised edition Princeton University Press, 1999). The foundational text for the framing this lesson uses. Perrow argued that in tightly coupled, interactively complex systems (nuclear plants, petrochemical plants, marine transport, air traffic, dams), accidents arise from system structure rather than from component bugs or operator error. The book worked the framing through case studies of Three Mile Island, marine collisions, dam failures, and recombinant DNA research. The revised edition added Bhopal and Chernobyl. The book is dated in specifics, but the framing has not been improved on. Available widely in libraries.
Karl E. Weick and Kathleen M. Sutcliffe, Managing the Unexpected: Sustained Performance in a Complex World (Jossey-Bass, 3rd edition 2015). The canonical treatment of High Reliability Organizations (HROs), the inverse of Perrow’s framework: how some organizations operating in tightly coupled complex systems (nuclear-powered aircraft carriers, air-traffic-control facilities) achieve very low accident rates by adopting specific operational practices. The L5 cheatsheet mentioned HROs; Weick and Sutcliffe are the source. Useful for the antifragility design principle.
Sidney Dekker, Drift into Failure: From Hunting Broken Components to Understanding Complex Systems (CRC Press, 2011). A more recent treatment that updates Perrow’s framework with explicit attention to how organizations drift into accident-prone configurations over time without anyone noticing. The “drift” framing is particularly useful for the feedback-loop discussion in this lesson, because organizational drift is itself a feedback-loop phenomenon.
Yaneer Bar-Yam, Making Things Work: Solving Complex Problems in a Complex World (NECSI Knowledge Press, 2004). A broader treatment of complex-systems thinking applied to policy and design problems. Less safety-engineering-flavored than Perrow but useful for the emergence and nonlinearity properties.

Multi-agent AI literature (foreshadowing L8)

L8 will cover multi-agent dynamics at full depth, drawing on Hendrycks Chapter 7 (Collective Action Problems). These are entry points if you want to read ahead.

Allan Dafoe et al., “Cooperative AI: machines must learn to find common ground” (Nature 2021), at nature.com/articles/d41586-021-01170-0. A short commentary that frames the cooperative-AI research agenda. Useful as a primer for the agenda L8 builds on.
Robert Axelrod, The Evolution of Cooperation (Basic Books, 1984; revised edition 2006). The foundational text on the iterated prisoner’s dilemma and the emergence of cooperative strategies. It predates modern AI, but the framing transfers directly to multi-agent AI systems.
The MARL (Multi-Agent Reinforcement Learning) literature has grown substantially since 2020. Vinitsky, Sukhbaatar, and colleagues have produced a series of papers on emergent multi-agent behavior in MARL settings. A common entry point is the Multi-Agent Particle Environments line, which was introduced alongside Lowe et al. (2017). The L8 references file will return to this in more detail.

What L7 builds on from here

L7 enters Chapter 6 (Beneficial AI and Machine Ethics) and turns from “what fails” to “what are we trying to do?” The complex-systems framing carries forward: once you take seriously that the operator population has heterogeneous values and that those values themselves shift under feedback from deployed systems, the ethics question becomes a multi-stakeholder coordination problem rather than a single-designer specification problem. L8 takes the multi-agent dynamics at full depth; L9 brings governance. Phase 3 closes the track.