Complex systems and emergent risk: cheatsheet

The four properties of complex systems

Property	What it means	Non-AI example	AI example
Emergence	System has properties no component does	Markets discover prices, no individual trader does	Networks represent concepts, no neuron does; multi-agent deployments produce population-level dynamics no model does
Nonlinearity	Small input changes produce large output changes in analytically intractable ways	Weather: hundredth-percent perturbation produces hurricane or nothing	Emergent capabilities: smooth scaling laws do not predict discontinuous thresholds
Feedback loops	Outputs feed back as inputs; mix of stabilizing and amplifying	Thermostat (stabilizing); microphone-speaker squeal (amplifying); markets (both, different timescales)	Recommendation systems shape preferences which become training data; model-generated content trains future models
Tight coupling	State of one part constrains others on sub-human-intervention timescales	Power grid: local failure propagates across regions within seconds	15-second shared-state stores; auto-trading at microsecond timescales; cascading failures across AI pipelines

The four are not a checklist; a system can have them in different combinations and degrees. Naming them surfaces failure modes that component-level engineering vocabulary does not.
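
The stabilizing versus amplifying distinction in the feedback-loops row can be made concrete with a toy linear model. This is a minimal sketch, not a model of any real system: it assumes each step feeds a fixed multiple of the current deviation back in. The function name `simulate_loop` and its parameters are hypothetical, chosen for illustration.

```python
def simulate_loop(gain: float, x0: float = 1.0, steps: int = 20) -> list[float]:
    """Iterate x_{t+1} = gain * x_t, a toy loop that feeds output back as input.

    |gain| < 1: deviations shrink (stabilizing, thermostat-like).
    |gain| > 1: deviations grow (amplifying, microphone-squeal-like).
    """
    xs = [x0]
    for _ in range(steps):
        xs.append(gain * xs[-1])
    return xs


stabilizing = simulate_loop(gain=0.5)  # assumed illustrative gain
amplifying = simulate_loop(gain=1.5)   # assumed illustrative gain

print(f"gain=0.5 after 20 steps: {stabilizing[-1]:.2e}")
print(f"gain=1.5 after 20 steps: {amplifying[-1]:.2e}")
```

Real feedback loops are nonlinear, delayed, and mixed (the markets example has both kinds on different timescales), so the sketch only shows why loop gain is the quantity to ask about.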

Normal accident vs preventable engineering failure (Perrow framework)

	Preventable engineering failure	Normal accident
Cause	Component bug, operator error, design defect that better engineering would catch	System structure: tight coupling + interactive complexity make the accident class statistically inevitable
Component-level fix	Sufficient	Necessary but not sufficient
System-level fix	Optional	Required
Example	Ariane 5 inertial-reference overflow (component bug, AND a certification-process gap; partially normal)	Flash Crash 2010 (no component failed; structure produced failure); Northeast blackout 2003; Three Mile Island 1979
Recurrence prevention	Fix the specific bug	Change the structure: loosen coupling, break amplifying feedback loops, introduce circuit breakers

Perrow: in tightly-coupled interactively-complex systems, no amount of component-level engineering can drive accident rates to zero.
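
One of the structural changes named in the table above, a circuit breaker, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the breaker trips on consecutive failures only, has no timer, and is reset manually. The class `CircuitBreaker`, the stage `failing_stage`, and the threshold are hypothetical names and values, not taken from any library.

```python
class CircuitBreaker:
    """Stops forwarding calls downstream after too many consecutive failures."""

    def __init__(self, max_failures: int = 3) -> None:
        self.max_failures = max_failures
        self.consecutive_failures = 0
        self.open = False  # open = tripped: calls are rejected, not forwarded

    def call(self, fn, *args, **kwargs):
        if self.open:
            raise RuntimeError("circuit open: downstream call blocked")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.max_failures:
                self.open = True  # decouple: stop propagating load and errors
            raise
        self.consecutive_failures = 0
        return result

    def reset(self) -> None:
        """Manual reset, e.g. after a human has reviewed the incident."""
        self.consecutive_failures = 0
        self.open = False


reached = []  # records every call that actually hit the downstream stage


def failing_stage(x):
    """Hypothetical downstream stage that always fails."""
    reached.append(x)
    raise ValueError("downstream failure")


breaker = CircuitBreaker(max_failures=2)
for i in range(5):
    try:
        breaker.call(failing_stage, i)
    except (ValueError, RuntimeError):
        pass

print(f"calls attempted: 5, calls that reached downstream: {len(reached)}")
```

The point of the design is that the breaker changes the structure rather than fixing the failing stage: once tripped, the failure no longer propagates at machine speed, which buys time for human intervention.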

Why L5’s Swiss-cheese composition breaks in real deployments

Failure mode for independence	What happens	Operational fix
Shared blind spots	Training and deployment use the same eval framework descended from the same team’s assumptions; both miss what the team did not think to test	Diverse eval methodologies; external red-teamers with different assumptions
Correlated failure modes	Multiple monitoring systems read the same logs; a log-pipeline failure takes them all down at once	Independent measurement infrastructure; diverse signal sources
Adversarial pressure	An attacker defeats successive layers with the same technique	Defenses based on different principles, not different implementations of the same principle

Operational rule: more layers do not help when the existing layers are correlated. Independence is the bottleneck.
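
The operational rule can be checked with a small common-cause calculation. This is a sketch under an assumed model, not a measured result: with probability `p_shared` a shared blind spot defeats every layer at once, and otherwise each layer misses independently with probability `p_miss`. The function name and the numbers 0.1 and 0.05 are illustrative assumptions.

```python
def p_all_layers_miss(p_miss: float, n_layers: int, p_shared: float = 0.0) -> float:
    """P(every layer misses) under a simple common-cause model.

    With probability p_shared, a shared blind spot defeats all layers at once;
    otherwise each layer misses independently with probability p_miss.
    """
    return p_shared + (1.0 - p_shared) * p_miss ** n_layers


for n in (3, 6):
    independent = p_all_layers_miss(0.1, n)
    correlated = p_all_layers_miss(0.1, n, p_shared=0.05)
    print(f"{n} layers: independent={independent:.6f}, correlated={correlated:.6f}")
```

Under independence, doubling the layers drives the miss probability down by orders of magnitude. With a shared cause, the result can never fall below `p_shared`, so extra layers barely move it; reducing the correlation is the only lever left.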

AI-specific complex-systems patterns

Pattern	Mechanism	Where it lives
Tight coupling to environment	Deployed models shape the data, operator practices, user expectations, and regulatory frameworks that the next generation of models inherits	Recommendation systems, deployed agents in production
Multi-agent emergence	Population of AI systems exhibits behavior individual systems do not	Algorithmic markets, autonomous-vehicle traffic, multi-model web content
Emergent capabilities	Capabilities appear discontinuously at certain scales	Large language model scaling thresholds
Model monoculture	Shared underlying base model means correlated failures across products	Foundation models licensed widely; LLM API ecosystems

The L6 capability (four-step protocol)

For a deployed AI system you care about:

1. Name the four complex-systems properties. Identify emergence, nonlinearity, feedback loops, and tight coupling in the specific deployment (or say which are absent and why).
2. Distinguish normal-accidents class from preventable-failure class. Which failure modes would a component-level fix resolve? Which are structural and require system-level changes?
3. Recognize Swiss-cheese-independence failures. For your current safety stack, identify which layers share blind spots, share infrastructure, or face adversaries that defeat all of them simultaneously.
4. Propose system-structure design changes. Two changes that reduce complex-systems risk without addressing any component-level bug. Target tight coupling, feedback dynamics, or interaction-level failure modes.
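
The third step above (finding layers that share infrastructure or blind spots) can be started with a simple dependency audit. This is a minimal sketch: the safety stack, layer names, and dependency labels below are all hypothetical placeholders, and a real audit would need dependencies gathered from the actual deployment.

```python
from itertools import combinations

# Hypothetical safety stack: layer name -> what that layer depends on.
safety_stack = {
    "pre-deployment evals": {"eval framework A", "team assumptions"},
    "runtime monitor": {"log pipeline", "base model X"},
    "anomaly alerting": {"log pipeline"},
    "external red team": {"independent tooling"},
}


def shared_dependencies(stack):
    """Return {(layer_a, layer_b): shared deps} for every pair that overlaps."""
    overlaps = {}
    for (a, deps_a), (b, deps_b) in combinations(stack.items(), 2):
        common = deps_a & deps_b
        if common:
            overlaps[(a, b)] = common
    return overlaps


for pair, common in shared_dependencies(safety_stack).items():
    print(pair, "share", sorted(common))
```

Any pair the audit reports is a candidate for correlated failure; an empty result only means no overlap among the dependencies that were written down, not that the layers are independent.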

Quick disambiguation cheatsheet

When the question is	The L6 framing is usually
“Why did the system fail when each component worked?”	Normal accident / interaction-level failure
“Will more evals catch this?”	Probably not, if the holes are correlated
“Why does this surprise us?”	Likely an emergent property the component analysis missed
“Why does the same attack defeat multiple mitigations?”	Layered-defenses-not-independent
“Why does the next model’s training distribution look like the previous one’s outputs?”	Feedback loop
“Why does a small change produce a big effect?”	Nonlinearity
“Why is the population behavior different from any individual model’s behavior?”	Emergence
“Why can’t I patch this with one fix?”	The risk is in the system structure, not a component

Cross-track and within-track pointers

L3 + L4 + L5 (the rest of Phase 2): L6 closes the phase by inverting L5’s independence assumption. The Phase 2 picture is now complete: failures (L3), substrate (L4), engineering toolkit (L5), system-structure constraints on the toolkit (L6).
L7 (ethics): opens Phase 3. The question shifts from “what fails” to “what are we trying to do,” which is itself a complex-systems question once you take seriously that the operator population has heterogeneous values.
L8 (collective action, Ch 7): takes the multi-agent dynamics L6 previewed and works them at full depth (game theory, cooperation, conflict, evolutionary pressures).
L9 (governance, Ch 8): introduces governance as the layer outside any individual deployment, the layer that addresses model-monoculture risk and the coordination-instrument levers from L2’s AI-race bucket.