Cheatsheet: complex systems and emergent risk

| Property | What it means | Non-AI example | AI example |
| --- | --- | --- | --- |
| Emergence | System has properties no component does | Markets discover prices, no individual trader does | Networks represent concepts, no neuron does; multi-agent deployments produce population-level dynamics no model does |
| Nonlinearity | Small input changes produce large output changes in analytically intractable ways | Weather: hundredth-percent perturbation produces hurricane or nothing | Emergent capabilities: smooth scaling laws do not predict discontinuous thresholds |
| Feedback loops | Outputs feed back as inputs; mix of stabilizing and amplifying | Thermostat (stabilizing); microphone-speaker squeal (amplifying); markets (both, different timescales) | Recommendation systems shape preferences which become training data; model-generated content trains future models |
| Tight coupling | State of one part constrains others on sub-human-intervention timescales | Power grid: local failure propagates across regions within seconds | 15-second shared-state stores; auto-trading at microsecond timescales; cascading failures across AI pipelines |

The four are not a checklist; a system can have them in different combinations and degrees. Naming them surfaces failure modes that component-level engineering vocabulary does not.
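
The stabilizing/amplifying distinction in the feedback-loops row can be made concrete with a toy linear loop. This is a minimal sketch, not a model of any real system: `run_feedback` and `loop_gain` are illustrative names, and the assumption is that each pass through the loop multiplies the deviation from a set point by a fixed gain.

```python
def run_feedback(loop_gain: float, initial_deviation: float = 1.0, steps: int = 20) -> list[float]:
    """Iterate d[t+1] = loop_gain * d[t], where d is the deviation from a set point."""
    deviations = [initial_deviation]
    for _ in range(steps):
        # The output of one pass becomes the input of the next.
        deviations.append(loop_gain * deviations[-1])
    return deviations


# Thermostat-like loop (assumed gain 0.5): each pass halves the deviation.
stabilizing = run_feedback(loop_gain=0.5)
# Squeal-like loop (assumed gain 1.5): each pass grows the deviation by 50%.
amplifying = run_feedback(loop_gain=1.5)

print(f"stabilizing after 20 passes: {stabilizing[-1]:.2e}")
print(f"amplifying after 20 passes:  {amplifying[-1]:.2e}")
```

A gain below 1 in magnitude damps the deviation and a gain above 1 amplifies it. Systems such as markets or recommendation loops mix both at different timescales, which this single-gain sketch does not capture.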

Normal accident vs preventable engineering failure (Perrow framework)

| | Preventable engineering failure | Normal accident |
| --- | --- | --- |
| Cause | Component bug, operator error, design defect that better engineering would catch | System structure: tight coupling plus interactive complexity make the accident class statistically inevitable |
| Component-level fix | Sufficient | Necessary but not sufficient |
| System-level fix | Optional | Required |
| Example | Ariane 5 inertial-reference overflow (a component bug and a certification-process gap; partially a normal accident) | Flash Crash 2010 (no component failed; structure produced failure); Northeast blackout 2003; Three Mile Island 1979 |
| Recurrence prevention | Fix the specific bug | Change the structure: loosen coupling, break feedback loops, introduce circuit breakers |

Perrow: in tightly coupled, interactively complex systems, no amount of component-level engineering can drive accident rates to zero.
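
A toy cascade can illustrate why a structural fix differs from a component fix. The sketch below is an assumption-laden illustration, not Perrow's model: `cascade`, `shed_load`, the ring topology, and the load and capacity numbers are all hypothetical. Every node starts within capacity, so no component is defective; whether one failure spreads depends only on the structure.

```python
def cascade(loads, capacity, first_failure, shed_load=False):
    """Return the set of failed node indices after one initial failure.

    Nodes sit on a ring. A failed node's load is split evenly between the
    nearest surviving node on each side (tight coupling), unless shed_load
    is True, in which case the load is dropped (a crude circuit breaker).
    """
    n = len(loads)
    loads = list(loads)  # work on a copy
    failed = set()
    queue = [first_failure]
    while queue:
        node = queue.pop(0)
        if node in failed:
            continue
        failed.add(node)
        orphaned, loads[node] = loads[node], 0.0
        if shed_load or len(failed) == n:
            continue  # breaker tripped, or nobody left to take the load
        for step in (1, -1):
            j = (node + step) % n
            while j in failed:  # skip over nodes that are already down
                j = (j + step) % n
            loads[j] += orphaned / 2
            if loads[j] > capacity:
                queue.append(j)
    return failed


# Ten healthy nodes at 80% of capacity; node 0 is taken offline.
tight = cascade([0.8] * 10, capacity=1.0, first_failure=0)
with_breaker = cascade([0.8] * 10, capacity=1.0, first_failure=0, shed_load=True)
# Same coupling, but more slack: nodes run at 40% of capacity.
with_slack = cascade([0.4] * 10, capacity=1.0, first_failure=0)

print(len(tight), len(with_breaker), len(with_slack))
```

Under these assumed numbers the tightly coupled ring loses every node, while either the breaker or the extra slack contains the failure to the node that was taken offline. Neither fix touches any individual node's implementation.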

Why L5’s Swiss-cheese composition breaks in real deployments

| Failure mode for independence | What happens | Operational fix |
| --- | --- | --- |
| Shared blind spots | Training and deployment use the same eval framework descended from the same team’s assumptions; both miss what the team did not think to test | Diverse eval methodologies; external red-teamers with different assumptions |
| Correlated failure modes | Multiple monitoring systems read the same logs; a log-pipeline failure takes them all down at once | Independent measurement infrastructure; diverse signal sources |
| Adversarial pressure | An attacker defeats successive layers with the same technique | Defenses based on different principles, not different implementations of the same principle |

Operational rule: more layers do not help when the existing layers are correlated. Independence is the bottleneck.
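
The operational rule can be checked with a little arithmetic. The sketch below assumes a simple mixture model that is not from the source: a fraction `shared_blind_spot` of failures is missed by every layer, and outside that fraction layers miss independently, with the per-layer miss rate held fixed at `p_miss`. The function name and all numbers are illustrative.

```python
def p_all_layers_miss(p_miss: float, layers: int, shared_blind_spot: float = 0.0) -> float:
    """Probability that every layer misses a failure.

    Assumed model: with probability `shared_blind_spot` the failure lies in a
    hole common to all layers. Otherwise each layer misses independently with
    probability q, chosen so the marginal miss rate of one layer is `p_miss`.
    """
    if not 0.0 <= shared_blind_spot <= p_miss <= 1.0:
        raise ValueError("need 0 <= shared_blind_spot <= p_miss <= 1")
    if shared_blind_spot == 1.0:
        return 1.0
    q = (p_miss - shared_blind_spot) / (1.0 - shared_blind_spot)
    return shared_blind_spot + (1.0 - shared_blind_spot) * q ** layers


for k in (1, 3, 10):
    independent = p_all_layers_miss(0.1, k)
    correlated = p_all_layers_miss(0.1, k, shared_blind_spot=0.05)
    print(f"{k:2d} layers  independent={independent:.2e}  correlated={correlated:.2e}")
```

With fully independent layers the joint miss rate falls as `p_miss ** layers`. With a shared blind spot it can never fall below `shared_blind_spot`, however many layers are stacked, which is the sense in which independence is the bottleneck.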

| Pattern | Mechanism | Where it lives |
| --- | --- | --- |
| Tight coupling to environment | Models affect data, operator practices, user expectations, regulatory frameworks for the next generation | Recommendation systems, deployed agents in production |
| Multi-agent emergence | Population of AI systems exhibits behavior individual systems do not | Algorithmic markets, autonomous-vehicle traffic, multi-model web content |
| Emergent capabilities | Capabilities appear at certain scales discontinuously | Large language model scaling thresholds |
| Model monoculture | Shared underlying base model means correlated failures across products | Foundation models licensed widely; LLM API ecosystems |

For a deployed AI system you care about:

  1. Name the four complex-systems properties. Identify emergence, nonlinearity, feedback loops, and tight coupling in the specific deployment (or say which are absent and why).
  2. Distinguish normal-accidents class from preventable-failure class. Which failure modes would a component-level fix resolve? Which are structural and require system-level changes?
  3. Recognize Swiss-cheese-independence failures. For your current safety stack, identify which layers share blind spots, share infrastructure, or face adversaries that defeat all of them simultaneously.
  4. Propose system-structure design changes. Two changes that reduce complex-systems risk without addressing any component-level bug. Target tight coupling, feedback dynamics, or interaction-level failure modes.
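
Step 3 of the list above can start as a simple inventory. The sketch below is one possible way to do it, assuming you can list what each safety layer depends on; `shared_dependencies` and every layer and dependency name in `stack` are hypothetical placeholders, not part of any real tool.

```python
from collections import defaultdict


def shared_dependencies(stack: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each dependency used by two or more layers to the layers that use it."""
    users = defaultdict(list)
    for layer, deps in stack.items():
        for dep in deps:
            users[dep].append(layer)
    return {dep: sorted(layers) for dep, layers in users.items() if len(layers) > 1}


# Hypothetical safety stack: layer name -> things it relies on.
stack = {
    "pre-deployment evals": {"eval framework A", "base model X"},
    "runtime monitor": {"log pipeline", "base model X"},
    "anomaly alerts": {"log pipeline"},
    "external red team": {"independent tooling"},
}

overlaps = shared_dependencies(stack)
for dep, layers in sorted(overlaps.items()):
    print(f"{dep}: shared by {', '.join(layers)}")
```

Any dependency that shows up here is a candidate correlated hole: a single failure or blind spot in it affects every layer listed. The inventory only surfaces shared infrastructure you already know about; shared assumptions still need external review.
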

| When the question is | The L6 framing is usually |
| --- | --- |
| “Why did the system fail when each component worked?” | Normal accident / interaction-level failure |
| “Will more eval catch this?” | Probably not, if the holes are correlated |
| “Why does this surprise us?” | Likely an emergent property the component analysis missed |
| “Why does the same attack defeat multiple mitigations?” | Layered defenses are not independent |
| “Why does the next model’s training distribution look like the previous one’s outputs?” | Feedback loop |
| “Why does a small change produce a big effect?” | Nonlinearity |
| “Why is the population behavior different from any individual model’s behavior?” | Emergence |
| “Why can’t I patch this with one fix?” | The risk is in the system structure, not a component |

  • L3 + L4 + L5 (the rest of Phase 2): L6 closes the phase by inverting L5’s independence assumption. The Phase 2 picture is now complete: failures (L3), substrate (L4), engineering toolkit (L5), system-structure constraints on the toolkit (L6).
  • L7 (ethics): opens Phase 3. The question shifts from “what fails” to “what are we trying to do,” which is itself a complex-systems question once you take seriously that the operator population has heterogeneous values.
  • L8 (collective action, Ch 7): takes the multi-agent dynamics L6 previewed and works them at full depth (game theory, cooperation, conflict, evolutionary pressures).
  • L9 (governance, Ch 8): brings governance as the layer outside any individual deployment that addresses model-monoculture risk and the coordination-instrument levers from L2’s AI-race bucket.