AI governance: the policy layer above any individual deployment

Phase 3 has been building the policy and coordination layer above the deployment-time safety case that Phase 2 worked. L7 named the value-loading problem and the moral parliament; L8 supplied the formal vocabulary of collective action and the institutional-mechanism response; L9 takes the institutional mechanism and asks who designs it, who enforces it, and at what scale. The answer is the governance layer, and Hendrycks Chapter 8 works the question.

The chapter organizes the discussion around a four-layer taxonomy: corporate governance, national governance, international governance, and compute governance. Each layer operates at a different scope; each has its own mechanisms, its own enforcement instruments, its own characteristic failure modes. The L9 capability is operational: given a real-world governance proposal (an EU AI Act article, a US executive order section, a compute-cap proposal, a corporate Responsible Scaling Policy), situate it inside the taxonomy and defend the placement.

The Phase 0 §6 register applies most strongly here, because the topic invites normative framing more than any other in the track. The lesson presents Hendrycks’ four layers and the mechanisms within them as the chapter develops them; it does not advocate for any specific governance proposal or framework. Claims are attributed; the reader does the value-loading.

Each layer in the taxonomy is the answer to a different question.

  • Corporate governance answers: what does an individual organization building AI commit to, how does it commit, who internally enforces the commitment?
  • National governance answers: what does a sovereign jurisdiction require of AI organizations operating inside its borders, through what regulatory instruments, with what enforcement teeth?
  • International governance answers: how do sovereign jurisdictions coordinate when AI risks cross borders and unilateral action is insufficient?
  • Compute governance answers: how does the supply chain for the physical resource AI development depends on (chips, data centers, electricity) get governed?

The layers are not strictly hierarchical. A corporate policy commitment can be stronger or weaker than the national regulation that nominally constrains it; international coordination can outrun what any individual nation has enacted; compute governance cuts across all three because the chip-and-datacenter supply chain involves corporate operators, national-jurisdiction enforcement, and international supply-chain dependencies. The taxonomy is operationally useful precisely because real proposals usually involve multiple layers, and naming the layers makes the multi-layer structure visible.

The first layer of governance is what AI organizations commit to themselves and what their internal structures enforce. The current state of the practice includes several recognizable mechanisms.

Responsible Scaling Policies (RSPs) and analogous corporate commitments are the most visible recent development. The general shape: the organization commits to specific safety measures (capability evaluations, red-teaming, deployment thresholds, kill-switch protocols) that scale with model capability. The organization publishes the policy; deviations from the policy are visible to external observers; the policy is supposed to bind the organization in the same way a public commitment binds any actor whose reputation depends on honoring it.

Internal safety teams and board oversight. Many AI organizations have established internal safety, alignment, or responsible-AI teams; some have established board-level committees with explicit safety mandates. The mechanism is organizational: certain decisions are routed through the safety function before deployment; the safety function has authority (in principle) to block deployment.

Model cards and capability disclosures. These are public artifacts that document what a deployed model can do, what it was trained on, and what failure modes have been catalogued. They operationalize the transparency principle from L5’s safe-design list for AI deployment.

The structural limit of corporate governance is the same one collective-action analysis (L8) named: corporate commitments are unilateral, and a unilateral commitment can be undercut by competitor behavior. The race-to-the-bottom failure mode from L8 lives here directly. Corporate governance is necessary but not sufficient; it solves the within-organization piece and leaves the between-organization piece for the next layers.

The second layer is regulation by sovereign jurisdictions. The current state includes binding regulation (the EU AI Act, China’s generative AI regulations, several US state laws), executive orders (the 2023 US executive order on AI, subsequent revisions), and voluntary frameworks that become de facto binding through procurement and certification pressure (the NIST AI Risk Management Framework, the UK AI Safety Institute’s evaluation standards).

The mechanisms operate through several familiar regulatory instruments: mandatory pre-deployment evaluation (the EU AI Act’s high-risk categorization), incident reporting requirements, licensing regimes for systems above capability thresholds, and liability rules that attach to producers and deployers. The instruments are not new to law; they are adaptations of regulatory tools developed for medical devices, financial services, environmental regulation, and other high-stakes domains.

The structural limit of national governance is the jurisdictional one. A binding regulation in jurisdiction A does not constrain a developer in jurisdiction B that does not adopt similar regulation. The AI-race failure mode from L8 reappears at the national-jurisdiction layer: nations have incentive to attract AI development by maintaining lighter regulation, which constrains how strong any individual nation’s regulation can be without coordination. National governance addresses the within-jurisdiction piece and leaves the between-jurisdiction piece for international governance.

The third layer is coordination across sovereign jurisdictions. Hendrycks frames the layer directly: “international cooperation is important in order to manage risks from AI” (Hendrycks, CAIS, 2024, §8.6). The chapter draws explicit parallels to nuclear-weapons governance, noting that both AI and nuclear weapons are “offense-dominant” in the sense that individual deviations from agreements pose significant risks and both have identifiable supply-chain chokepoints (uranium for nuclear weapons, computing power for AI systems).

The mechanisms span the standard repertoire of international coordination: unilateral national commitments that become reciprocal, bilateral and multilateral negotiations, summits and forums for norm-development, formal treaty agreements, and dedicated international organizations modeled on the IAEA. The chapter also illustrates certification as a regulatory tool through the aviation analogy: “domestic regulators must have certain verification procedures” to maintain international airspace access, and the certification regime gives nations market-based incentives to maintain compliance even without direct international enforcement.

The chapter is honest about the verification asymmetry that AI inherits from the nuclear precedent. Nuclear weapons use is readily detectable, yet successful development is difficult to confirm; the same asymmetry applies to AI, possibly more strongly. An international treaty that constrains AI development capability is only as enforceable as the verification regime that detects violations. The chapter does not pretend the verification problem is solved; it names it as the open governance research question.

The fourth layer is the most recent addition to the field’s governance vocabulary and the one Hendrycks foregrounds. The argument: “Compute is indispensable for developing and deploying AIs. Restricting access to compute allows control over what AIs are created and used” (Hendrycks, CAIS, 2024, §8.7). The strategic case for compute as the lever rests on three properties: “Compute is physical, excludable, and quantifiable which allows it to be tracked, restricted, and measured” (Hendrycks, CAIS, 2024, §8.7).

The three properties matter operationally.

Physical. Compute is not an abstract resource. It lives in chips, in data centers, in physical supply chains with identifiable factories and shipping routes. This is what makes compute governance technically possible in ways that algorithm regulation or data regulation are not; you cannot regulate a software equation, but you can regulate a chip-fabrication facility.

Excludable. Compute can be restricted at the supplier-customer interface. A chip-fabrication facility can refuse to ship to specific buyers; a cloud provider can refuse to serve specific accounts; an export-control regime can restrict cross-border shipment of specific compute hardware. The excludability operates at every layer of the supply chain.

Quantifiable. Compute usage is measurable in standard units: total training compute is counted in floating-point operations (FLOP), and hardware throughput is rated in floating-point operations per second (FLOP/s). The quantifiability matters because regulatory thresholds need measurement bases; a regulation that says “models trained above 10^26 FLOPs require additional safety evaluation” (where “FLOPs” means total operations, not a per-second rate) is enforceable in a way that a regulation phrased around capability is not.
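To make the measurement point concrete, here is a minimal sketch of how a compute threshold could be checked. It assumes the widely used rule of thumb that training a dense transformer costs roughly 6 FLOP per parameter per training token; that approximation is not from the chapter, and the function names and model sizes are hypothetical. The 10^26 figure is the illustrative threshold quoted above.

```python
# Illustrative only: a rough training-compute estimate checked against a threshold.
# Assumption (not from the chapter): training FLOP ~= 6 * parameters * tokens,
# a common back-of-envelope rule for dense transformer models.

ILLUSTRATIVE_THRESHOLD_FLOP = 1e26  # the "10^26 FLOPs" example from the text


def estimate_training_flop(n_params: float, n_tokens: float) -> float:
    """Return an approximate total training compute in FLOP (not FLOP/s)."""
    return 6.0 * n_params * n_tokens


def exceeds_threshold(training_flop: float,
                      threshold_flop: float = ILLUSTRATIVE_THRESHOLD_FLOP) -> bool:
    """True if the estimated training compute is strictly above the threshold."""
    return training_flop > threshold_flop


# Two hypothetical training runs (sizes invented for illustration).
smaller_run = estimate_training_flop(n_params=70e9, n_tokens=15e12)
larger_run = estimate_training_flop(n_params=1e12, n_tokens=20e12)

print(exceeds_threshold(smaller_run))  # False: about 6.3e24 FLOP
print(exceeds_threshold(larger_run))   # True: about 1.2e26 FLOP
```

The check is a single comparison on a number that can in principle be audited from hardware and training logs, which is the sense in which a compute threshold is easier to enforce than a capability-based one.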

The specific mechanisms compute governance enables include compute reporting requirements (training-runs above a threshold must be disclosed), compute caps (cap the total training compute available for specific deployments or actors), chip export controls (restrict cross-border shipment of advanced AI chips), and cloud-provider KYC (cloud platforms verify customer identity before granting access to large compute allocations). Each is a different mechanism operating on the same underlying lever.

Compute governance also has limits. It depends on the supply chain remaining concentrated enough to be regulable; if compute production decentralizes substantially, the lever weakens. It depends on the FLOP threshold remaining a meaningful proxy for capability, which becomes less reliable as algorithmic efficiency improves. It depends on international coordination, because unilateral compute restrictions can be defeated by jurisdictional arbitrage. The chapter treats compute governance as the most tractable current lever rather than the final answer.
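The second limit can be illustrated with simple arithmetic. The sketch below assumes algorithmic efficiency improves by a constant doubling time, so the same capability needs half the physical compute after each doubling period. The doubling time used here is a placeholder chosen for round numbers, not an empirical estimate and not a figure from the chapter; the function name is hypothetical.

```python
# Illustrative only: how a fixed FLOP threshold drifts as a capability proxy.
# Assumption: algorithmic efficiency doubles every `doubling_years` years.
# The value 1.0 used below is a placeholder, not a measured rate.


def equivalent_physical_threshold(threshold_flop: float,
                                  years_elapsed: float,
                                  doubling_years: float) -> float:
    """Physical FLOP that buys, after `years_elapsed` years of efficiency gains,
    what `threshold_flop` bought when the threshold was written."""
    return threshold_flop / 2.0 ** (years_elapsed / doubling_years)


original_threshold = 1e26
after_three_years = equivalent_physical_threshold(
    original_threshold, years_elapsed=3.0, doubling_years=1.0
)

# Under these placeholder numbers, a run one eighth the size of the threshold
# reaches the capability level the threshold was originally meant to capture.
print(f"{after_three_years:.3e}")
```

Under this assumption a threshold fixed in physical FLOP would need periodic revision to keep tracking the same capability level, which is one way to read the chapter's caution about the proxy.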

You should now be able to take a real governance proposal and place it inside the four-layer taxonomy.

A recent example, worked: the EU AI Act’s requirement that providers of general-purpose AI models with systemic risk perform model evaluations, assess and mitigate systemic risks, report serious incidents, and ensure cybersecurity protection. Placement: primarily national governance (the EU acting as the regulating jurisdiction), with corporate governance hooks (the evaluation and incident-reporting obligations land on individual providers), and with implicit compute governance scaffolding (the “systemic risk” threshold is defined via a training-compute proxy of 10^25 FLOPs). The proposal is multi-layer, and the multi-layer character is what makes it operationally complete: the national layer provides enforcement teeth, the corporate layer specifies the obligated party, the compute layer provides the measurable threshold.

A second example: a hypothetical international treaty restricting export of advanced AI chips to non-signatory jurisdictions. Placement: primarily international governance (treaty), with compute governance as the underlying lever (the regulated resource is chips), with national governance as the enforcement layer (each signatory enforces the export restriction through its own customs apparatus). The structure mirrors nuclear non-proliferation, which the chapter explicitly cites as the analogous precedent.
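The two placements above can be written down as a small data structure, which makes the "primary layer plus supporting layers" pattern explicit. This is a sketch for note-taking, not a scheme from the chapter: the `Layer` and `Placement` names, the field layout, and the role descriptions are assumptions that paraphrase the two worked examples.

```python
# Illustrative only: recording a proposal's placement in the four-layer taxonomy.
# The class and field names are hypothetical; the content paraphrases the text.
from dataclasses import dataclass, field
from enum import Enum


class Layer(Enum):
    CORPORATE = "corporate governance"
    NATIONAL = "national governance"
    INTERNATIONAL = "international governance"
    COMPUTE = "compute governance"


@dataclass
class Placement:
    proposal: str
    primary: Layer
    # Maps each supporting layer to the role it plays in the proposal.
    supporting: dict = field(default_factory=dict)

    def layers(self) -> set:
        """All layers the proposal touches, primary included."""
        return {self.primary, *self.supporting}

    def untouched(self) -> set:
        """Layers the proposal does not operate on."""
        return set(Layer) - self.layers()


eu_gpai = Placement(
    proposal="EU AI Act obligations for general-purpose models with systemic risk",
    primary=Layer.NATIONAL,
    supporting={
        Layer.CORPORATE: "obligated party (evaluation, incident reporting)",
        Layer.COMPUTE: "measurable threshold (training-compute proxy)",
    },
)

chip_treaty = Placement(
    proposal="Hypothetical treaty restricting advanced-chip exports",
    primary=Layer.INTERNATIONAL,
    supporting={
        Layer.COMPUTE: "underlying lever (the regulated resource is chips)",
        Layer.NATIONAL: "enforcement (each signatory's customs apparatus)",
    },
)

for p in (eu_gpai, chip_treaty):
    print(p.proposal, "| untouched:", sorted(l.name for l in p.untouched()))
```

Listing the untouched layer for each proposal is a quick prompt for the follow-up question the lesson asks: how the proposal will interact with the layers it does not directly operate on.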

The capability is the placement, not the endorsement. The lesson takes no position on whether the EU AI Act or any specific compute-control proposal is the right policy. The capability is to read a proposal, see which layers it operates on, and predict how the proposal will interact with the other layers and with the collective-action dynamics from L8.

Nine lessons across three phases. Phase 1 (the risks landscape) gave you the field-framing (L1) and the four-bucket typology (L2): vocabulary to classify any AI-harm headline. Phase 2 (safety and alignment) worked the deployment-time safety case: monitoring and robustness (L3), the alignment substrate (L4), the safety-engineering toolkit (L5), the complex-systems constraints (L6). Phase 3 (ethics and governance) added the policy and coordination layer: moral uncertainty and value-loading (L7), collective action and multi-agent dynamics (L8), and the governance taxonomy (L9). Together, the nine lessons give you what Hendrycks’ textbook is built to give: a working vocabulary for a discipline that is not solved, attributed throughout, in a register that lets you contest any specific claim with the same vocabulary.

What the track does not give you: a position on whether AI development should slow down or speed up, a specific governance proposal to endorse, a settled ethical framework, or a guarantee that the safety case for any specific deployment will work. Those are not the track’s job. The track’s job is to give you the vocabulary; the value-loading is yours.

If a single closing thought is worth holding, the chapter’s own framing carries it: a safety case is a Swiss-cheese stack of partial defenses, the holes are largest where the field has the fewest tools, and the honest disposition is to keep filling holes rather than to declare any layer sufficient. The track ends where the field actually is: in motion, with vocabulary still accumulating, and with the next round of governance debates likely to look different from the ones in front of us today.