Practice: AI governance

Exercise 1: situate three governance proposals in the four-layer taxonomy

For each proposal, name the primary layer (corporate, national, international, compute) and any secondary layers it operates on. Give one sentence on how the layers interact. Answers below; do the exercise first.

The proposals are stated in summary form; some are real-world (EU AI Act provisions, the US 2023 executive order, the Bletchley Declaration), some are realistic but composite.

1. A frontier-AI lab publishes a Responsible Scaling Policy committing to specific capability-evaluation thresholds, deployment pauses if a model crosses a defined dangerous-capability red line, and external red-teaming before any deployment at frontier scale. The policy is voluntary and self-enforced.
2. The European Union enacts an AI Act requiring providers of general-purpose AI models with systemic risk to perform model evaluations, assess and mitigate systemic risks, report serious incidents, and ensure cybersecurity protection. The systemic-risk threshold is defined through a training-compute proxy: 10^25 FLOPs of cumulative training compute.
3. A multilateral agreement among twelve nations restricts export of advanced AI chips (above a defined compute-density threshold) to non-signatory jurisdictions. Each signatory enforces the export restriction through its own customs apparatus. The agreement includes a verification regime modeled on the IAEA’s safeguards inspections.

Answer key

1. Corporate governance, primary. RSPs are unilateral corporate commitments. No national-layer enforcement is invoked. Layer interaction: a corporate commitment can be undercut by competitor behavior under L8’s race-to-the-bottom dynamic, so a corporate-only governance posture is necessarily partial; the policy’s strength depends on whether the publishing lab’s competitors adopt similar commitments. Reputation pressure is the only enforcement mechanism.
2. National governance, primary; corporate + compute secondary. The EU is the sovereign jurisdiction; the obligations land on individual providers (corporate); the systemic-risk threshold operates through a compute proxy (compute). Layer interaction: the national layer provides enforcement teeth via market access (compliance is the price of EU operation); the corporate layer specifies the obligated party; the compute layer provides the measurable threshold that makes the obligation enforceable.
3. International governance, primary; compute + national secondary. A treaty is the international instrument; the regulated resource is chips (compute); each signatory enforces through its own customs apparatus (national). Layer interaction mirrors nuclear non-proliferation: international coordination establishes the constraint, compute identifies the resource, national jurisdictions provide the enforcement infrastructure. The verification regime is the critical component; without it the treaty is unenforceable.
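
The placements in the answer key can also be written down as data, which makes the primary/secondary distinction explicit. The following is a minimal Python sketch, not part of the chapter: the `Layer` enum, the `Placement` dataclass, and the short proposal labels are all names invented here for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    CORPORATE = "corporate"
    NATIONAL = "national"
    INTERNATIONAL = "international"
    COMPUTE = "compute"


@dataclass(frozen=True)
class Placement:
    """Where one proposal sits in the four-layer taxonomy (hypothetical structure)."""
    proposal: str
    primary: Layer
    secondary: frozenset = frozenset()

    def layers(self):
        """All layers the proposal touches, primary included."""
        return {self.primary} | set(self.secondary)


# Placements copied from the answer key; the labels are shorthand.
ANSWER_KEY = [
    Placement("lab RSP", Layer.CORPORATE),
    Placement("EU AI Act systemic-risk rules", Layer.NATIONAL,
              frozenset({Layer.CORPORATE, Layer.COMPUTE})),
    Placement("chip-export agreement", Layer.INTERNATIONAL,
              frozenset({Layer.COMPUTE, Layer.NATIONAL})),
]

for p in ANSWER_KEY:
    names = sorted(layer.value for layer in p.secondary)
    print(f"{p.proposal}: primary={p.primary.value}, secondary={names or 'none'}")
```

Keeping `primary` as a single value and `secondary` as a set mirrors the exercise prompt: one primary layer per proposal, any number of secondary ones.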

Exercise 2: design a multi-layer governance stack

You are advising a national legislature on the governance approach for a new class of autonomous-decision AI systems deployed in employment screening (resume review, interview scoring, hiring recommendations). Design a governance stack using at least three of the four layers. For each layer, name the specific mechanism, what it enforces, and how it interacts with the other layers in your stack.

Write your design as: (a) the deployment in scope, (b) the layers used, (c) the specific mechanism per layer, (d) the interaction story.

Example design (three-layer stack)

Deployment in scope: AI-driven employment screening for organizations above 250 employees operating in the jurisdiction.

Corporate layer: mandatory model-card publication by the AI provider, covering training data composition, evaluation methodology, and named limitations on demographic-subgroup performance. The card is updated quarterly. Mechanism: transparency-based; enforced by auditing the published claims against the system's actual behavior.

National layer: licensing requirement for employment-screening AI systems above a population-impact threshold (e.g., systems used to screen more than 10,000 applicants per year). License conditions include a third-party fairness audit, an incident-reporting obligation, and a private right of action for applicants who suspect discriminatory automated decision-making. Mechanism: regulatory; enforced by license revocation and litigation.

Compute layer not used in this stack: employment-screening systems do not depend on frontier-scale compute, so the compute lever does not apply directly.

International layer: a coordination mechanism across major employment jurisdictions to harmonize licensing standards, reducing race-to-the-bottom incentives. Mechanism: mutual recognition; a system licensed in one jurisdiction is presumptively licensed in others, conditional on standard equivalence.

Interaction: the corporate transparency layer informs the third-party audits required by the national licensing layer; the international coordination prevents jurisdictional arbitrage. The three layers compose into a Swiss-cheese stack; no layer is sufficient alone (a corporate model card without audit teeth is weak; national licensing without international coordination invites jurisdictional flight; international coordination without enforcement teeth is symbolic).
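
One way to sanity-check a design against the exercise's constraints is to restate it as data. The sketch below is illustrative only: the `Mechanism` dataclass and the `check_stack` and `needs_license` helpers are hypothetical names, and the mechanism descriptions are abbreviated from the example design above.

```python
from dataclasses import dataclass

ALL_LAYERS = {"corporate", "national", "international", "compute"}


@dataclass
class Mechanism:
    layer: str
    mechanism: str
    detail: str


# The example design restated as data (wording abbreviated).
example_stack = [
    Mechanism("corporate", "model-card publication", "updated quarterly"),
    Mechanism("national", "licensing above an applicant threshold",
              "license revocation and litigation"),
    Mechanism("international", "mutual recognition of licenses",
              "conditional on standard equivalence"),
]


def check_stack(stack, minimum_layers=3):
    """Return the layers used; raise if the exercise's minimum is not met."""
    used = {m.layer for m in stack}
    unknown = used - ALL_LAYERS
    if unknown:
        raise ValueError(f"unknown layers: {sorted(unknown)}")
    if len(used) < minimum_layers:
        raise ValueError(f"only {len(used)} layers used; need {minimum_layers}")
    return used


def needs_license(applicants_per_year, threshold=10_000):
    """National-layer trigger from the example: more than `threshold` applicants per year."""
    return applicants_per_year > threshold


print(sorted(check_stack(example_stack)))
print(needs_license(12_000), needs_license(10_000))
```

The check covers only the structural requirement (at least three distinct layers, each with a named mechanism); the interaction story still has to be argued in prose.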

Exercise 3: identify the verification challenge

The chapter notes a verification asymmetry inherited from the nuclear precedent: nuclear weapons use is readily detectable, yet successful development is difficult to confirm. AI inherits a similar asymmetry. Pick one of the three Exercise 1 proposals and write one paragraph (3-5 sentences) describing the specific verification challenge the proposal faces. Cover three points: (a) what a violation would look like, (b) how the regulator would detect it, (c) what makes detection hard.

Example (compute-export multilateral)

Verification challenge for the chip-export multilateral: a violation looks like a quantity of advanced AI chips reaching a non-signatory jurisdiction through a transshipment route the export-control regime did not catch. The regulator detects the violation through customs intelligence, intelligence-agency reporting, and (in the most-developed verification regimes) on-site inspection at the receiving facility. Detection is hard because the chip supply chain has many possible transshipment routes through intermediate jurisdictions, because end-use verification at the receiving site requires either consent or covert intelligence collection, and because the chip-to-capability mapping shifts as algorithmic efficiency improves (a chip that fell below the threshold last year may deliver above-threshold capability this year through software improvements alone). The chapter acknowledges that verification remains the open governance research question; the treaty’s effectiveness is bounded by the verification regime that backs it.
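
The last source of difficulty, a threshold that drifts as software improves, can be illustrated with a toy calculation. This is a sketch under loudly stated assumptions, not a model from the chapter: capability-equivalent compute is treated as hardware rating times an efficiency multiplier, and the rating (0.4 of the threshold) and the doubling rate are hypothetical numbers chosen only to show the mechanism.

```python
def first_year_above(rating, threshold=1.0, annual_gain=2.0, max_years=50):
    """Return the first year in which fixed hardware becomes threshold-equivalent.

    rating      -- hardware capability relative to the threshold (hypothetical units)
    annual_gain -- assumed yearly algorithmic-efficiency multiplier (hypothetical)
    Returns None if the threshold is not crossed within max_years.
    """
    for year in range(max_years + 1):
        if rating * annual_gain ** year > threshold:
            return year
    return None


# Hypothetical chip rated at 40% of the export threshold when the treaty is signed.
for year in range(4):
    effective = 0.4 * 2.0 ** year
    status = "above" if effective > 1.0 else "below"
    print(f"year {year}: effective rating {effective:.1f}, {status} threshold")

print(first_year_above(0.4))
```

Under these made-up numbers the same physical chip crosses the threshold in its second year without any change to the hardware, which is why a fixed hardware threshold has to be revisited as the regime ages.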

Flashcards

Q. What are the four layers in Hendrycks' governance taxonomy?

Corporate governance (what an individual AI organization commits to and internally enforces), national governance (what a sovereign jurisdiction requires through regulation), international governance (how sovereign jurisdictions coordinate across borders), compute governance (how the physical-resource supply chain underlying AI development gets governed). The layers are not strictly hierarchical; real proposals usually operate on multiple layers.

Q. What are Responsible Scaling Policies (RSPs), and what is the structural limit of corporate governance?

RSPs are voluntary corporate commitments to scale safety measures (capability evaluations, red-teaming, deployment thresholds, kill-switch protocols) with model capability. The organization publishes the policy and is bound by reputation. The structural limit: corporate commitments are unilateral and can be undercut by competitor behavior under the L8 race-to-the-bottom dynamic. Corporate governance addresses the within-organization problem and leaves the between-organization problem to higher layers.

Q. What regulatory instruments do national governance regimes use?

Mandatory pre-deployment evaluation (e.g., EU AI Act’s high-risk categorization), incident reporting requirements, licensing regimes for systems above capability thresholds, liability rules attaching to producers and deployers. None are new to law; they are adaptations of regulatory tools from medical devices, financial services, environmental regulation, and other prior high-stakes deployment domains.

Q. What is the structural limit of national governance?

Jurisdictional reach. A binding regulation in jurisdiction A does not constrain a developer in jurisdiction B that has not adopted similar regulation. The AI-race failure mode from L8 reappears at the national-jurisdiction layer: nations have an incentive to attract AI development by maintaining lighter regulation. National governance addresses the within-jurisdiction piece and leaves the between-jurisdiction piece for international governance.

Q. What historical analogy does Hendrycks use for international AI governance, and what does he name as the verification challenge?

Nuclear-weapons governance. Both AI and nuclear weapons are “offense-dominant” with identifiable supply-chain chokepoints (uranium for nuclear, computing power for AI). The verification asymmetry the chapter names: nuclear weapons use is readily detectable, yet successful development is difficult to confirm. AI inherits the same asymmetry, possibly more strongly; the verification regime is what determines how enforceable any international AI treaty is.

Q. Why has compute governance become a central lever in current AI policy discussions?

Hendrycks argues: “Compute is indispensable for developing and deploying AIs. Restricting access to compute allows control over what AIs are created and used.” And: “Compute is physical, excludable, and quantifiable which allows it to be tracked, restricted, and measured.” The three properties (physical, excludable, quantifiable) are what make compute a tractable regulatory lever in ways algorithm regulation or data regulation are not.

Q. Name four specific compute-governance mechanisms.

Compute reporting requirements (training runs above a threshold must be disclosed), compute caps (limit the total training compute available for specific deployments or actors), chip export controls (restrict cross-border shipment of advanced AI chips), cloud-provider KYC (cloud platforms verify customer identity before granting access to large compute allocations). Each operates on the same underlying lever (compute) but at different points in the supply chain.
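
A compute reporting requirement reduces to a threshold comparison once training compute has been estimated. The sketch below assumes the widely used rule of thumb that dense-transformer training takes roughly 6 FLOPs per parameter per training token; that approximation is not from the chapter. The threshold reuses the 10^25 figure from the EU example in Exercise 1, and the two model sizes are hypothetical.

```python
REPORTING_THRESHOLD_FLOP = 1e25  # figure from the EU AI Act example in Exercise 1


def estimate_training_flop(parameters, tokens):
    """Rough training compute via the 6 * N * D rule of thumb (an assumption)."""
    return 6 * parameters * tokens


def must_report(training_flop, threshold=REPORTING_THRESHOLD_FLOP):
    """True when estimated training compute exceeds the reporting threshold."""
    return training_flop > threshold


# Hypothetical training runs: (parameter count, training tokens).
runs = {
    "mid-size run": (70e9, 15e12),
    "large run": (400e9, 15e12),
}

for name, (params, tokens) in runs.items():
    flop = estimate_training_flop(params, tokens)
    print(f"{name}: {flop:.1e} FLOP, report required: {must_report(flop)}")
```

The estimate is only as good as the rule of thumb behind it; a real reporting regime would need an agreed accounting method rather than this approximation.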

Q. What are the limits of compute governance?

It depends on the supply chain remaining concentrated enough to be regulable; if compute production decentralizes substantially, the lever weakens. It depends on the FLOP threshold remaining a meaningful proxy for capability, which becomes less reliable as algorithmic efficiency improves. It depends on international coordination, because unilateral compute restrictions can be defeated by jurisdictional arbitrage. The chapter treats compute governance as the most-tractable current lever rather than the final answer.

Q. What is the L9 capability?

Take a real governance proposal (an EU AI Act article, a US executive order section, a compute-cap proposal, a corporate RSP) and place it inside the four-layer taxonomy with reasoning. The capability is the placement, not the endorsement; the lesson takes no position on whether specific proposals are the right policy. The work is reading a proposal, seeing which layers it operates on, and predicting how it will interact with other layers and with the L8 collective-action dynamics.

Q. What did the nine-lesson track produce, and what does it not pretend to provide?

It produced: a working vocabulary for AI safety as a discipline (Phase 1), the deployment-time safety case (Phase 2: monitoring, robustness, alignment, safety engineering, complex systems), and the ethics-and-governance layer above it (Phase 3: moral uncertainty, collective action, governance taxonomy). It does not provide: a position on whether AI development should slow down or speed up, a specific governance proposal to endorse, a settled ethical framework, or a guarantee that any deployment’s safety case will work. The vocabulary is the track’s job; the value-loading is the reader’s.