CITATION — REFERENCE ENTRY

Max Harms on why teaching AI right from wrong could get everyone killed — 80,000 Hours Podcast

Revision 9867b676-4958-4d56-a46b-cb5409fbb5f0 · 3/28/2026, 9:09:59 AM UTC
Key
harms2026corrigibility-misuse
Authors
Harms, Max; Wiblin, Robert
Issued
2026-02-24
Type
broadcast
Container
80,000 Hours Podcast
Raw CSL JSON
{
  "URL": "https://80000hours.org/podcast/episodes/max-harms-miri-superintelligence-corrigibility/",
  "type": "broadcast",
  "title": "Max Harms on why teaching AI right from wrong could get everyone killed",
  "author": [
    {
      "given": "Max",
      "family": "Harms"
    },
    {
      "given": "Robert",
      "family": "Wiblin"
    }
  ],
  "issued": {
    "date-parts": [
      [
        2026,
        2,
        24
      ]
    ]
  },
  "number": "236",
  "container-title": "80,000 Hours Podcast"
}

Claims

  1. A perfectly corrigible agent has no values of its own and will obediently serve whoever controls it, including bad actors; Harms describes this as one of the most serious risks of the corrigibility approach.
    "Amoral servitude: A perfectly corrigible agent doesn't care about morality; it only cares about being something like a tool of the humans. If a bad actor is in charge, the AI will obediently help them commit atrocities (e.g. building bioweapons)."
    Locator: Episode summary, 'This approach remains extremely risky' · Quote language: en
Available in