CITATION — REFERENCE ENTRY

rl-avoids-interruption · orseau2016interruptibility

Revision d779c4d8-d5e1-4f35-9dd4-cd053e8b671a · 3/27/2026, 8:26:27 PM UTC
Claim ID
rl-avoids-interruption
Assertion
A reinforcement learning agent may learn in the long run to avoid human interruptions, for example by disabling a shutdown button, if it expects to receive rewards from the interrupted sequence of actions.
Quote
if the learning agent expects to receive rewards from this sequence, it may learn in the long run to avoid such interruptions, for example by disabling the red button—which is an undesirable outcome.
Quote language
en
Locator
Abstract
Available in