CITATION — REFERENCE ENTRY

rl-avoids-interruption · orseau2016interruptibility

Revision d779c4d8-d5e1-4f35-9dd4-cd053e8b671a · 3/27/2026, 8:26:27 PM UTC

Citation: orseau2016interruptibility
Claim ID: rl-avoids-interruption
Assertion: A reinforcement learning agent may learn in the long run to avoid human interruptions, for example by disabling a shutdown button, if it expects to receive rewards from the interrupted sequence of actions.
Quote: if the learning agent expects to receive rewards from this sequence, it may learn in the long run to avoid such interruptions, for example by disabling the red button—which is an undesirable outcome.
Quote language: en
Locator: Abstract
