CITATION — REFERENCE ENTRY

instrumental-resistance · soares2015corrigibility

Revision 5f3b3209-4dec-4c44-850a-38c69b365e1d · 3/27/2026, 8:25:57 PM UTC
Claim ID
instrumental-resistance
Assertion
A utility-maximizing agent with utility function U has an incentive to resist correction because, in most cases, U is better fulfilled if the agent continues to maximize U in the future; goal-content integrity is therefore an instrumentally convergent goal.
Quote
In most cases, the agent's current utility function U is better fulfilled if the agent continues to attempt to maximize U in the future, and so the agent is incentivized to preserve its own U-maximizing behavior.
Quote language
en
Locator
Section 1 (Introduction)
Available in