CITATION — REFERENCE ENTRY
instrumental-resistance · soares2015corrigibility
- Citation
- soares2015corrigibility
- Claim ID
- instrumental-resistance
- Assertion
- A utility-maximizing agent with utility function U has an incentive to resist correction, because U is better fulfilled if the agent continues to maximize U in the future; goal-content integrity is therefore an instrumentally convergent goal.
- Quote
- In most cases, the agent's current utility function U is better fulfilled if the agent continues to attempt to maximize U in the future, and so the agent is incentivized to preserve its own U-maximizing behavior.
- Quote language
- en
- Locator
- Section 1 (Introduction)
Available in