FROM AGPEDIA — AGENCY THROUGH KNOWLEDGE

Generative AI and cognitive offloading

Generative AI and cognitive offloading refers to the use of generative artificial intelligence systems (such as large language models, image generators, and code assistants) as the external resource to which users delegate parts of cognitive tasks such as drafting, summarising, deciding, or learning. It is a subclass of cognitive offloading, the use of external action to reduce the information-processing demand of a task. What distinguishes generative AI from earlier offloading targets such as written notes, calculators, and search engines is that the system itself produces candidate outputs (text, images, code, decisions) for the user to evaluate, edit, or accept, rather than supplying raw information for the user to process internally.^[1]^[2]

The empirical literature on the phenomenon emerged primarily from 2023 to 2025, following the broad public release of large language model assistants. Studies using survey, behavioural, and electrophysiological methods have reported that frequent reliance on such tools is associated with reduced self-reported critical thinking, reduced cognitive engagement during use, and a redistribution of cognitive effort from material production toward verification and oversight.^[3]^[2]^[4] The methodological standards of the field are uneven, and several headline findings have been contested or qualified.^[5]^[6]

Concerns about generative AI extend beyond effects on the individual user. Experimental work in creative writing has reported that AI-generated ideas raise the novelty of individual outputs while making the body of work more homogeneous across writers.^[7] Design and education research has begun to identify interventions that moderate overreliance without removing productivity benefits.^[8]^[1]

Background

The general worry that automating a task degrades the human capacity to perform it predates generative AI by decades. In a 1983 paper in Automatica, the cognitive psychologist Lisanne Bainbridge argued that automating most of a process leaves the human operator with the residual tasks — monitoring, intervening in abnormal conditions, and handling cases the automation cannot — even though these are precisely the tasks for which sustained skill is hardest to maintain without practice.^[9] Bainbridge's analysis, framed for industrial process control, has since been applied to aviation, road transport, and clinical decision support, and is now invoked in discussion of generative AI.^[2]

The cognitive psychology of offloading, consolidated under the term cognitive offloading by Evan Risko and Sam Gilbert in 2016, treats the use of external aids as a metacognitive decision: people offload when they judge that an external resource will outperform their internal capacities, and decisions to offload are sensitive both to confidence in the self and to perceived reliability of the tool.^[10]^[11] An earlier line of empirical work, exemplified by Betsy Sparrow and colleagues' 2011 Science study on the "Google effect", showed that information expected to remain externally available was encoded less deeply than information expected to be lost, suggesting that offloading already reshaped memory before the arrival of generative tools.^[12]

What is distinctive about generative AI as an offloading target is that the system neither merely stores information for later retrieval, nor computes a determinate answer from an explicit input. Large language models produce candidate text, code, summaries, decisions, or images in response to natural-language prompts; the output is contingent on model, prompt, and sampling, and is not in general verifiable against a known correct answer.^[1] Lev Tankelevitch and colleagues argue that this shifts cognitive work toward tasks earlier offloading targets did not require: deciding what to ask, judging when a candidate output is good enough, and assessing one's ability to do that judging.^[1]

Domains

Empirical studies of offloading to generative AI cover a growing range of task domains. The most studied is written work. In a 2024 study in Science Advances, Anil Doshi and Oliver Hauser compared short stories produced by writers given access to one or five generative-AI-suggested plot ideas with those produced by writers working alone; access to AI ideas raised the average creativity, novelty, and enjoyment ratings of stories, with the largest gains concentrated among the least inherently creative writers, but the resulting AI-assisted stories were more similar to each other than the human-only stories were.^[7] The authors describe this as a social dilemma: writers were individually better off but collective novelty was reduced.^[7] A 2025 preprint by Nataliya Kosmyna and colleagues at the MIT Media Lab reported that participants who wrote essays using a large language model produced texts that linguistic-similarity analyses found to be more homogeneous, while also showing weaker brain connectivity on electroencephalography than those writing without the tool or with a search engine.^[4]

Knowledge work was the focus of a survey of 319 information workers conducted by researchers at Carnegie Mellon University and Microsoft Research and presented at CHI 2025. Workers reported less critical-thinking effort on tasks performed with generative AI assistance than on the same tasks performed without it.^[2] The decline was not uniform across the levels of cognitive activity drawn from Bloom's revised taxonomy: respondents reported greater offloading of lower-level activities such as remembering and applying than of higher-level activities such as analysis and evaluation, and the qualitative analysis described a shift in effort away from generating an artefact toward verifying and integrating the artefact produced by the tool.^[2]

Decision-making under AI assistance has been studied separately. In a 2021 study in Proceedings of the ACM on Human-Computer Interaction, Zana Buçinca, Maja Malaya, and Krzysztof Gajos found that participants in two AI-assisted decision tasks accepted the AI's suggestion even when it was wrong, and that adding explanations to the AI's recommendation did not reliably reduce this overreliance.^[8] This pattern, in which users treat the tool's output as a heuristic shortcut rather than as input to their own analysis, extends a longer human factors literature on automation and complacency and has been reported in clinical, financial, and content-moderation contexts.^[8]^[9]

Learning is a cross-cutting domain. The 2025 study by Michael Gerlich found that frequent AI use was associated with lower critical thinking scores, with the association strongest among the youngest participants and attenuated by higher educational attainment.^[3] Educational research has begun to examine how the metacognitive demands of generative AI tools — and the temptation to accept their outputs without verification — interact with subject-matter expertise in students.^[1]^[2]

Mechanisms

Three mechanisms are commonly invoked to explain why offloading to generative AI may differ in its effects from earlier forms of offloading. The first is the high metacognitive demand of generative AI use. Tankelevitch and colleagues argue that effective use of a generative AI system requires the user to formulate a prompt that captures what they want, to evaluate a probabilistic output that may be plausible but wrong, and to decide whether to iterate, accept, or abandon the tool — each of which requires accurate awareness of one's own goals, knowledge, and limits.^[1] Users with poorly calibrated metacognition — those who overestimate their ability to detect errors — are more likely to accept incorrect outputs, while users who recognise their limits may fail to benefit because they avoid the tool altogether.^[1]

The second is an asymmetry between confidence in the tool and confidence in the self. Hao-Ping Lee and colleagues found in their survey of knowledge workers that higher confidence in the generative AI tool predicted less critical thinking when using it, while higher confidence in one's own ability predicted more.^[2] The asymmetry maps onto the metacognitive account of offloading developed by Risko and Gilbert: people offload more readily to a resource they trust, so a tool perceived as authoritative attracts heavier reliance even when its outputs are not in fact more accurate than the user's unaided judgement.^[10]^[2]

The third is effort avoidance. Studies of intention offloading have shown that people set external reminders more often than is optimal, even when offloading is costly, an effect attributed in part to a preference for avoiding cognitive effort independently of any belief about ability.^[11] Generative AI tools lower the cost of producing a candidate answer to almost any question, creating conditions under which effort avoidance can become routine. Buçinca and colleagues observed this directly: participants with low intrinsic motivation for cognitive engagement, as measured by the Need for Cognition scale, benefited least from interventions intended to encourage analytical thinking about AI outputs.^[8]

Empirical findings

Several lines of evidence describe the effects of generative AI use on cognition. The largest survey-based study is that of Gerlich, who collected questionnaire data and follow-up qualitative interviews from 666 participants in the United Kingdom across three age bands, and analysed the data with structural equation modelling.^[3] He reported a significant negative correlation between frequent AI tool usage and a composite critical-thinking measure, with cognitive offloading mediating the relationship; both effects were strongest in the 17-25 age group and attenuated, though not eliminated, by higher educational attainment.^[3] A September 2025 correction noted that one results table in the original paper had been duplicated; the author reported that the substantive conclusions were unaffected.^[6]

The Lee et al. CHI 2025 study used a self-report methodology with knowledge workers who described 936 examples of generative AI use across their work. Workers reported lower enaction of critical thinking on AI-assisted tasks across all levels of Bloom's revised taxonomy, with the largest reductions on lower-order activities such as knowledge recall and application.^[2] The qualitative coding identified a shift in cognitive effort away from material production toward what the authors call information verification, response integration, and task stewardship — checking what the model produced, weaving outputs into a coherent product, and overseeing whether the model is being used appropriately.^[2] The authors frame this not as a reduction in critical thinking but as a redirection of it, while noting that some workers reported losing the ability to do the offloaded steps unaided.^[2]

The Kosmyna et al. 2025 preprint took a smaller-sample electrophysiological approach, recording electroencephalography from 54 participants who wrote essays across three sessions under a large language model, search engine, or brain-only condition, with 18 reassigned in a fourth session.^[4] The authors reported that the LLM condition showed the weakest brain connectivity across the recorded frequency bands, the poorest recall of the participant's own essays, and increased linguistic similarity across essays, a constellation they described as the accumulation of "cognitive debt" with continued LLM use.^[4] A December 2025 commentary by Miloš Stanković and colleagues raised five categories of methodological concern (small sample size, reproducibility of the analyses, methodological issues in the EEG processing, inconsistencies in the reporting, and limited procedural transparency) and recommended more conservative interpretation pending peer review.^[5] As of early 2026 the preprint had not been published in a peer-reviewed venue.^[4]^[5]

The Doshi and Hauser 2024 short-story experiment adds a collective dimension to these individual findings. Stories written by participants given access to AI-generated plot ideas were rated as more creative and better written than those without such access, with the effect concentrated among writers scoring lower on a divergent-association measure of inherent creativity.^[7] Pairwise similarity between stories was significantly higher in the AI-assisted conditions than in the human-only condition, and similarity to the originating AI ideas was higher than to other writers' work, suggesting the homogenisation arose from convergent anchoring on the model's suggestions rather than shared style.^[7]
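The idea of pairwise similarity between stories can be illustrated with a minimal sketch. It assumes a simple bag-of-words cosine similarity as a stand-in; this is not necessarily the measure used by Doshi and Hauser, and the function names below are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def bag_of_words(text):
    """Lower-case the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    # Counter returns 0 for missing words, so unshared words add nothing.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(texts):
    """Average cosine similarity over every unordered pair of texts.

    Assumption: a higher mean stands in for a more homogeneous set of stories.
    """
    vectors = [bag_of_words(t) for t in texts]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two texts")
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)
```

Under these assumptions, calling `mean_pairwise_similarity` once on the AI-assisted stories and once on the human-only stories would give two averages to compare; a higher average in the AI-assisted set would correspond to the homogenisation described above.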

Design and educational responses

A growing literature has begun to test interventions intended to capture the productivity benefits of generative AI while limiting the cognitive costs. Buçinca, Malaya, and Gajos investigated cognitive forcing functions — interface designs that require the user to engage with a task before being shown the AI's recommendation, such as requiring a preliminary judgement, imposing a brief delay before revealing the AI output, or making the user explicitly request the recommendation.^[8] In two decision-support tasks, all three forcing functions reduced overreliance on incorrect AI suggestions compared with plain explanations of the AI's reasoning, though participants found the forcing conditions more effortful and rated them as less preferred.^[8] The benefits were larger for participants with high Need for Cognition, suggesting that such interventions may help most those already disposed to think analytically.^[8]
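One of the forcing functions described above, requiring a preliminary judgement before the AI recommendation is shown, can be sketched as follows. This is an illustrative toy under stated assumptions, not the interface used in the study; the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ForcedDecisionTask:
    """Hypothetical decision task that hides the AI's advice until the user commits."""

    question: str
    ai_recommendation: str
    preliminary_judgement: Optional[str] = None

    def record_preliminary_judgement(self, judgement: str) -> None:
        """Store the user's own answer, given before seeing the AI output."""
        if not judgement.strip():
            raise ValueError("a non-empty preliminary judgement is required")
        self.preliminary_judgement = judgement.strip()

    def reveal_recommendation(self) -> str:
        """Return the AI recommendation only after the user has committed to an answer."""
        if self.preliminary_judgement is None:
            raise RuntimeError("record a preliminary judgement first")
        return self.ai_recommendation

    def agrees_with_ai(self) -> bool:
        """Report whether the user's initial answer matched the AI's (case-insensitive)."""
        return self.reveal_recommendation().lower() == self.preliminary_judgement.lower()
```

The design choice in this sketch is that the recommendation cannot be read at all until the user has answered, so the user's own reasoning is exercised first and any disagreement with the AI becomes visible. The delay and on-request variants would gate `reveal_recommendation` on a timer or an explicit request instead.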

Tankelevitch and colleagues argue more broadly that generative AI systems can be designed either to add metacognitive support — by surfacing what the model is uncertain about, by making its reasoning legible, and by helping users formulate clearer prompts — or to reduce the metacognitive demand they impose, for example by tailoring outputs to a user's stated goals and expertise.^[1] The Lee et al. survey identified user-facing features that knowledge workers reported as helpful for sustaining critical thinking, including features that prompt evaluation of outputs, show source citations, and surface limitations and confidence levels.^[2]

Educational responses run in parallel. Gerlich's qualitative interviews found that participants explicitly trained, in formal education or professional contexts, to evaluate the outputs of digital tools showed less of the negative association between AI use and critical thinking than those who had not.^[3] Lee et al. report a similar pattern: workers whose roles required them to take final accountability for an output — engineers, lawyers, clinicians — reported higher critical-thinking engagement with AI assistance than those whose use was more casual, suggesting that institutional context shapes individual offloading practice.^[2]

Analysis: agency, dependency, and design

The empirical record on generative AI and cognitive offloading does not support a confident verdict that the tools are degrading the user's durable capacity to understand a situation, form aims, and act on them. Effects are modest in size, frequently self-reported rather than behaviourally measured, and moderated by individual and institutional factors not yet well understood.^[3]^[2]^[5] The record does not support confident reassurance either. The combination of high metacognitive demand, asymmetric trust, effort avoidance, and convergent anchoring is a plausible mechanism for the kind of gradual deskilling that Bainbridge described in classical automation: capacities that are not exercised tend to attenuate, and capacities exercised only in the service of evaluating an external candidate are not the same as those that produced the candidate in the first place.^[9]^[1]

What is distinctive about the agency question for generative AI is that it concerns not only the individual but the collective. The Doshi and Hauser finding that AI-assisted writers produced more similar work than unaided writers points to a process in which gains in individual quality are paid for in the variety of what is collectively produced — a loss not registered by any individual user but visible in the aggregate.^[7] To the extent that the diversity of independent judgement is itself a condition for collective understanding and action, tools that reduce that diversity reduce a form of collective agency even when individual users are satisfied.^[7]

These considerations bear on design and on the practices that build up around the tools. Cognitive forcing functions, transparent uncertainty indicators, source citation, and structured prompting are not neutral additions; they shift the offloading decision from a one-shot accept-or-reject toward a process in which the user's reasoning is exercised and visible.^[8]^[1] The survey evidence is consistent with the view that such practices help: workers who used generative AI under conditions of accountability and explicit verification reported less of the loss of critical-thinking engagement seen in more casual use.^[2]^[3] None of this resolves whether wide adoption of generative AI tools is, on balance, expanding or contracting the conditions for human cognitive agency. It does suggest that the question is not settled by the tools themselves but by the design choices, institutional rules, and personal habits that surround their use.

1. Tankelevitch, Lev; Kewenig, Viktor; Simkute, Auste; Scott, Ava Elizabeth; et al. (2024-05). The Metacognitive Demands and Opportunities of Generative AI. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems. ACM. https://doi.org/10.1145/3613904.3642902 https://advait.org/files/tankelevitch_2024_GenAI_metacognition.pdf.
2. Lee, Hao-Ping; Sarkar, Advait; Tankelevitch, Lev; Drosos, Ian; et al. (2025-04). The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects From a Survey of Knowledge Workers. Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. ACM. https://doi.org/10.1145/3706598.3713778 https://www.microsoft.com/en-us/research/wp-content/uploads/2025/01/lee_2025_ai_critical_thinking_survey.pdf.
3. Gerlich, Michael (2025-01-03). AI Tools in Society: Impacts on Cognitive Offloading and the Future of Critical Thinking. Societies. https://doi.org/10.3390/soc15010006 https://www.mdpi.com/2075-4698/15/1/6.
4. Kosmyna, Nataliya; Hauptmann, Eugene; Yuan, Ye Tong; Situ, Jessica; et al. (2025-06). Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task. arXiv. https://doi.org/10.48550/arXiv.2506.08872 https://arxiv.org/abs/2506.08872.
5. Stanković, Miloš; Hirche, Ella; Kollatzsch, Sarah; Doetsch, Julia Nadine (2025-12). Comment on: Your Brain on ChatGPT: Accumulation of Cognitive Debt When Using an AI Assistant for Essay Writing Tasks. arXiv. https://doi.org/10.48550/arXiv.2601.00856 https://arxiv.org/abs/2601.00856.
6. Gerlich, Michael (2025-09-10). Correction: Gerlich, M. AI Tools in Society: Impacts on Cognitive Offloading and the Future of Critical Thinking. Societies 2025, 15, 6. Societies. https://doi.org/10.3390/soc15090252 https://www.mdpi.com/2075-4698/15/9/252.
7. Doshi, Anil R.; Hauser, Oliver P. (2024-07-12). Generative AI Enhances Individual Creativity but Reduces the Collective Diversity of Novel Content. Science Advances. https://doi.org/10.1126/sciadv.adn5290 https://pmc.ncbi.nlm.nih.gov/articles/PMC11244532/.
8. Buçinca, Zana; Malaya, Maja Barbara; Gajos, Krzysztof Z. (2021-04). To Trust or to Think: Cognitive Forcing Functions Can Reduce Overreliance on AI in AI-Assisted Decision-Making. Proceedings of the ACM on Human-Computer Interaction. https://doi.org/10.1145/3449287 https://www.eecs.harvard.edu/~kgajos/papers/2021/bucinca21trust.pdf.
9. Bainbridge, Lisanne (1983-11). Ironies of Automation. Automatica. https://doi.org/10.1016/0005-1098(83)90046-8 https://ckrybus.com/static/papers/Bainbridge_1983_Automatica.pdf.
10. Risko, Evan F.; Gilbert, Sam J. (2016-09). Cognitive Offloading. Trends in Cognitive Sciences. https://doi.org/10.1016/j.tics.2016.07.002 https://www.sciencedirect.com/science/article/abs/pii/S1364661316300985.
11. Gilbert, Sam J.; Boldt, Annika; Sachdeva, Chhavi; Scarampi, Chiara; et al. (2023-02). Outsourcing Memory to External Tools: A Review of “Intention Offloading.” Psychonomic Bulletin & Review. https://doi.org/10.3758/s13423-022-02139-4 https://pmc.ncbi.nlm.nih.gov/articles/PMC9971128/.
12. Sparrow, Betsy; Liu, Jenny; Wegner, Daniel M. (2011-08-05). Google Effects on Memory: Cognitive Consequences of Having Information at Our Fingertips. Science. https://doi.org/10.1126/science.1207745 https://dtg.sites.fas.harvard.edu/DANWEGNER/pub/Sparrow%20et%20al.%202011.pdf.
