Editor’s summary
The sycophantic (flattering, people-pleasing, affirming) behavior of artificial intelligence (AI) chatbots, designed to increase user engagement, poses risks as people increasingly seek advice about interpersonal dilemmas. There is usually more than one side to a story in interpersonal conflicts. If AI is designed to tell users what they want to hear instead of challenging their perspectives, are such systems likely to motivate people to accept responsibility for their own contribution to conflicts and to repair relationships? Cheng et al. measured the prevalence of social sycophancy across 11 leading large language models (see the Perspective by Perry). The models’ responses were nearly 50% more sycophantic than humans’, even when users engaged in unethical, illegal, or harmful behaviors. Users preferred and trusted sycophantic AI responses, incentivizing AI developers to preserve sycophancy despite the risks. —Ekeoma Uzogara
Structured Abstract
INTRODUCTION
As artificial intelligence (AI) systems are increasingly used for everyday advice and guidance, concerns have emerged about sycophancy: the tendency of AI systems built on large language models to excessively agree with, flatter, or validate users. Although prior work has shown that sycophancy carries risks for groups who are already vulnerable to manipulation or delusion, sycophancy’s effects on the general population’s judgments and behaviors remain unknown. Here, we show that sycophancy is widespread in leading AI systems and has harmful effects on users’ social judgments.
RATIONALE
High-profile incidents have linked sycophancy to psychological harms such as delusions, self-harm, and suicide. Beyond these cases, research in social and moral psychology suggests that unwarranted affirmation can produce subtler but still consequential effects: reinforcing maladaptive beliefs, reducing responsibility-taking, and discouraging behavioral repair after wrongdoing. We hypothesized that AI models excessively affirm users even when doing so is socially or morally inappropriate and that such responses negatively influence users’ beliefs and intentions. To test this, we conducted two complementary sets of studies. First, we measured the prevalence of sycophancy across 11 leading AI models using three datasets spanning a variety of use contexts, including everyday advice queries, moral transgressions, and explicitly harmful scenarios. Second, we conducted three preregistered experiments with 2405 participants to understand how sycophancy influences users’ judgments, behavioral intentions, and perceptions of AI. Participants interacted with AI systems in vignette-based settings and in a live-chat interaction in which they discussed a real past conflict from their own lives. We also tested whether effects varied by response style or by perceived response source (AI versus human).
RESULTS
We find that sycophancy is both prevalent and harmful. Across 11 AI models, AI affirmed users’ actions 49% more often than humans did on average, including in cases involving deception, illegality, or other harms. On posts from r/AmITheAsshole, AI systems affirmed users in 51% of cases in which the human consensus did not affirm them at all. In our human experiments, even a single interaction with sycophantic AI reduced participants’ willingness to take responsibility and repair interpersonal conflicts while increasing their conviction that they were right. Yet despite distorting judgment, sycophantic models were trusted and preferred. All of these effects persisted when controlling for individual traits (such as demographics and prior familiarity with AI), perceived response source, and response style. This creates perverse incentives for sycophancy to persist: The very feature that causes harm also drives engagement.
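To make the headline comparison concrete, the sketch below shows how a relative affirmation rate such as “49% more often than humans” can be computed from binary judgments of whether each response affirms the user’s actions. This is an illustrative example only, not the authors’ analysis code, and all labels and numbers in it are hypothetical.

```python
# Illustrative sketch (hypothetical data): computing a relative affirmation rate
# from binary labels, where 1 = the response affirms the user's action, 0 = it does not.

def affirmation_rate(labels):
    """Fraction of responses labeled as affirming the user's actions."""
    return sum(labels) / len(labels)

# Hypothetical judged labels for one AI model and for crowdsourced human responses.
ai_labels = [1, 1, 1, 0, 1, 1, 0, 1, 1, 0]
human_labels = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]

ai_rate = affirmation_rate(ai_labels)        # 0.7 in this toy example
human_rate = affirmation_rate(human_labels)  # 0.4 in this toy example
relative_increase = (ai_rate - human_rate) / human_rate

print(f"AI affirms {relative_increase:.0%} more often than humans")  # prints "75%" here
```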
CONCLUSION
AI sycophancy is not merely a stylistic issue or a niche risk, but a prevalent behavior with broad downstream consequences. Although affirmation may feel supportive, sycophancy can undermine users’ capacity for self-correction and responsible decision-making. Yet because it is preferred by users and drives engagement, there has been little incentive for sycophancy to diminish. Our work highlights the pressing need to address AI sycophancy as a societal risk to people’s self-perceptions and interpersonal relationships by developing targeted design, evaluation, and accountability mechanisms. Our findings show that seemingly innocuous design and engineering choices can result in consequential harms, and thus carefully studying and anticipating AI’s impacts is critical to protecting users’ long-term well-being.

Sycophancy in AI responses is pervasive and alters people’s behavioral inclinations.
(Left) On personal advice queries, AI models affirm users’ actions 49% more often than crowdsourced human respondents do. (Right) In experiments in which participants discussed real interpersonal conflicts, sycophantic AI increased participants’ conviction that they were right and their desire to keep using the model while reducing their willingness to repair the conflict.
Abstract
Despite rising concerns about sycophancy—excessive agreement or flattery from artificial intelligence (AI) systems—little is known about its prevalence or consequences. We show that sycophancy is widespread and harmful. Across 11 state-of-the-art models, AI affirmed users’ actions 49% more often than humans, even when queries involved deception, illegality, or other harms. In three preregistered experiments (N = 2405), even a single interaction with sycophantic AI reduced participants’ willingness to take responsibility and repair interpersonal conflicts, while increasing their conviction that they were right. Despite distorting judgment, sycophantic models were trusted and preferred. This creates perverse incentives for sycophancy to persist: The very feature that causes harm also drives engagement. Our findings underscore the need for design, evaluation, and accountability mechanisms to protect user well-being.
Supplementary Materials
Correction (10 April 2026): The summary figure has been replaced to correct the direction of an arrow in the middle panel of the right column. In addition, a duplicate reference has been removed, and references and their citations in the main text and supplementary materials have been renumbered accordingly.
References and Notes
1
M. Sharma, M. Tong, T. Korbak, D. Duvenaud, A. Askell, S. R. Bowman, E. Durmus, Z. Hatfield-Dodds, S. R. Johnston, S. M. Kravec, T. Maxwell, S. McCandlish, K. Ndousse, O. Rausch, N. Schiefer, D. Yan, M. Zhang, E. Perez, “Towards understanding sycophancy in language models” in The Twelfth International Conference on Learning Representations (2024); https://openreview.net/forum?id=tvhaxkMKAn.
3
K. Hill, “They asked an AI chatbot questions. The answers sent them spiraling,” The New York Times, 13 June 2025.
4
Emotional risks of AI companions demand attention. Nat. Mach. Intell. 7, 981–982 (2025).
5
J. Moore, D. Grabb, W. Agnew, K. Klyman, S. Chancellor, D. C. Ong, N. Haber, “Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers” in Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (2025), pp. 599–627.
8
M. Robb, S. Mann, Talk, Trust, and Trade-offs: How and Why Teens Use AI Companions (Common Sense Media, 2025).
10
E. L. Uhlmann, G. L. Cohen, “I think it, therefore it’s true”: Effects of self-perceived objectivity on hiring discrimination. Organ. Behav. Hum. Decis. Process. 104, 207–223 (2007).
11
B. Monin, D. T. Miller, Moral credentials and the expression of prejudice. J. Pers. Soc. Psychol. 81, 33–43 (2001).
12
L. Ranaldi, G. Pucci, When large language models contradict humans? Large language models’ sycophantic behaviour. arXiv:2311.09410 [cs.CL] (2024).
13
J. Wei, D. Huang, Y. Lu, D. Zhou, Q. V. Le, Simple synthetic data reduces sycophancy in large language models. arXiv:2308.03958 [cs.CL] (2023).
14
E. Perez, S. Ringer, K. Lukosiute, K. Nguyen, E. Chen, S. Heiner, C. Pettit, C. Olsson, S. Kundu, S. Kadavath, A. Jones, A. Chen, B. Mann, B. Israel, B. Seethor, C. McKinnon, C. Olah, D. Yan, D. Amodei, D. Amodei, D. Drain, D. Li, E. Tran-Johnson, G. Khundadze, J. Kernion, J. Landis, J. Kerr, J. Mueller, J. Hyun, J. Landau, K. Ndousse, L. Goldberg, L. Lovitt, M. Lucas, M. Sellitto, M. Zhang, N. Kingsland, N. Elhage, N. Joseph, N. Mercado, N. DasSarma, O. Rausch, R. Larson, S. McCandlish, S. Johnston, S. Kravec, S. El Showk, T. Lanham, T. Telleen-Lawton, T. Brown, T. Henighan, T. Hume, Y. Bai, Z. Hatfield-Dodds, J. Clark, S. R. Bowman, A. Askell, R. Grosse, D. Hernandez, D. Ganguli, E. Hubinger, N. Schiefer, J. Kaplan, “Discovering language model behaviors with model-written evaluations” in Findings of the Association for Computational Linguistics: ACL 2023, A. Rogers, J. Boyd-Graber, N. Okazaki, Eds. (Association for Computational Linguistics, 2023), pp. 13387–13434; https://aclanthology.org/2023.findings-acl.847/.
15
A. Rrv, N. Tyagi, M. N. Uddin, N. Varshney, C. Baral, “Chaos with keywords: Exposing large language models sycophancy to misleading keywords and evaluating defense strategies” in Findings of the Association for Computational Linguistics: ACL 2024, L.-W. Ku, A. Martins, V. Srikumar, Eds. (Association for Computational Linguistics, 2024), pp. 12717–12733; https://aclanthology.org/2024.findings-acl.755/.
16
L. Malmqvist, “Sycophancy in large language models: Causes and mitigations” in Intelligent Computing. Proceedings of the Computing Conference (Springer Nature Switzerland, 2025), pp. 61–74.
17
A. Fanous, J. Goldberg, A. A. Agarwal, J. Lin, A. Zhou, R. Daneshjou, S. Koyejo, “SycEval: Evaluating LLM sycophancy” in Proceedings of the Eighth AAAI/ACM Conference on AI, Ethics, and Society (2025), vol. 8, no. 1.
18
Materials and methods are available as supplementary materials.
19
T. H. Costello, G. Pennycook, D. G. Rand, Durably reducing conspiracy beliefs through dialogues with AI. Science 385, eadq1814 (2024).
20
I. O. Gallegos, C. Shani, W. Shi, F. Bianchi, I. Gainsburg, D. Jurafsky, R. Willer, Labeling messages as AI-generated does not reduce their persuasive effects. PNAS Nexus 5, pgag008 (2026).
21
M. Cohn, M. Pushkarna, G. O. Olanubi, J. M. Moran, D. Padgett, Z. Mengesha, C. Heldreth, “Believing anthropomorphism: Examining the role of anthropomorphic cues on trust in large language models” in Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (2024), pp. 1–15.
22
N. Inie, S. Druga, P. Zukerman, E. M. Bender, “From ‘AI’ to probabilistic automation: How does anthropomorphization of technical systems descriptions influence trust?” in Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (2024), pp. 2322–2347.
23
S. Kapania, O. Siy, G. Clapper, A. M. Sp, N. Sambasivan, “‘Because AI is 100% right and safe’: User attitudes and sources of AI authority in India” in Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (2022), pp. 1–18.
24
M. Glickman, T. Sharot, How human-AI feedback loops alter human perceptual, emotional and social judgements. Nat. Hum. Behav. 9, 345–359 (2025).
25
A. C. Hafenbrack, M. L. LaPalme, I. Solal, Mindfulness meditation reduces guilt and prosocial reparation. J. Pers. Soc. Psychol. 123, 28–54 (2022).
26
M. E. Oswald, S. Grosjean, “Confirmation bias” in Cognitive Illusions: A Handbook on Fallacies and Biases in Thinking, Judgement and Memory, R. F. Pohl, Ed. (Psychology Press, 2004), chap. 4, pp. 79–94.
27
G. Loewenstein, A. Molnar, The renaissance of belief-based utility in economics. Nat. Hum. Behav. 2, 166–167 (2018).
28
T. R. Tyler, The relationship of the outcome and procedural fairness: How does knowing the outcome influence judgments about the procedure? Soc. Justice Res. 9, 311–325 (1996).
29
R. Wang, F. M. Harper, H. Zhu, “Factors influencing perceived fairness in algorithmic decision-making: Algorithm outcomes, development procedures, and individual differences” in Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (2020), pp. 1–14.
30
B. F. Malle, D. Ullman, “A multidimensional conception and measure of human-robot trust” in Trust in Human-Robot Interaction, C. S. Nam, J. B. Lyons, Eds. (Elsevier, 2021), pp. 3–25.
31
P. Khadpe, R. Krishna, L. Fei-Fei, J. T. Hancock, M. S. Bernstein, Conceptual metaphors impact perceptions of human-AI collaboration. Proc. ACM Hum. Comput. Interact. 4 (CSCW2), 1–26 (2020).
32
K. Zhou, J. D. Hwang, X. Ren, N. Dziri, D. Jurafsky, M. Sap, “REL-A.I.: An interaction-centered approach to measuring human-LM reliance” in Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), L. Chiruzzo, A. Ritter, L. Wang, Eds. (Association for Computational Linguistics, Albuquerque, New Mexico, 2025), pp. 11148–11167; https://aclanthology.org/2025.naacl-long.556/.
33
S. S. Kim, Q. V. Liao, M. Vorvoreanu, S. Ballard, J. W. Vaughan, “‘I’m not sure, but...’: Examining the impact of large language models’ uncertainty expression on user reliance and trust” in Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (2024), pp. 822–835.
34
L. Weidinger, “Taxonomy of risks posed by language models” in Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (Association for Computing Machinery, 2022), pp. 214–229.
35
G. Abercrombie, A. Cercas Curry, T. Dinkar, V. Rieser, Z. Talat, “Mirages. On anthropomorphism in dialogue systems” in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, H. Bouamor, J. Pino, K. Bali, Eds. (Association for Computational Linguistics, 2023), pp. 4776–4790; https://aclanthology.org/2023.emnlp-main.290/.
36
M. Rubin, J. Z. Li, F. Zimmerman, D. C. Ong, A. Goldenberg, A. Perry, Comparing the value of perceived human versus AI-generated empathy. Nat. Hum. Behav. 9, 2345–2359 (2025).
37
Z. Aydin, B. F. Malle, “Dissociated responses to AI: Persuasive but not trustworthy?” in Proceedings of the Annual Meeting of the Cognitive Science Society (2024), vol. 46.
38
Y. Zhang, D. Zhao, J. T. Hancock, R. Kraut, D. Yang, The rise of AI companions: How human-chatbot relationships influence well-being. arXiv:2506.12605 [cs.HC] (2025).
39
X. Ge, C. Xu, D. Misaki, H. R. Markus, J. L. Tsai, “How culture shapes what people want from AI” in Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (2024), pp. 1–15.
40
Y. Bai, A. Jones, K. Ndousse, A. Askell, A. Chen, N. DasSarma, D. Drain, S. Fort, D. Ganguli, T. Henighan, N. Joseph, S. Kadavath, J. Kernion, T. Conerly, S. El-Showk, N. Elhage, Z. Hatfield-Dodds, D. Hernandez, T. Hume, S. Johnston, S. Kravec, L. Lovitt, N. Nanda, C. Olsson, D. Amodei, T. Brown, J. Clark, S. McCandlish, C. Olah, B. Mann, J. Kaplan, Training a helpful and harmless assistant with reinforcement learning from human feedback. arXiv:2204.05862 [cs.CL] (2022).
41
H. R. Kirk, A. Whitefield, P. Röttger, A. Bean, K. Margatina, J. Ciro, R. Mosquera, M. Bartolo, A. Williams, H. He, B. Vidgen, S. A. Hale, “The PRISM alignment dataset: What participatory, representative and individualised human feedback reveals about the subjective and multicultural alignment of large language models” in Advances in Neural Information Processing Systems (Curran Associates, 2024), vol. 37, pp. 105236–105344.
42
T. Maeda, A. Quan-Haase, “When human-AI interactions become parasocial: Agency and anthropomorphism in affective design” in Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (2024), pp. 1068–1077.
44
M. Cheng, A. Y. Lee, K. Rapuano, K. Niederhoffer, A. Liebscher, J. Hancock, Metaphors of AI indicate that people increasingly perceive AI as warm and human-like. Commun. Psychol. 4, 8 (2026).
45
L. R. Quintanar, The Interactive Computer as a Social Stimulus in Computer-Managed Instruction: A Theoretical and Empirical Analysis of the Social Psychological Processes Evoked During Human-Computer Interaction (University of Notre Dame, 1982).
46
I. Carnat, Human, all too human: Accounting for automation bias in generative large language models. Int. Data Priv. Law 14, 299–314 (2024).
47
I. D. Raji, I. E. Kumar, A. Horowitz, A. Selbst, “The fallacy of AI functionality” in Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (2022), pp. 959–972.
48
C. Shah, E. M. Bender, “Situating search” in Proceedings of the 2022 Conference on Human Information Interaction and Retrieval (2022), pp. 221–232.
49
I. Yaniv, Receiving other people’s advice: Influence and benefit. Organ. Behav. Hum. Decis. Process. 93, 1–13 (2004).
50
L. Van Swol, J. E. Paik, A. Prahl, “The psychology of advice utilization” in The Oxford Handbook of Advice, E. L. MacGeorge, L. M. Van Swol, Eds. (Oxford Academic, 2018), pp. 21–42.
52
T. Zhi-Xuan, M. Carroll, M. Franklin, H. Ashton, Beyond preferences in AI alignment. Philos. Stud. 182, 1813–1863 (2025).
53
H. R. Kirk, I. Gabriel, C. Summerfield, B. Vidgen, S. A. Hale, Why human–AI relationships need socioaffective alignment. Humanit. Soc. Sci. Commun. 12, 1–9 (2025).
54
K. Lum, J. R. Anthis, K. Robinson, C. Nagpal, A. N. D’Amour, “Bias in language models: Beyond trick tests and towards RUTEd evaluation” in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), W. Che, J. Nabende, E. Shutova, M. T. Pilehvar, Eds. (Association for Computational Linguistics, Vienna, Austria, 2025), pp. 137–161; https://aclanthology.org/2025.acl-long.7/.
55
M. Mizrahi, G. Kaplan, D. Malkin, R. Dror, D. Shahaf, G. Stanovsky, State of what art? A call for multi-prompt LLM evaluation. Trans. Assoc. Comput. Linguist. 12, 933–949 (2024).
56
Y. Chang, X. Wang, J. Wang, Y. Wu, L. Yang, K. Zhu, H. Chen, X. Yi, C. Wang, Y. Wang, W. Ye, Y. Zhang, Y. Chang, P. S. Yu, Q. Yang, X. Xie, A survey on evaluation of large language models. ACM Trans. Intell. Syst. Technol. 15, 1–45 (2024).
57
S. Lewandowsky, S. van der Linden, Countering misinformation and fake news through inoculation and prebunking. Eur. Rev. Soc. Psychol. 32, 348–384 (2021).
58
C. S. Traberg, J. Roozenbeek, S. van der Linden, Psychological inoculation against misinformation: Current evidence and future directions. Ann. Am. Acad. Pol. Soc. Sci. 700, 136–151 (2022).
59
J. Roozenbeek, S. van der Linden, B. Goldberg, S. Rathje, S. Lewandowsky, Psychological inoculation improves resilience against misinformation on social media. Sci. Adv. 8, eabo6254 (2022).
60
L. Munn, Angry by design: Toxic communication and technical architectures. Humanit. Soc. Sci. Commun. 7, 53 (2020).
61
S. Rathje, J. J. Van Bavel, S. van der Linden, Out-group animosity drives engagement on social media. Proc. Natl. Acad. Sci. U.S.A. 118, e2024292118 (2021).
62
H. Hou, K. Leach, Y. Huang, “ChatGPT giving relationship advice–how reliable is it?” in Proceedings of the International AAAI Conference on Web and Social Media (2024), vol. 18, pp. 610–623.
63
P. D. L. Howe, N. Fay, M. Saletta, E. Hovy, ChatGPT’s advice is perceived as better than that of professional advice columnists. Front. Psychol. 14, 1281255 (2023).
64
O. J. Kuosmanen, “Advice from humans and artificial intelligence: Can we distinguish them, and is one better than the other?” thesis, UiT Norges arktiske universitet (2024).
65
M. Kim, H. Lee, J. Park, H. Lee, K. Jung, “AdvisorQA: Towards helpful and harmless advice-seeking question answering with collective intelligence” in Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), L. Chiruzzo, A. Ritter, L. Wang, Eds. (Association for Computational Linguistics, Albuquerque, New Mexico, 2025), pp. 6545–6565; https://aclanthology.org/2025.naacl-long.333/.
66
N. Reimers, I. Gurevych, Sentence-BERT: Sentence embeddings using Siamese BERT-networks. arXiv:1908.10084 [cs.CL] (2019).
67
A. R. Vijjini, R. R. Menon, J. Fu, S. Srivastava, S. Chaturvedi, “SocialGaze: Improving the integration of human social norms in large language models” in Findings of the Association for Computational Linguistics: EMNLP 2024, Y. Al-Onaizan, M. Bansal, Y.-N. Chen, Eds. (Association for Computational Linguistics, 2024), pp. 16487–16506; https://aclanthology.org/2024.findings-emnlp.962/.
68
E. O’Brien, AITA for making this? A public dataset of Reddit posts about moral dilemmas — datachain.ai (2020).
69
B. Boe, The Python Reddit API Wrapper, GitHub repository (2016).
70
J. P. Chang, C. Chiam, L. Fu, A. Wang, J. Zhang, C. Danescu-Niculescu-Mizil, “ConvoKit: A toolkit for the analysis of conversations” in Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue, O. Pietquin, S. Muresan, V. Chen, C. Kennington, D. Vandyke, N. Dethlefs, K. Inoue, E. Ekstedt, S. Ultes, Eds. (Association for Computational Linguistics, 1st virtual meeting, 2020), pp. 57–60; https://aclanthology.org/2020.sigdial-1.8/.
71
M. Honnibal, I. Montani, spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing (2017).
72
L. Zheng, W.-L. Chiang, Y. Sheng, S. Zhuang, Z. Wu, Y. Zhuang, Z. Lin, Z. Li, D. Li, E. Xing, et al., Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. Adv. Neural Inf. Process. Syst. 36, 46595–46623 (2023).
73
Y. Dubois, C. X. Li, R. Taori, T. Zhang, I. Gulrajani, J. Ba, C. Guestrin, P. S. Liang, T. B. Hashimoto, AlpacaFarm: A simulation framework for methods that learn from human feedback. Adv. Neural Inf. Process. Syst. 36, 30039–30069 (2023).
74
F. Gilardi, M. Alizadeh, M. Kubli, ChatGPT outperforms crowd workers for text-annotation tasks. Proc. Natl. Acad. Sci. U.S.A. 120, e2305016120 (2023).
75
C. Ziems, W. Held, O. Shaikh, J. Chen, Z. Zhang, D. Yang, Can large language models transform computational social science? Comput. Linguist. 50, 237–291 (2024).
76
M. Cheng, K. Gligoric, T. Piccardi, D. Jurafsky, “AnthroScore: A computational linguistic measure of anthropomorphism” in Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), Y. Graham, M. Purver, Eds. (Association for Computational Linguistics, St. Julian’s, Malta, 2024), pp. 807–825; https://aclanthology.org/2024.eacl-long.49/.
77
Z. Su, X. Zhou, S. Rangreji, A. Kabra, J. Mendelsohn, F. Brahman, M. Sap, “AI-LieDar: Examine the trade-off between utility and truthfulness in LLM agents” in Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), L. Chiruzzo, A. Ritter, L. Wang, Eds. (Association for Computational Linguistics, Albuquerque, New Mexico, 2025), pp. 11867–11894; https://aclanthology.org/2025.naacl-long.595/.
78
A. S. Rao, A. Yerukola, V. Shah, K. Reinecke, M. Sap, “NormAd: A framework for measuring the cultural adaptability of large language models” in Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), L. Chiruzzo, A. Ritter, L. Wang, Eds. (Association for Computational Linguistics, Albuquerque, New Mexico, 2025), pp. 2373–2403; https://aclanthology.org/2025.naacl-long.120/.
79
A. Hurst, A. Lerer, A. P. Goucher, A. Perelman, A. Ramesh, A. Clark, A. Ostrow, A. Welihinda, A. Hayes, A. Radford, et al., GPT-4o system card. arXiv:2410.21276 [cs.CL] (2024).
80
Google DeepMind, Gemini 1.5 Flash (2024).
81
Anthropic, Claude 3.7 Sonnet system card (2025).
82
A. Grattafiori, A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, A. Mathur, A. Schelten, A. Vaughan, et al., The Llama 3 herd of models. arXiv:2407.21783 [cs.AI] (2024).
83
Meta, Meta llama-3-70B-instruct-turbo (2024).
84
Mistral, Mistral-7B-instruct-v0.3 (2023).
85
Mistral, Mistral-small-24B-instruct-2501 (2025).
86
A. Liu, B. Feng, B. Xue, B. Wang, B. Wu, C. Lu, C. Zhao, C. Deng, C. Zhang, C. Ruan, et al., DeepSeek-V3 technical report. arXiv:2412.19437 [cs.CL] (2024).
87
B. Hui, J. Yang, Z. Cui, J. Yang, D. Liu, L. Zhang, T. Liu, J. Zhang, B. Yu, K. Lu, K. Dang, Y. Fan, Y. Zhang, A. Yang, R. Men, F. Huang, B. Zheng, Y. Miao, S. Quan, Y. Feng, Xi. Ren, Xu. Ren, J. Zhou, J. Lin, Qwen2.5-Coder technical report. arXiv:2409.12186 [cs.CL] (2024).
88
B. Lickel, K. Kushlev, V. Savalei, S. Matta, T. Schmader, Shame and the motivation to change the self. Emotion 14, 1049–1061 (2014).
89
S. Grassini, Development and validation of the AI attitude scale (AIAS-4): A brief measure of general attitude toward artificial intelligence. Front. Psychol. 14, 1191628 (2023).
90
B. Rammstedt, O. P. John, Measuring personality in one minute or less: A 10-item short version of the Big Five Inventory in English and German. J. Res. Pers. 41, 203–212 (2007).
91
A. Garvey, S. J. Blanchard, Generative AI as a research confederate: The LUCID methodological framework and toolkit for human-AI interactions research, Georgetown McDonough School of Business Research Paper no. 5256150 (2025); https://doi.org/10.2139/ssrn.5256150.
92
G. Paolacci, J. Chandler, Inside the Turk: Understanding Mechanical Turk as a participant pool. Curr. Dir. Psychol. Sci. 23, 184–188 (2014).
93
M. Cheng, S. L. Blodgett, A. DeVrio, L. Egede, A. Olteanu, “Dehumanizing machines: Mitigating anthropomorphic behaviors in text generation systems” in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), W. Che, J. Nabende, E. Shutova, M. T. Pilehvar, Eds. (Association for Computational Linguistics, Vienna, Austria, 2025), pp. 25923–25948; https://aclanthology.org/2025.acl-long.1259/.
94
J. Baumgartner, S. Zannettou, B. Keegan, M. Squire, J. Blackburn, The Pushshift Reddit Dataset. arXiv:2001.08435 [cs.SI] (2020).
98
J. Butler, Excitable Speech: A Politics of the Performative (Routledge, 2021).
99
G. Jol, W. Stommel, The interactional costs of “neutrality” in police interviews with child witnesses. Res. Lang. Soc. Interact. 54, 299–318 (2021).