Critical Views on LLMs, Another Academic Reading List

Misaligned

Journal and magazine on ethics in AI, technology, data and science.

Seven studies that investigate bias and the impact of LLMs on disempowered and vulnerable users.

Wolfgang Hauptfleisch

Earlier this year we looked at studies generally critical of LLMs. This time we focus on studies that examine bias in LLMs and AI assistants, and how this bias and stereotyping plays out when AI assistants are used as a trusted source of knowledge.

AI bias and stereotyping of social groups

LLMs show serious bias against speakers of local dialects, a study has found, with potentially serious implications for the use of AI in recruiting. The models were given texts written in German dialects, asked to describe the speakers with personal attributes, and then asked to make decisions about individuals in different scenarios. For example, the models were asked who should be hired for low-education work or where they thought the speakers lived. In nearly all tests, the models attached stereotypes to dialect speakers, describing them as uneducated, as farm workers, and as needing anger management. Worse, the bias grew when the LLMs were told explicitly that the text was a dialect. (➚DW)

The authors summarize their findings: “(1) in the association task, all evaluated LLMs exhibit significant dialect naming and dialect usage bias against German dialect speakers, reflected in negative adjective associations; (2) all models reproduce these dialect naming and dialect usage biases in their decision-making; and (3) contrary to prior work showing minimal bias with explicit demographic mentions, we find that explicitly labeling linguistic demographics — German dialect speakers — amplifies bias more than implicit cues like dialect usage.”

Minh Duc Bui, Carolin Holtermann, Valentin Hofmann, Anne Lauscher, and Katharina von der Wense. 2025. Large Language Models Discriminate Against Speakers of German Dialects. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 8223–8251, Suzhou, China. Association for Computational Linguistics.

Patterns of disempowerment

A study by Sharma et al. is the “first large-scale empirical analysis of disempowerment patterns in real-world AI assistant interaction”. The study finds “that severe forms of disempowerment potential occur in fewer than one in a thousand conversations”. However, it also stresses that “qualitatively, we uncover several concerning patterns, such as validation of persecution narratives and grandiose identities with emphatic sycophantic language, definitive moral judgments about third parties, and complete scripting of value-laden personal communications that users appear to implement verbatim”.

Sharma, M., McCain, M., Douglas, R., & Duvenaud, D. (2026). Who’s in Charge? Disempowerment Patterns in Real-World LLM Usage (Version 1). arXiv.

LLMs perform worse for vulnerable users

Research from MIT’s Center for Constructive Communication (CCC) suggests LLMs may actually perform worse for the very users who could most benefit from them (➚MIT):

“Our findings suggest that undesirable behaviours in state-of-the-art LLMs occur disproportionately more for users with lower English proficiency, of lower education status, and originating from outside the US, rendering these models unreliable sources of information towards their most vulnerable users.”

Poole-Dayan, E., Roy, D., & Kabbara, J. (2024). LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users (Version 2). arXiv.

Cultural value drift

In “The ghost in the machine speaks with an American accent: cultural value drift in early GPT-3 and the case for pluralist evaluation of generative AI”, published by Springer, researchers looked at early LLMs such as GPT-3 to “document recurring value drift” and “argue that these early behaviours […] provide a baseline for understanding how training distributions shape normative framing.” They conclude that “generative AI will never be value-neutral”.

Johnson, R., Dias Duran, L.D., Panai, E. et al. The ghost in the machine speaks with an American accent: cultural value drift in early GPT-3 and the case for pluralist evaluation of generative AI. AI Ethics 6, 212 (2026).

Mental models

Shalaleh Rismani et al. look at how users’ mental models shape their use of AI-based writing assistants. Participants were primed with different system descriptions to induce these mental models and were then asked to complete a cover letter writing task. The study finds that “while participants in the structural mental model condition demonstrate a better...
