Study finds heavy AI writing use makes essays more neutral and less personal
Researchers tested Claude, GPT and Gemini with 100 participants; bland, professional prose emerged as the default output of shared tools
A new study suggests that heavy use of large language models does more than polish grammar: it changes what people argue and how they sound while arguing it. According to NBC News, researchers asked 100 participants to write essays responding to the question of whether money leads to happiness, then compared outputs across different levels of AI reliance. Participants who generated more than 40% of their text with an LLM produced essays that were markedly more neutral—answering with a neutral stance 69% more often than those who used AI lightly or not at all.
The shift was not limited to tone. Heavy AI use pushed essays “away from anything that a human would have ever written,” one of the authors, University of Washington computer science professor Natasha Jaques, told NBC News. The outputs became less personal and more formal, with heavy users producing essays containing 50% fewer pronouns—an indicator of fewer anecdotes and fewer references to lived experience. After the experiment, heavy users also reported that their writing felt less creative and less in their own voice, even though their satisfaction with the final product was similar to that of other participants.
The study examined three widely used systems in 2025—Anthropic’s Claude 3.5 Haiku, OpenAI’s GPT-5 Mini, and Google’s Gemini 2.5 Flash. It also found that many participants resisted full delegation: in initial testing, about half refused to use an LLM at all or used it only to find information rather than generate text. That split matters because it highlights an emerging divide between institutions that can mandate tool usage and individuals who can opt out.
In workplaces and schools, “neutral” writing is often treated as a virtue: it reads as professional, avoids controversy, and is easier to approve. If the same few writing systems become the default drafting layer for emails, reports, grant applications, and student essays, then neutrality stops being a stylistic choice and becomes a compliance outcome. The study’s finding that AI-heavy users converge on similar rhetorical positions points to a second-order effect: standardization is not just about phrasing, but about narrowing the range of acceptable arguments.
Jaques, who is also a senior research scientist at Google DeepMind, said an “ideal” system would write what the human would have written and simply save time. The observed pattern was the opposite: the model produced a different essay, and the user accepted the trade.
The research has been peer-reviewed and accepted to an upcoming workshop at a leading AI conference. Its most concrete result is also its simplest: when participants outsourced a large share of their writing to the same tools, their texts started to sound—and think—alike.