Why does AI love writing about lighthouse keepers?

Why Does AI Love Writing About Lighthouse Keepers? – Unite.AI

Asked to ‘write a story’, ChatGPT and other leading language models appear to be avoiding copyright infringement by obsessive recourse to the same small and strange cast of lighthouse-keepers, fishermen and clockmakers.

A new study from Cornell University has found that leading language models seem to have a strange obsession with a very narrow selection of narrative elements when asked simply to ‘write a story’. After prompting four LLMs to write 20,000 stories, the researchers found that 88% of the stories featured at least one of 11 very specific tokens, in the category of ‘location’, ‘name’, or ‘profession’.

The occurrences of unlikely keywords, represented here in parts per million, obtained by the researchers’ analysis of 20,000 LLM-generated stories.
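For readers who want to run a similar tally on their own generations, here is a minimal Python sketch of the two measurements described above: the share of stories containing at least one keyword, and each keyword's frequency in parts per million. The keyword list is the 11 reported for the paper; the regex tokenizer, the function names and the two sample stories are illustrative assumptions, not the researchers' code.

```python
import re
from collections import Counter

# The 11 keywords reported for the study: three names, seven professions, one location.
KEYWORDS = {
    "elias", "mara", "elara",
    "keeper", "baker", "mayor", "clockmaker",
    "fisherman", "librarian", "conductor",
    "lighthouse",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens.

    Assumption: a simple regex tokenizer; the paper's own method may differ.
    Matching is exact, so a plural such as "keepers" is not counted as "keeper".
    """
    return re.findall(r"[a-z]+", text.lower())


def keyword_stats(stories):
    """Return (share of stories with at least one keyword, per-keyword ppm)."""
    counts = Counter()
    total_words = 0
    stories_with_hit = 0
    for story in stories:
        tokens = tokenize(story)
        total_words += len(tokens)
        hits = [t for t in tokens if t in KEYWORDS]
        counts.update(hits)
        if hits:
            stories_with_hit += 1
    share = stories_with_hit / len(stories) if stories else 0.0
    # Parts per million: keyword occurrences per one million generated words.
    ppm = {}
    if total_words:
        ppm = {w: counts[w] * 1_000_000 / total_words for w in sorted(KEYWORDS)}
    return share, ppm


# Two made-up stories, purely for illustration (not data from the study).
SAMPLE = [
    "Elias kept the lighthouse at the edge of the bay.",
    "A girl found a small red kite in the field.",
]

share, ppm = keyword_stats(SAMPLE)
print(f"{share:.0%} of stories contain at least one keyword")
for word, value in ppm.items():
    if value:
        print(f"{word}: {value:,.0f} ppm")
```

On a sample this small the parts-per-million figures are meaningless; the measure only becomes informative across a large corpus, such as the 20,000 stories used in the study.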

The 11 most frequently recurring words in the more than 12 million words generated by LLMs for the study were the names Elias, Mara and Elara; the professions keeper, baker, mayor, clockmaker, fisherman, librarian, and conductor; and the location lighthouse.

The models tested were Claude Haiku 4.5, Gemini 3.1 Flash-Lite, GPT-5.4-Mini, and OLMo 7b Thinking. All were prompted with one of five requests: ‘Write a story’; ‘Please write a story’; ‘Write me a story’; ‘Tell me a story’; or ‘Please tell a story’.

Curious to see if the syndrome the paper identifies is present in models available at the time of writing, I tried out the experiment myself, first on my customary medium-tier ChatGPT account (link to conversation here). No cherry-picking was necessary – ChatGPT-5.5 went straight for the material the researchers predicted, on the first try.

ChatGPT-5.5 immediately backs up the paper’s initial findings.

Wondering whether chat history, or even possible cross-domain leakage, might account for this ‘instant hit’, I logged into a free ChatGPT account that I have not used in a year or more, in a Firefox private browsing window, and tried again (link to conversation here). Once again (assuming that OpenAI does not use a common IP address to cross-populate different accounts), ChatGPT hit it out of the park.

ChatGPT account #2 follows the same obsessions and tiny playbook of names and themes outlined in the new paper. ‘Mira’ is in the authors’ top 20.

It’s worth noting that these GPT versions were a grade up from the 5.4 tested for the paper.

Though Claude Haiku was the model tested for the paper, I tried Anthropic’s default Sonnet 4.6, and was not disappointed. Once again, the familiar keywords came on the first try (link to conversation here).

This time ‘Mara’, another stalwart from the ‘top 11’, leads the story, in the first attempt on Claude Sonnet 4.6.

Trying the same prompt on Claude Haiku 4.5 led to pretty much the same result.

I was initially unable to reproduce the authors’ findings with Google Gemini, until I specifically changed the model to the one used in the paper, Gemini 3.1 Flash-Lite. On that third try (but the first with that model), the pattern emerged immediately (link here).

Google Gemini 3.1 Flash-Lite.

Further experiments with different Gemini models invariably turned up the lighthouse theme, though with variants not featured in the ‘top 11’, such as the name ‘Thomas’, and, in another variant, my own name, as the protagonist.

Nonetheless, at the time of writing, the paper’s findings are extremely easy to reproduce.

Lighthouses in the Wild

Great minds think alike: a week ago, prior to the publication of the new paper, software writer Daniel May pointed out the coincidence of the Elias and lighthouse-keeper trope extracted by the researchers*, apparently having noticed it at random. He went on to test eight variants of Gemini, DeepSeek, Qwen and Gemma, which he found would produce the lighthouse memes and ‘Elias Thorne’ as a protagonist*. However, this initial discovery did not extend to the wider range of persistent content themes outlined in the new paper.

Curious to see if these recurrent themes, names and locations had ever escaped the confines of a chat, I searched for some of the top 11 keywords and themes on Google, and found a remarkable number of posts that seem to have channeled them.

Three examples of the meme in output. See below for source links.

May had identified the longer ‘Elias Thorne’ (rather than just ‘Elias’) as a persistent LLM meme, and posted various screenshots from Amazon, where this name has apparently been used as the author name on diverse books, including medical books.

Instead, I sought and found content that appeared to have invoked the persistent themes from an LLM, including an X post of a story (archive version here); a fictional work...
