Bonum Certa Men Certa

The LLM Ouroboros Phenomenon

posted by Roy Schestowitz on May 19, 2025,
updated May 19, 2025

An ouroboros in a 1478 drawing in an alchemical tract

Ancient Greek mythology came up with this concept of an ouroboros, wherein some animal - typically a snake for a feasible "IRL" (in real life) metaphor - eats itself by eating its own tail. We would not be the first to point out the analogy here for LLMs because an ouroboros is a good parable. This morning we catalogued two BSD and Linux sites complaining about desperate LLM scrapers staging a DDoS attack in pursuit of original, as in human-written, code or words. This isn't a new problem for us and in the past few days we served about half a million pages in Gemini Protocol, likely due to to LLM scrapers. It's obnoxious to say the least, but distinguishing benign from malicious (or worthless junk) requests is hard and a "moving target" (it's never enough as parasites learn to adapt).

This morning in IRC we made an assertion about LLMs and fake (slop) images. We also made several observations. Fact #1: over time slop gets worse (training set is like some blurry JPEG). Fact #2: People's "smell" for slop improves over time, as they 'train' on slop and can detect it based on prior encounters. Put 1 and 2 together.

Are LLMs bound to not only get worse but also more easily detectable by an increasingly sceptical general public? TheLayoff.com has just responded to this.

An associate opines that fact #1 (that slop gets worse over time) is exacerbated by the flood of slop on the Net being snarfed up by newer bots and mistaken for training data. "Thus the feedback loop I mentioned a long time back and which Andy wrote about in depth." (He was referring to Dr. Farnell's good writings about this dilemma - as he did several times in The CyberShow's blog)

To a certain extent my Ph.D. thesis (dissertation) covered this about two decades ago. The associate says that it's a "well-known problem from days of old".

There are several unique aspects to this, including validation bias. To me it seemed a bit related to but not the same as over-training because, as an associate explains, "overtraining is something else: too much data and the patterns become locked too tightly to the training set and less useful for new data".

For an LLM to scan online its own output serves to affirm the mistakes, or the errors, often euphemised as mere "hallucinations", which are innocent, not libellous, and by no means "intentional" and "harmful". Dr. Farnell and Dr. Kate Brown responded to this last October in "Radical disbelief and its causes".

In the context of my thesis (dissertation), a concern was raised about what we back then called "synthetic data" finding its way "back" into the training set. So when you check brain MRI scans (which is what we did back then) you must ensure you only ever deal with real data, not mock or manipulated data that can confirm your own biases and "fit into" the model that generated it in the first place (in generative mode). To use the analogy of text-based LLMs, your BS is "truth" if your input is your own BS (output/s) and it would be deemed accurate, based on you (opposite of the notion of peer review in science). The associate correctly points out, based on a scan of my thesis (dissertation), that the strings "overtraining" and "over-training" are not in the dissertation, but we used different terms back then.

A squat toilet (also known as an Eastern, Turkish, Iranian or Natural-Position toilet). This one is in Turkey

"An LLM Ouroboros of shit", as the associate dubs it, would be statistical models (such as PDMs or AAMs*) treating computer-generated images as something from "the real world".

The so-called "generative hey hi" (genAI) "bros" won't allow the media to talk about such issues, at least if they can downplay the issues and deny/misportray them (in the media). But it's a real and growing problem. Its magnitude likely grows quadratically, not linearly. Just like other bubbles (overabundance based around hype), don't expect linear implosions. When it's gone (poof!), it's gone.

____

* PDM and AAM need expansion in the explanatory sense, not just words (in the acronyms). PDMs go back several decades ago they were invented or pioneered by the people who tutored me. They use mathematical, statistical models to perform multidimensional analysis of data variations, based upon principal component analysis (PCA). AAMs are an extension but with textures, not only points. This is really old stuff; even AAMs are over 23 years old; now the mainstream media pretends those are some kind of "revolution".

Other Recent Techrights' Posts

European Patent Office (EPO) Series: The Centre (in Portugal) Falls Apart…
Luís Montenegro became embroiled in a conflict-of-interest controversy
Links 10/06/2026: More Microsoft Layoffs, Sweden to "Ban Mobile Phones in Schools"
Links for the day
 
SLAPP Censorship - Part 103 Out of 200: Telling People What They Know and Don't Know About Death Threats They Receive
patronising letters sent on behalf of the Serial Strangler from Microsoft
IBM Genies in the Bottle
for ordinary people working who at at IBM, it's not hard to see that IBM is floundering
Over at Tux Machines...
GNU/Linux news for the past day
IRC Proceedings: Wednesday, June 10, 2026
IRC logs for Wednesday, June 10, 2026
Links 11/06/2026: LF Openwashing of Slop and "Azerbaijan Bans TikTok and Other Social Media Apps in School"
Links for the day
IBM Lost About 18% of Its "Market Value" This Month
In IBM's case, a lot of the latest "pump" was Arvind's "quantum" hype/fantasy
Gemini Links 10/06/2026: Signal to Noise, Cancer, and Permacomputing
Links for the day
Communities and "Prosumers."
today's meetup will be about community
Gemini and Gopher Links 10/06/2026: Roasting, Changes, and Harms of Slop
Links for the day
Microsoft Azure Shrinking With More Mass Layoffs
"Reports suggest the layoffs will impact close to 200 out of 400 workers, who are set to cease employment at Azure on July 6"
Over at Tux Machines...
GNU/Linux news for the past day
IRC Proceedings: Tuesday, June 09, 2026
IRC logs for Tuesday, June 09, 2026
European Patent Office (EPO) Series: The Centre-Right "Social Democratic Party" in Portugal
Quite an achievement for a former Maoist radical and aspiring champion of the Portuguese proletariat to be invited to join Goldman Sachs
SLAPP Censorship - Part 102 Out of 200: Maybe One Day Whistleblowers From Brett Wilson LLP Will Tell Us What Really Happened
Maybe one day some former staff of Brett Wilson LLP will also approach us to blow the whistle
What LibreOffice and TDF Get Right About Document Formats (and What They Get Wrong)
OOXML is a phantom - it is something nobody implements, not even Microsoft!
Gemini Links 09/06/2026: "The Mist of the Lands Between", Board Game Concept
Links for the day
2026: The Year Slop Companies "Made an Exit" (Threw in the Towel Over to Wall Street)
Remember 2026 as the year two major slop companies (which we won't name) sought an IPO
Links 09/06/2026: NSO Group still cracking, "FOI tribunal throws out £14k costs claim against journalist Barnie Choudhury"
Links for the day
Links 09/06/2026: "Smartphones Broke Dating" and "EU Open Source Strategy"
Links for the day
Cannot Speak About IBM Wrongdoing or Jobs Being Sent Overseas (Lower Salaries)
IBM has long attacked the media, the whistleblowers, and even online forums
European Patent Office (EPO) Series: The CIA-Funded Centre-Left in Portugal
In the political turmoil which followed the fall of the old regime, the communists seemed to be acquiring a dominant position and there was a very real risk that Portugal could end up aligned with the Eastern Bloc if they were not stopped
This Coming Friday
Richard Stallman (RMS)
Yesterday Afternoon The Register MS Published a Fake Article That Says "AI" 31 Times Because It Got Paid to Do This
What will happen when all those loans for slop (Ponzi scheme) stop and companies' marketing budgets - which include media bribes for hype campaigns - are no more?
Extraordinary General Meeting of Staff Union of the European Patent Office Ahead of Intensifying Strikes
We will, in the meantime, run a series about EPO corruption, which is now connected to corruption in Portugal and to corruption inside the EU
Several Slopfarms That Target "Linux" Seem to Have Died
Or perished severely
Over at Tux Machines...
GNU/Linux news for the past day
IRC Proceedings: Monday, June 08, 2026
IRC logs for Monday, June 08, 2026
Gemini Links 09/06/2026: Tanana River, Cassette Beasts, and Emacs
Links for the day