Bonum Certa Men Certa

The LLM Ouroboros Phenomenon

posted by Roy Schestowitz on May 19, 2025,
updated May 19, 2025

An ouroboros in a 1478 drawing in an alchemical tract

Ancient Greek mythology gave us the concept of an ouroboros, wherein some animal (typically a snake, for a feasible "IRL", or in real life, metaphor) eats itself by swallowing its own tail. We are not the first to point out that the ouroboros is a good parable for LLMs. This morning we catalogued two BSD and Linux sites complaining about desperate LLM scrapers staging a DDoS attack in pursuit of original, as in human-written, code or words. This isn't a new problem for us; in the past few days we served about half a million pages over Gemini Protocol, likely due to LLM scrapers. It's obnoxious, to say the least, but distinguishing benign requests from malicious ones (or worthless junk) is hard and a "moving target": no countermeasure is ever enough, as parasites learn to adapt.

This morning in IRC we made an assertion about LLMs and fake (slop) images. We also made several observations. Fact #1: over time slop gets worse (the training set is like some blurry JPEG). Fact #2: people's "smell" for slop improves over time, as they 'train' on slop and can detect it based on prior encounters. Put 1 and 2 together.

Are LLMs bound to not only get worse but also more easily detectable by an increasingly sceptical general public? TheLayoff.com has just responded to this.

An associate opines that fact #1 (that slop gets worse over time) is exacerbated by the flood of slop on the Net being snarfed up by newer bots and mistaken for training data. "Thus the feedback loop I mentioned a long time back and which Andy wrote about in depth." (He was referring to Dr. Farnell's good writings about this dilemma, as he did several times in The CyberShow's blog.)

To a certain extent my Ph.D. thesis (dissertation) covered this about two decades ago. The associate says that it's a "well-known problem from days of old".

There are several unique aspects to this, including validation bias. To me it seemed a bit related to but not the same as over-training because, as an associate explains, "overtraining is something else: too much data and the patterns become locked too tightly to the training set and less useful for new data".
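To make the associate's definition of overtraining concrete, here is a minimal sketch in Python. It is our own illustration, not something from the thesis or from IRC. It assumes a made-up ground truth (y = 2x plus Gaussian noise) and compares a "memoriser", which just returns the label of the nearest training point, with a plain least-squares line. All function names and parameters are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def nearest_neighbour(xs, ys, x):
    """Memoriser: return the label of the closest training point."""
    i = min(range(len(xs)), key=lambda k: abs(xs[k] - x))
    return ys[i]


def mse(predict, xs, ys):
    """Mean squared error of a predictor over a data set."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(rng, n, noise=1.0):
    # Assumed ground truth for illustration only: y = 2x plus noise.
    xs = [rng.random() for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def compare(seed=0, n_train=200, n_test=2000):
    """Train both predictors on one sample and score them on a fresh one."""
    rng = random.Random(seed)
    train_x, train_y = make_data(rng, n_train)
    test_x, test_y = make_data(rng, n_test)
    a, b = fit_line(train_x, train_y)

    def line(x):
        return a * x + b

    def memo(x):
        return nearest_neighbour(train_x, train_y, x)

    return {
        "memoriser_train": mse(memo, train_x, train_y),
        "memoriser_test": mse(memo, test_x, test_y),
        "line_train": mse(line, train_x, train_y),
        "line_test": mse(line, test_x, test_y),
    }


for name, error in compare().items():
    print(f"{name}: {error:.3f}")
```

The memoriser scores zero error on the data it has already seen, yet one should expect it to do worse than the humble line on fresh data, because it has locked onto the noise in its training set.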

When an LLM scans its own output online, it reaffirms its own mistakes, or errors, which are often euphemised as mere "hallucinations": supposedly innocent, not libellous, and by no means "intentional" or "harmful". Dr. Farnell and Dr. Kate Brown responded to this last October in "Radical disbelief and its causes".

In the context of my thesis (dissertation), a concern was raised about what we back then called "synthetic data" finding its way "back" into the training set. So when you check brain MRI scans (which is what we did back then) you must ensure you only ever deal with real data, not mock or manipulated data that can confirm your own biases and "fit into" the model that generated it in the first place (in generative mode). To use the analogy of text-based LLMs: if your input is your own BS (your earlier output), then your BS becomes "truth" and is deemed accurate on your own say-so, which is the opposite of peer review in science. The associate correctly points out, based on a scan of my thesis (dissertation), that the strings "overtraining" and "over-training" do not appear in it; we used different terms back then.
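The feedback loop can be sketched with a toy model. This assumes a plain Gaussian stands in for the "model", which is nothing like the statistical models in the thesis, let alone an LLM. Each generation fits a mean and a standard deviation to its training set, and the next generation trains only on samples drawn from that fit. The function name and its parameters are hypothetical.

```python
import random
import statistics


def self_consuming_loop(real_data, generations, sample_size, seed=0):
    """Repeatedly fit a Gaussian, sample from it, and refit on the samples only.

    Returns the fitted standard deviation for each generation, starting
    with the fit to the real data (generation 0).
    """
    rng = random.Random(seed)
    data = list(real_data)
    spreads = []
    for _ in range(generations + 1):
        mu = statistics.fmean(data)
        sigma = statistics.stdev(data)
        spreads.append(sigma)
        # The next "training set" is purely the model's own output.
        data = [rng.gauss(mu, sigma) for _ in range(sample_size)]
    return spreads


rng = random.Random(42)
real = [rng.gauss(0.0, 1.0) for _ in range(10)]  # stand-in for real measurements
spreads = self_consuming_loop(real, generations=500, sample_size=10)
print(f"generation 0: {spreads[0]:.3g}, generation 500: {spreads[-1]:.3g}")
```

With small sample sizes the fitted spread tends to shrink generation after generation, so the model grows ever more confident about an ever narrower slice of what the real data looked like.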

A squat toilet (also known as an Eastern, Turkish, Iranian or Natural-Position toilet). This one is in Turkey

"An LLM Ouroboros of shit", as the associate dubs it, would be statistical models (such as PDMs or AAMs*) treating computer-generated images as something from "the real world".

The so-called "generative hey hi" (genAI) "bros" won't allow the media to talk about such issues, or at least they downplay, deny, and misportray the issues in the media whenever they can. But it's a real and growing problem. Its magnitude likely grows quadratically, not linearly. As with other bubbles (overabundance based around hype), don't expect a linear implosion. When it's gone (poof!), it's gone.

____

* PDM (point distribution model) and AAM (active appearance model) need expansion in the explanatory sense, not just the words in the acronyms. PDMs go back several decades; they were invented or pioneered by the people who tutored me. They use mathematical, statistical models to perform multidimensional analysis of data variations, based upon principal component analysis (PCA). AAMs are an extension that models textures, not only points. This is really old stuff; even AAMs are over 23 years old, yet now the mainstream media pretends those are some kind of "revolution".
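As a rough illustration of the PCA step, here is a sketch of a point distribution model reduced to its bare bones. It assumes the shapes are already aligned (real PDMs normally align them first), stores each shape as a flat list of landmark coordinates, and extracts only the first mode of variation by power iteration rather than a full eigendecomposition. The function names are ours and hypothetical.

```python
import math


def mean_shape(shapes):
    """Average each landmark coordinate across all training shapes."""
    n = len(shapes)
    return [sum(col) / n for col in zip(*shapes)]


def covariance(shapes, mean):
    """Sample covariance matrix of the landmark coordinates."""
    n, d = len(shapes), len(mean)
    centred = [[x - m for x, m in zip(s, mean)] for s in shapes]
    return [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_mode(cov, iterations=200):
    """Power iteration: dominant eigenvector and eigenvalue of a symmetric matrix."""
    d = len(cov)
    # Deliberately asymmetric start vector to reduce the chance of starting
    # orthogonal to the dominant mode.
    v = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variation left along the current vector
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance explained by this mode.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue


def synthesise(mean, mode, b):
    """Generative mode: new shape = mean shape + b * mode of variation."""
    return [m + b * p for m, p in zip(mean, mode)]


# Three made-up shapes with two landmarks each, stored as (x1, y1, x2, y2).
shapes = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 1.2, 0.1], [0.0, 0.0, 0.8, -0.1]]
mean = mean_shape(shapes)
mode, variance = first_mode(covariance(shapes, mean))
print(synthesise(mean, mode, 2.0 * math.sqrt(variance)))
```

The last line synthesises a shape two standard deviations along the first mode. Output of this kind is precisely what must be kept out of the training set if the model is to be validated against the real world.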
