Bonum Certa Men Certa

The LLM Ouroboros Phenomenon

posted by Roy Schestowitz on May 19, 2025,
updated May 19, 2025

An ouroboros in a 1478 drawing in an alchemical tract

Ancient Greek mythology gave us the concept of an ouroboros, wherein some animal - typically a snake, for a feasible "IRL" (in real life) metaphor - eats itself by swallowing its own tail. We would not be the first to point out the analogy to LLMs, because an ouroboros is a good parable. This morning we catalogued two BSD and Linux sites complaining about desperate LLM scrapers staging a DDoS attack in pursuit of original, as in human-written, code or words. This isn't a new problem for us; in the past few days we served about half a million pages over Gemini Protocol, likely due to LLM scrapers. It's obnoxious to say the least, but distinguishing benign requests from malicious ones (or worthless junk) is hard and a "moving target" (it's never enough, as parasites learn to adapt).

This morning in IRC we made an assertion about LLMs and fake (slop) images. We also made several observations. Fact #1: over time slop gets worse (training set is like some blurry JPEG). Fact #2: People's "smell" for slop improves over time, as they 'train' on slop and can detect it based on prior encounters. Put 1 and 2 together.

Are LLMs bound to not only get worse but also more easily detectable by an increasingly sceptical general public? TheLayoff.com has just responded to this.

An associate opines that fact #1 (that slop gets worse over time) is exacerbated by the flood of slop on the Net being snarfed up by newer bots and mistaken for legitimate training data. "Thus the feedback loop I mentioned a long time back and which Andy wrote about in depth." (He was referring to Dr. Farnell's good writings about this dilemma, which have appeared several times in The CyberShow's blog.)

To a certain extent my Ph.D. thesis (dissertation) covered this about two decades ago. The associate says that it's a "well-known problem from days of old".

There are several unique aspects to this, including validation bias. To me it seemed a bit related to, but not the same as, over-training because, as an associate explains, "overtraining is something else: too much data and the patterns become locked too tightly to the training set and less useful for new data".
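The associate's definition of overtraining can be illustrated with a toy example. This is only a sketch under made-up assumptions: the data set, the 20% label noise, and the two hypothetical models (`fit_memorizer`, which copies the label of the nearest training point, and `fit_threshold`, a one-parameter rule) are invented for illustration and come from neither the thesis nor any particular library.

```python
import random


def make_data(rng, n, noise=0.2):
    """Points x in [0, 1); label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label
        data.append((x, label))
    return data


def error_rate(predict, data):
    """Fraction of points whose predicted label differs from the recorded one."""
    return sum(predict(x) != y for x, y in data) / len(data)


def fit_memorizer(train):
    """1-nearest-neighbour: copies the label of the closest training point."""
    def predict(x):
        return min(train, key=lambda p: abs(p[0] - x))[1]
    return predict


def fit_threshold(train):
    """One-parameter rule: pick the cut-off with the lowest training error."""
    candidates = [i / 100 for i in range(101)]
    best = min(candidates, key=lambda t: error_rate(lambda x: x > t, train))
    return lambda x: x > best


rng = random.Random(1)
train, test = make_data(rng, 400), make_data(rng, 2000)
memorizer, simple = fit_memorizer(train), fit_threshold(train)

for name, model in (("memorizer", memorizer), ("threshold", simple)):
    print(f"{name}: train error {error_rate(model, train):.3f}, "
          f"test error {error_rate(model, test):.3f}")
```

The memorizer is "locked" to its training set: it reproduces every training label, noise included, and so does worse than the simpler rule on data it has not seen.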

When an LLM scans its own output online, that only serves to affirm its mistakes, or errors, often euphemised as mere "hallucinations", supposedly innocent, not libellous, and by no means "intentional" or "harmful". Dr. Farnell and Dr. Kate Brown responded to this last October in "Radical disbelief and its causes".

In the context of my thesis (dissertation), a concern was raised about what we back then called "synthetic data" finding its way "back" into the training set. So when you check brain MRI scans (which is what we did back then) you must ensure you only ever deal with real data, not mock or manipulated data that can confirm your own biases and "fit into" the model that generated it in the first place (in generative mode). To use the analogy of text-based LLMs, your BS is "truth" if your input is your own BS (output/s) and it would be deemed accurate, based on you (opposite of the notion of peer review in science). The associate correctly points out, based on a scan of my thesis (dissertation), that the strings "overtraining" and "over-training" are not in the dissertation, but we used different terms back then.
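The danger of synthetic data finding its way back into the training set can be sketched with the simplest generative model there is, a single Gaussian. The following is a toy simulation and not the thesis's method: the function name `simulate_self_training`, the sample size, and the number of generations are all arbitrary assumptions chosen for illustration.

```python
import random
import statistics


def simulate_self_training(generations=200, sample_size=10, seed=0):
    """Refit a Gaussian on its own samples and record the spread per generation."""
    rng = random.Random(seed)
    # Generation 0: "real" data from a standard normal distribution.
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    spreads = []
    for _ in range(generations):
        # "Train": estimate mean and spread from whatever data is on hand.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        spreads.append(sigma)
        # "Generate": the next generation sees only the model's own output.
        data = [rng.gauss(mu, sigma) for _ in range(sample_size)]
    return spreads


spreads = simulate_self_training()
print(f"spread at generation 0:   {spreads[0]:.4f}")
print(f"spread at generation 199: {spreads[-1]:.3e}")
```

Because no real data ever re-enters the loop, each generation's sampling error is baked into the next, and the estimated spread drifts towards zero. The model ends up "confirming" an ever narrower version of itself.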

A squat toilet (also known as an Eastern, Turkish, Iranian or Natural-Position toilet). This one is in Turkey

"An LLM Ouroboros of shit", as the associate dubs it, would be statistical models (such as PDMs or AAMs*) treating computer-generated images as something from "the real world".

The so-called "generative hey hi" (genAI) "bros" won't allow the media to talk about such issues, at least if they can downplay the issues and deny/misportray them (in the media). But it's a real and growing problem. Its magnitude likely grows quadratically, not linearly. Just like other bubbles (overabundance based around hype), don't expect linear implosions. When it's gone (poof!), it's gone.

____

* PDM and AAM need expansion in the explanatory sense, not just words (in the acronyms). PDMs (point distribution models) go back several decades; they were invented or pioneered by the people who tutored me. They use mathematical, statistical models to perform multidimensional analysis of data variations, based upon principal component analysis (PCA). AAMs (active appearance models) are an extension, but with textures, not only points. This is really old stuff; even AAMs are over 23 years old; now the mainstream media pretends those are some kind of "revolution".
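For readers who want something concrete, below is a minimal sketch of the PCA step behind a point distribution model, written with the standard library only. Everything in it is an assumption for illustration: the training "shapes" are synthetic unit squares with a made-up horizontal stretch (`BASE`, `STRETCH`), only the first mode of variation is extracted (by power iteration), and none of it is taken from the thesis or from any published implementation.

```python
import math
import random


def mean_shape(shapes):
    """Average each landmark coordinate across the training shapes."""
    return [sum(col) / len(shapes) for col in zip(*shapes)]


def covariance(shapes, mean):
    """Sample covariance matrix of the flattened landmark vectors."""
    centered = [[v - m for v, m in zip(s, mean)] for s in shapes]
    d = len(mean)
    return [[sum(c[i] * c[j] for c in centered) / (len(shapes) - 1)
             for j in range(d)] for i in range(d)]


def first_mode(cov, rng, iterations=100):
    """Leading eigenvector and eigenvalue of `cov`, found by power iteration."""
    d = len(cov)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]  # random starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of a unit vector gives the variance along that mode.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue


# Hypothetical training set: 4 landmarks of a unit square, flattened as
# x0, y0, x1, y1, ... and stretched horizontally by a random amount.
BASE = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
STRETCH = [-0.5, 0.0, 0.5, 0.0, 0.5, 0.0, -0.5, 0.0]  # unit-length direction

rng = random.Random(42)
shapes = []
for _ in range(50):
    b = rng.gauss(0.0, 0.3)  # how far this example is stretched
    shapes.append([p + b * s + rng.gauss(0.0, 0.01)
                   for p, s in zip(BASE, STRETCH)])

mean = mean_shape(shapes)
mode, variance = first_mode(covariance(shapes, mean), rng)
alignment = abs(sum(m * s for m, s in zip(mode, STRETCH)))

# Generative use: mean shape plus two standard deviations along the mode.
new_shape = [m + 2.0 * math.sqrt(variance) * p for m, p in zip(mean, mode)]
print(f"alignment of recovered mode with true stretch: {alignment:.4f}")
print("synthesised shape:", [round(c, 2) for c in new_shape])
```

The last step is the "generative mode" referred to in the article: the model can synthesise new shapes that fit it perfectly, which is exactly why such output must never be fed back in as if it were real data.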
