Bonum Certa Men Certa

The Great LLM Delusion - Part IV: Academic Papers as Microsoft Marketing for LLMs

posted by Roy Schestowitz on Jan 31, 2024

Photo capturing two very different architectural styles for student buildings at the UCLA Campus, California

"Strange and misleading article about LLMs," as explained by an anonymous contributor

THIS series coincides with Microsoft hype and vapourware (much-needed distractions). This part is a guest post of sorts, an unedited version of something a reader sent us. The supporting material is in there.

To be clear, you can talk to a chatbot. The chatbot won't talk back to you. It'll just spew out words, some of them merely plagiarised based on words similar to what you said (which the chatbot does not grasp; it's more like a Web search, except there's no attribution or link to the source). Chatbots might lower the bar for journalism, to the point where the Web as a whole will lose legitimacy, trust, value, etc. Then what? Going back to physical libraries? Saying you can compose a physical book using a chatbot is like saying you can make a very large meal by assembling trash and cooking parts of it (LLMs are "digital pagpag"). Given reports like "Scammy AI-Generated Book Rewrites Are Flooding Amazon", this is already a real problem. "These 'AI' stock increases based on fake increases in revenue," an associate has remarked, "appear funded by mass firings to appease the LARPers in the financial community. That can only go on so long before they run out of people to take care of the core income-generating activities, a line which I suspect they have already crossed."

Will Microsoft also start spewing out "papers" or "publications" made by its chatbots, in order to generate hype about chatbots? That probably would not work, as the quality would not meet basic criteria.

Without further ado, here is the contributor's message:


I stumbled upon a recent article you may find curious.

While reading comments on a post at Bruce Schneier's blog, I saw a user who posted the following link as a kind of "proof" that conversations with LLM-based chatbots can be "useful" and "interesting."

Of course, it sparked my interest at first, but as I started reading it, red flags started to pop up here and there.

I do not know much about Quanta Magazine's credibility. At first, I thought that it was some semi-crackpot pop science news site, but after a shallow search, I saw a good rank from a fact-checking site.

The article was published on January 22, 2024, and the research it discusses was released on October 26, 2023. Maybe a gap of a few months does not mean much, but it is a bit suspicious that the research paper is apparently not peer reviewed (just published on arXiv and cited in ~2–3 sources), and that the article about it came out just as the "AI" swindle began to unravel.

It seems like the article is desperately trying to spark new interest in readers regarding LLMs and chatbots, saying that there is some evidence that there is "much more than just autocomplete."

The following are some of the dubious parts.

1. The article talks about the "understanding" of something by LLMs but presents no clear definition of it.

The closest thing to a definition, taken from the research paper, is "combinations that were unlikely to exist in the training data", which is, in my opinion, misleading for ordinary readers, much like other misnomers in the field (e.g., "hallucinations").

I guess it may be more suitable to talk about "competence" instead, as in the "competence without comprehension" phrase from Dennett's writings.

2. The paper described in the article seems to support (or go in the direction of) the vague idea that if you shovel a lot of data and complexity into "AI" (LLM in this case), then "something" will emerge ("skills" and "ability to generalize" in this case, as stated in the paper and researcher's comments in the article). I find it concerning.

3. A "Research scientist at Google DeepMind" is among the authors of the paper, so the research is probably not clearly independent of corporate influence.

4. “[They] cannot be just mimicking what has been seen in the training data,” said Sébastien Bubeck, a mathematician and computer scientist at Microsoft Research who was not part of the work. “That’s the basic insight.”

Wait, what? Why is this part inserted in the article at all? Some guy from Microsoft is eager to tell us that LLMs are "something more." No bullshit. What a surprise!

5. The paper starts with this passage: "With LLMs shifting their role from statistical modeling of language to serving as general-purpose AI agents..."

I mean, what the fuck?! LLMs are not "shifting" anywhere; they are poorly shoehorned into use cases where a "general-purpose AI agent" is required (whatever it is, it does not exist in our reality anyway) by people who want to reap profits from selling half-assed "products" based almost entirely on lies! LLMs are definitely not suitable for general-purpose tasks other than text manipulation or some kinds of entertainment where facts, preciseness, and responsibilities do not matter at all.

One of the researchers acknowledges that the work is not about accuracy:

"Arora adds that the work doesn’t say anything about the accuracy of what LLMs write. “In fact, it’s arguing for originality,” he said. “These things have never existed in the world’s training corpus. Nobody has ever written this. It has to hallucinate.”"

I need to make it clear: I have no competence to review the actual paper; this task requires actual experts in the field.

As far as I understand the paper, the researchers devised some abstractions to describe observations they had already made, and they try to construct a method for working with their own definitions and hypotheses. Those definitions have little in common with laymen's definitions (e.g., for terms like "understanding" and "creativity") and perceptions of the matter.

I tried to read the paper with an open mind to avoid at least some obvious biases. I have no problems with the paper itself; maybe it is actually useful research that will serve to advance the field (and not the companies of con artists). I cannot say for sure.

What bothers me are the misnomers and the misleading, vague terms and descriptions in the paper (to a lesser degree) and in the article based on it (a great deal). In my opinion, the article commits the crime of severely misinforming the reader.
