Wikipedia Can Lower Its Hosting Bill by Going More Static, Not Just by Caching, But It Would Not Solve Its Biggest Problems (Bribes and AstroTurfing)
LLMs are not the biggest problem at Wikipedia
2008: Microsoft Agents from Waggener Edstrom Airbrush Wikipedia, Glorify Paymaster
For about 15 years we had a wiki on this site (it's still there, but not as a wiki; it's a wikidump, i.e. static pages). We adopted the same software developed and used by Wikipedia. Spam and vandalism meant we just had to limit who could edit; script kiddies did so much damage (defacing or adding bunk pages) that rolling back changes became a chore, even while on holiday.
By 2023 read-only bots had also become a nuisance; online spiders/scrapers would constantly hit unique ('on-demand') pages that made no sense; no person would repeatedly request those. Caching does not help much when many different "pages" are requested over and over (sometimes hundreds of times per second), because each distinct request still invokes the back end (MariaDB/MySQL and PHP) for no good reason. At times we got thousands of requests per second. That's just too much, even for a decent router.
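To illustrate why caching falls short here, below is a minimal sketch (not our actual setup) that simulates a small LRU page cache. The URL patterns, the cache size and the request counts are all made-up assumptions; the only point is that a cache is keyed on the URL, so readers returning to the same few pages mostly get hits, whereas a scraper requesting a different 'on-demand' URL each time misses every time and reaches the back end.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used page cache that counts hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.store:
            # Cache hit: the back end (PHP + database) is not touched.
            self.store.move_to_end(url)
            self.hits += 1
        else:
            # Cache miss: the page would have to be rendered by the back end.
            self.misses += 1
            self.store[url] = True  # stand-in for the rendered page
            if len(self.store) > self.capacity:
                self.store.popitem(last=False)  # evict the oldest entry


def hit_rate(urls, capacity=100):
    """Fraction of requests served from the cache."""
    cache = LRUCache(capacity)
    for url in urls:
        cache.get(url)
    return cache.hits / (cache.hits + cache.misses)


# Hypothetical traffic: people keep returning to the same 50 articles.
human_traffic = [f"/wiki/Page_{i % 50}" for i in range(10_000)]

# Hypothetical traffic: a scraper asks for a different revision URL each time.
bot_traffic = [f"/index.php?title=Page_{i % 50}&oldid={i}" for i in range(10_000)]

print(f"human hit rate: {hit_rate(human_traffic):.3f}")
print(f"bot hit rate:   {hit_rate(bot_traffic):.3f}")
```

In this toy model nearly every human request is served from the cache, while every single bot request is a miss. Static pages sidestep the problem, because there is no back end left to invoke.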
Wikipedia recently bemoaned LLM scrapers; it really ought to moan about LLMs distorting Wikipedia articles. Moreover, Wikipedia ought to complain about: 1) Microsoft and Bill Gates bribing Wikipedia [1, 2, 3, 4] to be passive while they distort Wikipedia articles (for PR purposes, revisionism/lies/selective omissions as "articles"); 2) Microsoft providing servers and money for LLM scrapers, such as those which harm Wikipedia (overwhelming the back end).
Wikipedia isn't a site of integrity; not anymore. I wrote a great deal about Wikipedia over the years. More than 16 years ago the cofounder of Wikipedia openly blasted Microsoft for bribing people to edit Wikipedia articles (interjecting Microsoft lies/spin or 'guarding' pages of interest against facts). Nowadays this cofounder (the greedy one, Wales, not Sanger) would simply look the other way while his bank balance grows.
If Wikipedia is serious about lowering its hosting bills (its financial disclosures show that this expense is very minor compared to other things) or making the site faster/more resilient, then it should consider becoming more like Britannica, which is a lot trickier for corporations to manipulate.
As a kid I used Britannica and other encyclopedias quite a lot. Nowadays I feel disillusioned and dissatisfied with the "page anyone can manipulate" approach; it's becoming a lot more like Social Control Media, not literature. It's not about what's true but about "brigading" and persistence/perseverance (or budget).
Wikipedia needs to get its act together or lose what's left of its former reputation. Britannica isn't a good yardstick, but in my experience it's nowadays a lot more accurate and reliable than Wikipedia, where many articles are "unfinished works" or ads disguised as legitimate pages (sometimes people or companies writing about themselves).
Demoting or altogether abandoning Wikipedia isn't easy; people have nostalgic memories (sentimental facets) of what it used to be. However, a lot has changed. Many ordinary people "contributed" to Wikipedia (edits, funds etc.), so rejecting Wikipedia feels like self-loathing or self-betrayal.
Wikipedia has new masters. They work against you. Wikipedia is just another "Advertising Channel" to them ("Reputation Management"). For only a little money ("slush funds") they can get a lot out of it. It's another "investment". █