Bonum Certa Men Certa

Archiving Web Sites to Ensure They Last Decades, Not Years, Outliving or Outlasting Various Disruptive Events

Video download link | md5sum b29da11a5ae25c7597c459e8e4c320b2



Summary: Today we upload 15 years' worth of blog posts to the Internet Archive (IA), or close to 32,000 stories along with Daily Links; we suggest that other sites do the same in order to tackle 'Internet rot' and preserve information (otherwise there's room for obscene revisionism)

THE INTERNET won't stay around forever. The Soviets, back in the old days, tried to develop something similar to it. The Internet will probably survive the next decade or two, but fifty years is a stretch; as for the World Wide Web, it has already devolved into a transport layer for JavaScript and DRM, having been rendered bloated and malicious in practice (albeit not in theory; one can still produce elegant Web sites).



Earlier this year we moved to Gemini and more than a year ago we adopted IPFS, which is used to circulate daily bulletins and IRC logs in a decentralised fashion. Our IRC channels all became self-hosted (in our network) earlier this year -- an ambition that we've had for years but didn't get around to until Freenode collapsed.

Archiving a Web site isn't the same as changing formats or protocols. It's also not about making more copies, especially if those copies are as vulnerable to censorship as one another. On this site we have some public domain (PD) works that are of relevance to us and can be accessed over gemini://. Most of the works, however, use a Creative Commons licence. We are not a curation site per se, but it helps to keep copies of historical material, such as antitrust material demonstrating Microsoft's crimes (as tactics barely change over time). By Internet standards we have enjoyed a long span of 15 years (articles and daily links) and we remain active on a daily basis. The same is true for Tux Machines, which turns 18 this coming summer, so a lot of the material we have here is no longer available anywhere else, except the Internet Archive (IA).

A few years ago we started making site archives in IA and we also recommended the site to people, dubbing it the most important site on the Web. It won't last forever, however; as an associate of ours explains, "the IA is very important but it will succumb as the WWW is phased out in favor of obfuscated, proprietary JavaScript."

IA can barely cope with (e.g. spider/index/save/navigate) much of the "modern" Web. When you add DRM to the mix (EME), it is no longer a "format-shifting" task, as that too becomes an impossibility. Sites need to evolve or perish, which may mean getting off the Web and one day planning for the demise of the Internet as a whole. Like IA, our associate explains, "archive.is is interesting, but it'll die one day. In the long run they will all pass away. In formal archives, one of the initial decisions the institution has to make about any given artifact is that of how long it shall be preserved for. Nothing lasts forever, but there are ways of stretching things out and the duration determines the methods of preservation."

For a site such as ours it makes sense to keep the material available for 50 years, which is maybe how much longer I can live (if I'm lucky).

"Media shifting will obviously be involved," the associate notes, "but at a loss for some items. The plan pre-dates AWA by a great many years."

Last weekend we turned 15. "Already in 15 short years," our associate remarks, "many whole sites are gone. And of the sites that remain, many have lost all their old articles in clumsy reorgs. Of that which is left, some of those have purged documents with "inconvenient" messages or themes... even Groklaw purged its comments. I suppose few to none of the Groklaw comments made it into the Library of Congress archives."

At the time of writing I'm still uploading 205 MB of archives (as shown in the video above). We hope it can inspire other sites to think ahead and do the same. It's not a big task and it's better done before it's "too late"...

Our associate concludes by saying that "many programmers and even engineers are conscientious in erasing anything "old" even important records. Now with electronic media, there is often only a single copy of anything any more and that introduces, obviously, a single point of failure. So in the old days, one could maintain a relevant personal or professional archive. Now those are all centralized and continue to exist only at the whim of participant consensus. Anyone with administrative privileges, can "tidy" up and easily erases the world's last copy of a standard or other evidence or similar material."

We are going to add more material to IA, where it will pile up alongside some material that isn't ours.
