Bonum Certa Men Certa

Archiving Web Sites to Ensure They Last Decades, Not Years, Outliving or Outlasting Various Disruptive Events

Video download link | md5sum b29da11a5ae25c7597c459e8e4c320b2



Summary: Today we upload 15 years' worth of blog posts to the Internet Archive (IA), or close to 32,000 stories along with Daily Links; we suggest that other sites do the same in order to tackle 'Internet rot' and preserve information (otherwise there's room for obscene revisionism)

THE INTERNET won't stay around forever. The Soviets, back in the old days, tried to develop something similar to it. The Internet will probably survive the next decade or two, but fifty years is a stretch; as for the World Wide Web, it has already devolved into a transport layer for JavaScript and DRM, having been rendered bloated and malicious in practice (albeit not in theory; one can still produce elegant Web sites).



Earlier this year we moved to Gemini and more than a year ago we adopted IPFS, which is used to circulate daily bulletins and IRC logs in a decentralised fashion. Our IRC channels all became self-hosted (in our network) earlier this year -- an ambition that we've had for years but didn't get around to until Freenode collapsed.

Archiving a Web site isn't the same as changing formats or protocols. It's also not about making more copies, especially if those copies are as vulnerable to censorship as one another. On this site we have some public domain (PD) works that are of relevance to us and can be accessed in gemini://. Most of the works, however, use a Creative Commons licence. We are not a curation site per se, but it helps to keep copies of historical material, such as antitrust material demonstrating Microsoft's crimes (as tactics barely change over time). By Internet standards we have enjoyed a long span of 15 years (articles and daily links) and we remain active on a daily basis. The same is true for Tux Machines, which turns 18 this coming summer, so a lot of the material we have here is no longer available anywhere else, except the Internet Archive (IA).

A few years ago we started making site archives in IA and we also recommended the site to people, dubbing it the most important site on the Web. It is not eternal, however; as an associate of ours explains, "the IA is very important but it will succumb as the WWW is phased out in favor of obfuscated, proprietary JavaScript."

IA can barely cope with (e.g. spider/index/save/navigate) much of the "modern" Web. When you add DRM (EME) to the mix, it's no longer a "format-shifting" task, as that too becomes an impossibility. Sites need to evolve or perish, which may mean getting off the Web and one day planning for the demise of the Internet as a whole. Like IA, our associate explains, "archive.is is interesting, but it'll die one day. In the long run they will all pass away. In formal archives, one of the initial decisions the institution has to make about any given artifact is that of how long it shall be preserved for. Nothing lasts forever, but there are ways of stretching things out and the duration determines the methods of preservation."

For a site such as ours it makes sense to keep the material available for 50 years, which is maybe how much longer I can live (if I'm lucky).

"Media shifting will obviously be involved," the associate notes, "but at a loss for some items. The plan pre-dates AWA by a great many years."

Last weekend we turned 15. "Already in 15 short years," our associate remarks, "many whole sites are gone. And of the sites that remain, many have lost all their old articles in clumsy reorgs. Of that which is left, some of those have purged documents with 'inconvenient' messages or themes... even Groklaw purged its comments. I suppose few to none of the Groklaw comments made it into the Library of Congress archives."

At the time of writing I'm still uploading 205 MB of archives (as shown in the video above). We hope it can inspire other sites to think ahead and do the same. It's not a big task and it's better done before it's "too late"...
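
For sites wanting to do the same, the packaging step before an upload might look roughly like the sketch below. It is not the process used for this site's own upload (this post does not describe that); it assumes a static export of the site sits in a local directory, and the names build_archive, verify_copy and site-archive are hypothetical. It uses SHA-256 rather than the md5sum published at the top of this post, purely as an illustrative choice.

```python
import hashlib
import tarfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_archive(site_dir: Path, out_dir: Path, name: str = "site-archive") -> tuple[Path, Path]:
    """Pack site_dir into a tarball and write a per-file checksum manifest beside it."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # Collect the file list first so the outputs never end up inside the archive.
    files = sorted(p for p in site_dir.rglob("*") if p.is_file())

    # Manifest lines follow the "<hash>  <relative path>" layout of sha256sum.
    manifest = out_dir / f"{name}.sha256"
    lines = [f"{sha256_of(p)}  {p.relative_to(site_dir).as_posix()}" for p in files]
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")

    tarball = out_dir / f"{name}.tar.gz"
    with tarfile.open(tarball, "w:gz") as tar:
        for p in files:
            tar.add(p, arcname=p.relative_to(site_dir).as_posix())
    return tarball, manifest


def verify_copy(copy_dir: Path, manifest: Path) -> list[str]:
    """Return the manifest paths that are missing or altered in copy_dir."""
    bad = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # tolerate blank lines, e.g. an empty manifest
        expected, rel = line.split("  ", 1)
        target = copy_dir / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad
```

Calling build_archive(Path("export"), Path("out")) would leave a tarball and a manifest in out/, both of which can be uploaded together; verify_copy can later be pointed at any extracted copy, wherever it is hosted, to check that nothing has gone missing or been changed.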

Our associate concludes by saying that "many programmers and even engineers are conscientious in erasing anything 'old', even important records. Now with electronic media, there is often only a single copy of anything any more and that introduces, obviously, a single point of failure. So in the old days, one could maintain a relevant personal or professional archive. Now those are all centralized and continue to exist only at the whim of participant consensus. Anyone with administrative privileges can 'tidy' up and easily erase the world's last copy of a standard or other evidence or similar material."

We are going to add more material to IA, and it can be found here as it piles up, along with some material that isn't ours.
