Bonum Certa Men Certa

Archiving Web Sites to Ensure They Last Decades, Not Years, Outliving or Outlasting Various Disruptive Events

Video download link | md5sum b29da11a5ae25c7597c459e8e4c320b2



Summary: Today we upload 15 years' worth of blog posts to the Internet Archive (IA), or close to 32,000 stories along with Daily Links; we suggest that other sites do the same in order to tackle 'Internet rot' and preserve information (otherwise there's room for obscene revisionism)

THE INTERNET won't stay around forever. The Soviets, back in the old days, tried to develop something similar to it. The Internet will probably survive the next decade or two, but fifty years is a stretch; as for the World Wide Web, it has already devolved into a transport layer for JavaScript and DRM, having been rendered bloated and malicious in practice (albeit not in theory; one can still produce elegant Web sites).



Earlier this year we moved to Gemini and more than a year ago we adopted IPFS, which is used to circulate daily bulletins and IRC logs in a decentralised fashion. Our IRC channels all became self-hosted (in our network) earlier this year -- an ambition that we've had for years but didn't get around to until Freenode collapsed.

Archiving a Web site isn't the same as format changes and protocol changes. It's also not about making more copies, especially if those copies are as vulnerable to censorship as one another. Here on this site we have some public domain (PD) works that are of relevance to us and can be accessed in gemini://. Most of the works, however, use a Creative Commons licence. We are not a curation site per se, but it helps to keep copies of historical material, such as antitrust material demonstrating Microsoft's crimes (as tactics barely change over time). Well, by Internet standards we have enjoyed a long span of 15 years (articles and daily links) and we remain active on a daily basis. The same is true for Tux Machines, which turns 18 this coming summer, so a lot of the material we have here is no longer available anywhere else, except the Internet Archive (IA).

A few years ago we started making site archives in IA and we also recommended the site to people, dubbing it the most important site on the Web. It's not an eternal site, however; as an associate of ours explains, "the IA is very important but it will succumb as the WWW is phased out in favor of obfuscated, proprietary JavaScript."

IA can barely cope with (e.g. spider/index/save/navigate) much of the "modern" Web. When you add DRM (EME) to the mix, it's no longer even a "format-shifting" task, as that too becomes an impossibility. Sites need to evolve or perish, which may mean getting off the Web and one day planning for the demise of the Internet as a whole. Like IA, our associate explains, "archive.is is interesting, but it'll die one day. In the long run they will all pass away. In formal archives, one of the initial decisions the institution has to make about any given artifact is that of how long it shall be preserved for. Nothing lasts forever, but there are ways of stretching things out and the duration determines the methods of preservation."

For a site such as ours it makes sense to keep the material available for 50 years, which is maybe how much longer I can live (if I'm lucky).

"Media shifting will obviously be involved," the associate notes, "but at a loss for some items. The plan pre-dates AWA by a great many years."

Last weekend we turned 15. "Already in 15 short years," our associate remarks, "many whole sites are gone. And of the sites that remain, many have lost all their old articles in clumsy reorgs. Of that which is left, some of those have purged documents with "inconvenient" messages or themes... even Groklaw purged its comments. I suppose few to none of the Groklaw comments made it into the Library of Congress archives."

At the time of writing I'm still uploading 205 MB of archives (as shown in the video above). We hope it can inspire other sites to think ahead and do the same. It's not a big task and it's better done before it's "too late"...
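The md5sum published next to the video link above lets readers check that their download is intact, and the same idea can be applied to archive files before they are uploaded anywhere. What follows is only a minimal sketch, not a description of our own tooling: the function names, the directory layout and the MD5SUMS manifest name are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_md5):
    """Compare a local file against a checksum published alongside the download."""
    return md5_of_file(path) == published_md5.strip().lower()


def write_manifest(directory, manifest_name="MD5SUMS"):
    """Write an md5sum-style manifest for the regular files in a directory.

    The directory layout and manifest name are hypothetical.
    """
    directory = Path(directory)
    lines = []
    for item in sorted(directory.iterdir()):
        # Skip subdirectories and the manifest itself.
        if item.is_file() and item.name != manifest_name:
            lines.append(f"{md5_of_file(item)}  {item.name}")
    manifest = directory / manifest_name
    manifest.write_text("\n".join(lines) + "\n")
    return manifest
```

A reader could call matches_published() with the path of a downloaded file and the checksum printed next to its link. MD5 catches accidental corruption but is weak against deliberate tampering, so hashlib.sha256 may be the better choice where that matters.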

Our associate concludes by saying that "many programmers and even engineers are conscientious in erasing anything "old", even important records. Now with electronic media, there is often only a single copy of anything any more and that introduces, obviously, a single point of failure. So in the old days, one could maintain a relevant personal or professional archive. Now those are all centralized and continue to exist only at the whim of participant consensus. Anyone with administrative privileges can "tidy" up and easily erase the world's last copy of a standard or other evidence or similar material."

We are going to add more material to IA; as it piles up it can be found here, along with some material that isn't ours.
