Bonum Certa Men Certa

Archiving Web Sites to Ensure They Last Decades, Not Years, Outliving or Outlasting Various Disruptive Events

Video download link | md5sum b29da11a5ae25c7597c459e8e4c320b2



Summary: Today we upload 15 years' worth of blog posts to the Internet Archive (IA), or close to 32,000 stories along with Daily Links; we suggest that other sites do the same in order to tackle 'Internet rot' and preserve information (otherwise there's room for obscene revisionism)

THE INTERNET won't stay around forever. The Soviets, back in the old days, tried to develop something similar to it. The Internet will probably survive the next decade or two, but fifty years is a stretch; as for the World Wide Web, it has already devolved into a transport layer for JavaScript and DRM, having been rendered bloated and malicious in practice (albeit not in theory; one can still produce elegant Web sites).



Earlier this year we moved to Gemini and more than a year ago we adopted IPFS, which is used to circulate daily bulletins and IRC logs in a decentralised fashion. Our IRC channels all became self-hosted (in our network) earlier this year -- an ambition that we've had for years but didn't get around to until Freenode collapsed.

Archiving a Web site isn't the same as format changes and protocol changes. It's also not about making more copies, especially if those copies are as vulnerable to censorship as one another. On this site we have some public domain (PD) works that are of relevance to us and can be accessed in gemini://. Most of the works, however, use a Creative Commons licence. We are not a curation site per se, but it helps to keep copies of historical material, such as antitrust material demonstrating Microsoft's crimes (as tactics barely change over time). By Internet standards we have enjoyed a long span of 15 years (articles and daily links) and we remain active on a daily basis. The same is true for Tux Machines, which turns 18 this coming summer, so a lot of the material we have here is no longer available anywhere else, except the Internet Archive (IA).

A few years ago we started making site archives in IA and we also recommended the site to people, dubbing it the most important site on the Web. It's no eternal site, however; as an associate of ours explains, "the IA is very important but it will succumb as the WWW is phased out in favor of obfuscated, proprietary JavaScript."

IA can barely cope with (e.g. spider/index/save/navigate) much of the "modern" Web. When you add DRM to the mix (EME), it is no longer even a "format-shifting" task, as that too becomes an impossibility. Sites need to evolve or perish, which may mean getting off the Web and one day planning for the demise of the Internet as a whole. Like IA, our associate explains, "archive.is is interesting, but it'll die one day. In the long run they will all pass away. In formal archives, one of the initial decisions the institution has to make about any given artifact is that of how long it shall be preserved for. Nothing lasts forever, but there are ways of stretching things out and the duration determines the methods of preservation."

For a site such as ours it makes sense to keep the material available for 50 years, which is maybe how much longer I can live (if I'm lucky).

"Media shifting will obviously be involved," the associate notes, "but at a loss for some items. The plan pre-dates AWA by a great many years."

Last weekend we turned 15. "Already in 15 short years," our associate remarks, "many whole sites are gone. And of the sites that remain, many have lost all their old articles in clumsy reorgs. Of that which is left, some of those have purged documents with 'inconvenient' messages or themes... even Groklaw purged its comments. I suppose few to none of the Groklaw comments made it into the Library of Congress archives."

At the time of writing I'm still uploading 205 MB of archives (as shown in the video above). We hope it can inspire other sites to think ahead and do the same. It's not a big task and it's better done before it's "too late"...
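For sites that want to do the same, one small step that helps is recording checksums of the archive files before uploading them, much like the md5sum published alongside the video, so that any copy retrieved years later can be checked against the original. What follows is only a minimal sketch under stated assumptions, not a description of how our own uploads were prepared: the function names (`file_md5`, `build_manifest`, `verify_manifest`) and the idea of a per-directory manifest are illustrative. MD5 is used here only to match the md5sum above; it catches accidental corruption but not deliberate tampering, and `hashlib.sha256` can be swapped in.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's relative path (POSIX style) to its MD5 digest.

    Hypothetical helper: run it on the archive directory before uploading.
    """
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): file_md5(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(directory, manifest):
    """Return the sorted paths that are missing or whose digest has changed."""
    root = Path(directory)
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or file_md5(target) != expected:
            bad.append(rel)
    return sorted(bad)
```

The manifest can be stored next to the archives (and in a second location), then passed to `verify_manifest` whenever a copy is downloaded again; an empty list means every file still matches.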

Our associate concludes by saying that "many programmers and even engineers are conscientious in erasing anything 'old', even important records. Now with electronic media, there is often only a single copy of anything any more and that introduces, obviously, a single point of failure. So in the old days, one could maintain a relevant personal or professional archive. Now those are all centralized and continue to exist only at the whim of participant consensus. Anyone with administrative privileges can 'tidy' up and easily erase the world's last copy of a standard or other evidence or similar material."

We are going to add more material to IA; it can be found here as it piles up, along with some material that isn't ours.
