Bonum Certa Men Certa

Archiving Web Sites to Ensure They Last Decades, Not Years, Outliving or Outlasting Various Disruptive Events

Video download link | md5sum b29da11a5ae25c7597c459e8e4c320b2



Summary: Today we upload 15 years' worth of blog posts to the Internet Archive (IA), or close to 32,000 stories along with Daily Links; we suggest that other sites do the same in order to tackle 'Internet rot' and preserve information (otherwise there's room for obscene revisionism)

THE INTERNET won't stay around forever. The Soviets, back in the old days, tried to develop something similar to it. The Internet will probably survive the next decade or two, but fifty years is a stretch; as for the World Wide Web, it has already devolved into a transport layer for JavaScript and DRM, having been rendered bloated and malicious in practice (albeit not in theory; one can still produce elegant Web sites).



Earlier this year we moved to Gemini and more than a year ago we adopted IPFS, which is used to circulate daily bulletins and IRC logs in a decentralised fashion. Our IRC channels all became self-hosted (in our network) earlier this year -- an ambition that we've had for years but didn't get around to until Freenode collapsed.

Archiving a Web site isn't the same as changing formats or protocols. It's also not about making more copies, especially if those copies are as vulnerable to censorship as one another. On this site we have some public domain (PD) works that are of relevance to us and can be accessed in gemini://. Most of the works, however, use a Creative Commons licence. We are not a curation site per se, but it helps to keep copies of historical material, such as antitrust material demonstrating Microsoft's crimes (as tactics barely change over time). Well, by Internet standards we have enjoyed a long span of 15 years (articles and daily links) and we remain active on a daily basis. The same is true for Tux Machines, which turns 18 this coming summer, so a lot of the material we have here is no longer available anywhere else, except the Internet Archive (IA).

A few years ago we started making site archives in IA and we also recommended the site to people, dubbing it the most important site on the Web. It is not an eternal site, however; as an associate of ours explains, "the IA is very important but it will succumb as the WWW is phased out in favor of obfuscated, proprietary JavaScript."

IA can barely cope with much of the "modern" Web (e.g. spidering, indexing, saving, and navigating it). When you add DRM to the mix (EME), it's no longer even a "format-shifting" task, as that too becomes an impossibility. Sites need to evolve or perish, which may mean getting off the Web and one day planning for the demise of the Internet as a whole. Like IA, our associate explains, "archive.is is interesting, but it'll die one day. In the long run they will all pass away. In formal archives, one of the initial decisions the institution has to make about any given artifact is that of how long it shall be preserved for. Nothing lasts forever, but there are ways of stretching things out and the duration determines the methods of preservation."

For a site such as ours it makes sense to keep the material available for 50 years, which is maybe how much longer I can live (if I'm lucky).

"Media shifting will obviously be involved," the associate notes, "but at a loss for some items. The plan pre-dates AWA by a great many years."

Last weekend we turned 15. "Already in 15 short years," our associate remarks, "many whole sites are gone. And of the sites that remain, many have lost all their old articles in clumsy reorgs. Of that which is left, some of those have purged documents with 'inconvenient' messages or themes... even Groklaw purged its comments. I suppose few to none of the Groklaw comments made it into the Library of Congress archives."

At the time of writing I'm still uploading 205 MB of archives (as shown in the video above). We hope it can inspire other sites to think ahead and do the same. It's not a big task and it's better done before it's "too late"...
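For sites that want to do the same, one small precaution is to record a checksum for every archive file before uploading, so that any copy retrieved years later can be checked against the original. The following is a minimal sketch of that idea and not a description of how our own upload was done; the function names (`md5_of_file`, `build_manifest`, `verify_manifest`) are hypothetical, and MD5 is used only because it matches the `md5sum` convention already used for the video above.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to directory) to its MD5 digest."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(directory, manifest):
    """Return relative paths that are missing or whose digest has changed."""
    root = Path(directory)
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or md5_of_file(target) != expected:
            problems.append(rel)
    return problems
```

The manifest can be kept alongside the archives (and in a separate place), and `verify_manifest` can be run against any later copy; an empty list means every recorded file is present and unchanged.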

Our associate concludes by saying that "many programmers and even engineers are conscientious in erasing anything 'old' even important records. Now with electronic media, there is often only a single copy of anything any more and that introduces, obviously, a single point of failure. So in the old days, one could maintain a relevant personal or professional archive. Now those are all centralized and continue to exist only at the whim of participant consensus. Anyone with administrative privileges, can 'tidy' up and easily erases the world's last copy of a standard or other evidence or similar material."

We are going to add more material to IA, and it can be found here as it piles up, along with some material that isn't ours.
