Bonum Certa Men Certa

Archiving Web Sites to Ensure They Last Decades, Not Years, Outliving or Outlasting Various Disruptive Events

Video download link | md5sum b29da11a5ae25c7597c459e8e4c320b2



Summary: Today we upload 15 years' worth of blog posts to the Internet Archive (IA), or close to 32,000 stories along with Daily Links; we suggest that other sites do the same in order to tackle 'Internet rot' and preserve information (otherwise there's room for obscene revisionism)

THE INTERNET won't stay around forever. The Soviets, back in the old days, tried to develop something similar to it. The Internet will probably survive the next decade or two, but fifty years is a stretch; as for the World Wide Web, it has already devolved into a transport layer for JavaScript and DRM, having been rendered bloated and malicious in practice (albeit not in theory; one can still produce elegant Web sites).



Earlier this year we moved to Gemini and more than a year ago we adopted IPFS, which is used to circulate daily bulletins and IRC logs in a decentralised fashion. Our IRC channels all became self-hosted (in our network) earlier this year -- an ambition that we've had for years but didn't get around to until Freenode collapsed.

Archiving a Web site isn't the same as format changes and protocol changes. It's also not about making more copies, especially if those copies are as vulnerable to censorship as one another. Here on this site we have some public domain (PD) works that are of relevance to us and can be accessed in gemini://. Most of the works, however, use a Creative Commons licence. We are not a curation site per se, but it helps to keep copies of historical material, such as antitrust material demonstrating Microsoft's crimes (as tactics barely change over time). Well, by Internet standards we have enjoyed a long span of 15 years (articles and daily links) and we remain active on a daily basis. The same is true for Tux Machines, which turns 18 this coming summer, so a lot of the material we have here is no longer available anywhere else, except the Internet Archive (IA).

A few years ago we started making site archives in IA and we also recommended the site to people, dubbing it the most important site on the Web. It's no eternal site however; as an associate of ours explains, "the IA is very important but it will succumb as the WWW is phased out in favor of obfuscated, proprietary JavaScript."

IA can barely cope with (e.g. spider, index, save or navigate) much of the "modern" Web. When you add DRM to the mix (EME), it's no longer even a "format-shifting" task, as that too becomes an impossibility. Sites need to evolve or perish, which may mean getting off the Web and one day planning for the demise of the Internet as a whole. Like IA, our associate explains, "archive.is is interesting, but it'll die one day. In the long run they will all pass away. In formal archives, one of the initial decisions the institution has to make about any given artifact is that of how long it shall be preserved for. Nothing lasts forever, but there are ways of stretching things out and the duration determines the methods of preservation."

For a site such as ours it makes sense to keep the material available for 50 years, which is maybe how much longer I can live (if I'm lucky).

"Media shifting will obviously be involved," the associate notes, "but at a loss for some items. The plan pre-dates AWA by a great many years."

Last weekend we turned 15. "Already in 15 short years," our associate remarks, "many whole sites are gone. And of the sites that remain, many have lost all their old articles in clumsy reorgs. Of that which is left, some of those have purged documents with 'inconvenient' messages or themes... even Groklaw purged its comments. I suppose few to none of the Groklaw comments made it into the Library of Congress archives."

At the time of writing I'm still uploading 205 MB of archives (as shown in the video above). We hope it can inspire other sites to think ahead and do the same. It's not a big task and it's better done before it's "too late"...
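
For those who mirror the archives or the video, a checksum is a simple way to confirm that a copy arrived intact. Below is a minimal sketch in Python, assuming the file has already been downloaded. The function names and the file name in the usage comment are hypothetical; the hash is the md5sum listed next to the video link above.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a local copy against a published checksum (case-insensitive)."""
    return md5_of_file(path) == published_hex.strip().lower()


# Usage (hypothetical local file name):
# matches_published("video.webm", "b29da11a5ae25c7597c459e8e4c320b2")
```

MD5 is enough to catch a truncated or corrupted transfer, but it is not collision-resistant, so a stronger hash (e.g. `hashlib.sha256`) would be the better choice where deliberate tampering is a concern.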

Our associate concludes by saying that "many programmers and even engineers are conscientious in erasing anything 'old', even important records. Now with electronic media, there is often only a single copy of anything any more and that introduces, obviously, a single point of failure. So in the old days, one could maintain a relevant personal or professional archive. Now those are all centralized and continue to exist only at the whim of participant consensus. Anyone with administrative privileges can 'tidy' up and easily erase the world's last copy of a standard or other evidence or similar material."

We are going to add more material to IA; it can be found here as it piles up, along with some material that isn't ours.
