Bonum Certa Men Certa

Turning Away Unwanted and/or Predatory Bots

posted by Roy Schestowitz on Sep 15, 2024

Sleep Tight

If no human will ever read it, what's the point serving?

ROGUE bots (programs without operators) are ruining the Web. One of us recently contacted Semrush Holdings, Inc. (founded by Oleg Shchegolev and Dmitri Melnikov) to complain about the misbehaving bots, which offer no benefit to anyone and basically just waste bandwidth and burn the planet. Semrush responded, but it's difficult to actually anticipate better behaviour. It's like another bubble; they probably have no concrete plan as a company (Semrush Inc. became Semrush Holdings, Inc. - one can guess why).

Companies like Semrush ruin the Web for real people. They also unnecessarily increase people's hosting bills. To them, that's just an "externality" - like LLM pests, they simply couldn't care less! Companies like these motivated us to go static; they misuse programs with a database back end (e.g. wikis) because they don't behave like people who are sane. They scrape away mercilessly and selfishly. They disregard and bypass caching or HTTP headers.

That's not to say that Gemini Protocol is free of annoying bots; we wrote about some of these before and many still traverse Geminispace for little purpose other than maintaining lists like these:

There are 4056 capsules. We successfully connected recently to 2872 of them.

Those 10,000 pages sent from Techrights were retrieved for no purpose other than Lupa gathering statistics or surveying what's out there. Since midnight today we've served 13344 requests over Gemini, yesterday it was 12917, and the day before that 14257. A high proportion of these are requests from bots.

An associate has adjusted the domain's configurations to send "429 Too Many Requests" to unwanted Web requests that might cause denial of service (at sufficiently high volume). "I think this will be an appropriate tool against bots hitting the server too hard," he said. "Changes were required in NFTables and in the Apache2 configuration," he said, and there's probably no information of use for an attacker here, as merely knowing NFTables is used barely gives an advantage.

But the very fact one needs to deploy and use NFTables means extra complexity. The misbehaving, out-of-control bots have certainly caused many sites to just throw in the towel and shut down.

Making the site and capsule serve pages fast to real visitors is of utmost importance, not gaming the numbers upwards. If you want fakes, go use Facebook; or do what Clickfraud Spamnil (Swapnil Bhartiya) does at YouTube.

In Geminispace, the capsules known to be using the Certificate Authority Let's Encrypt are a dying breed; the total has fallen again. Lupa sees only 31 such capsules today:

2576 (89.7 %) capsules are self-signed, 31 (1.1 %) use the Certificate Authority Let's Encrypt, 265 (9.2 %) are signed by another CA (may be not a trusted one).

So this coming week we might see the Certificate Authority Let's Encrypt at under 1%. It used to be in around 200 capsules or around 12% of Gemini capsules.

Other Recent Techrights' Posts

Links 20/05/2025: Biden's Cancer, GDPR Changes, and UK Defamation Cases (or SLAPPs) Fail Again
Links for the day
Microsofters Targeting the Wife of the Critic of Microsoft
false claims and loaded statement
Microsoft a Top Sponsor at Red Hat Summit (IBM Selling Proprietary Spyware and Back Doors in a "Red" Trench Coat)
They both work for Microsoft
New 'Interview' With - or Talk Coverage of - Richard Stallman in the European Union
automated English translation
Gemini Links 20/05/2025: LLM Scraper Bots in Gopher and "Starmer and the Somewheres"
Links for the day
 
IBM Has Allegedly Just Sacked Mr. McKinsey (McK), Clay Cowan, Its Fourth CMO in a Few Years
To insiders he represented the company that's killing IBM or advising IBM on how to self-destruct
Slopwatch: Slopfarms 'Think' Redis is "Linux" (RedisRaider)
Today we'll keep it short and to the point again
Gemini Links 21/05/2025: Trips, 4D Golf, and Writing Software
Links for the day
Over at Tux Machines...
GNU/Linux news for the past day
IRC Proceedings: Tuesday, May 20, 2025
IRC logs for Tuesday, May 20, 2025
Links 20/05/2025: "Bankrupt 23andMe Just Sold Off All Your DNA Data" and "Free Speech Warriors" MIA
Links for the day
Openwashing of Windows, Back Doors, Persistent Surveillance, Keyloggers, Screen Loggers, DRM and So On
WSL is not "Linux", it's Windows
IBM Mass Redundancies Likely This Coming Thursday
We're not in a position to judge if that's true or false
Over at Tux Machines...
GNU/Linux news for the past day
IRC Proceedings: Monday, May 19, 2025
IRC logs for Monday, May 19, 2025
Skype Fell Off a Cliff (Microsoft Killed It), All Microsoft Has Left Now is Slop and Spaghetti Code
"This isn’t about AI. This is a puppet show to drive stock prices up and down."
The Official SUSE Blog Uses LLM Slop to Compose Fake Articles Promoting Microsoft and Azure
even a little slop spoils the broth
Slopfarms (Machine-Generated Fake News Sites Authored by Bots With Slop Images) Spread GNU FUD
This isn't about Linux (GNU doesn't run just on Linux)
United States Federal Government's Digital Analytics Program (DAP): GNU/Linux Users Represent Close to 6% of Visitors This Year
How far has GNU/Linux gotten? Very far!
The "LLM Ouroboros of Shit" is Complemented by Even Worse Phenomena Caused by Microsoft's Contribution of SPAM and Pollution
Microsoft became a world leader in promotion of LLM slop
The LLM Ouroboros Phenomenon
Fact #1: over time slop gets worse (training set is like some blurry JPEG). Fact #2: People's "smell" for slop improves over time, as they 'train' on slop and can detect it based on prior encounters. Put 1 and 2 together.
Links 19/05/2025: Charges of Blackmailing Over Son Heung-min, Chad Opposition Leader Detained
Links for the day
Gemini Links 19/05/2025: Ableism, Silicon Monkeys, and More
Links for the day
How We Defeated DDoS Attacks
One of the best things one can do is migrate to an SSG
Microsofters Issuing Threats to Microsoft Critics Who Blog About Microsoft
So far we see that their "legal strategy" revolves around trying to discredit people like Theodore Ts'o
Links 19/05/2025: Political Catchup and CISA Advisories
Links for the day
TheLayoff.com Has Begun Deleting Trolls/AstroTurfers Infesting the IBM Section to Discourage On-Topic Discussion About Culls and Maladministration (Bad Strategy)
Moderators have realised there's a problem
Over at Tux Machines...
GNU/Linux news for the past day
IRC Proceedings: Sunday, May 18, 2025
IRC logs for Sunday, May 18, 2025