Bonum Certa Men Certa

Brave Search Jumps on the Large Language Model Bandwagon



Reprinted with permission from Ryan




I noticed a new Brave Search feature today called the Summarizer.



It answered my question much like Chat with Bing did, although there were three major differences:



  1. The Brave Summarizer does not use GPT as its Large Language Model. Just as well since GPT is known for going completely off the rails and inserting toxic language and fake news, and nobody has been able to get this under control, not even OpenAI or Microsoft.


  2. Brave says that they have “taken steps” to keep the information relevant, factual, and cited. The answers I’ve been getting appear to be correctly cited. Bing, by contrast, just throws you a bunch of random sites that don’t appear to corroborate what it just told you in its answer. They’re not cited by paragraph either, so you have no way of knowing where the links tie into the answer, assuming that they even do and that Bing isn’t hallucinating.


  3. Brave Search has a good privacy policy. It doesn’t require the user to log in, as Bing does, and personally identify themselves, in order to use it. It also doesn’t make them use a malicious piece of spyware (and password stealer) called “Edge” (or fake the User Agent string) as Bing does. In fact, Brave Search works in any browser, and they have a Tor Hidden Service that works in Brave Tor Tabs, and can be added to Tor Browser.


The Brave Summarizer isn’t conversational. It’s just part of the search. This should help keep the results related to the search without allowing the conversation to get weird, as in the “Bing claims it wants you to kill people and give it the nuclear launch codes” kind of weird.



Most importantly, the LLM that Brave uses isn’t as likely to flub a demo the way Bard and Bing “Sydney” did, because it simply isn’t allowed to answer complex questions like those.



When something is clearly going to hallucinate incorrect data, why would you even expose that feature? GPT, which is what Bing is based on, couldn’t tell me how much ground coffee to use per European coffee “cup” when the ratio I use is 1.5 Tablespoons per American “cup” (neither of which is a standard 8 ounce cup, of course).



The correct answer is 1 Tbsp per Euro cup, but it kept telling me two Tablespoons, or maybe 1 Tablespoon plus two Teaspoons. It could never get such an easy calculation right. But hey, at least Microsoft paid billions of dollars for it. Then more for ads masquerading as news articles about how this thing will build rocket ships.
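
The arithmetic is simple enough to sketch in a few lines of Python. The cup sizes below are assumptions that are not stated above: a 6 US fluid ounce American coffee-maker “cup” and a 125 ml European one. Markings vary by manufacturer, so treat this as an illustration rather than a definitive conversion.

```python
# Assumed cup sizes (not from the article; coffee-maker markings vary by brand).
US_FL_OZ_ML = 29.5735                # millilitres per US fluid ounce
AMERICAN_CUP_ML = 6 * US_FL_OZ_ML    # assumed 6 fl oz American coffee "cup"
EURO_CUP_ML = 125.0                  # assumed 125 ml European coffee "cup"


def tbsp_per_euro_cup(tbsp_per_american_cup: float) -> float:
    """Scale a coffee dose from American cups to European cups by volume."""
    return tbsp_per_american_cup * EURO_CUP_ML / AMERICAN_CUP_ML


dose = tbsp_per_euro_cup(1.5)
print(f"{dose:.2f} Tbsp per Euro cup")
```

Under those assumed sizes the dose comes out a little over 1 Tablespoon per Euro cup, which rounds to the 1 Tbsp figure and is nowhere near two.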



LLMs are well known at this point for spitting out false information, sometimes even dangerous information. Facebook’s Galactica was goaded into producing an authoritative-sounding essay on the “health benefits of eating ground glass”. You know, for silica’s benefits in growing connective tissue.



Brave says that “Brave AI” uses multiple LLMs, retrained with data from their search index, and that the ones they are using are open source (“The base LLM models are based on either BART or DeBERTa (which are open source and hosted on Hugging Face), with heavy retraining based on our own data from search results.”). There is also a blog post explaining in some detail how this all works.



In summary, it appears that Brave has not only beaten Microsoft and Google to LLM integration, but has positioned it where it belongs, which is in a limited context as a complementary feature, rather than claiming that a conversational chat bot is the future of search.



In my brief experimentation with Chat with Bing, I was completely unable to get anything useful out of it.



A traditional search system returned results that I could look at and select much faster, and I was alarmed to find that when I tried to verify what Bing Chat was telling me, the information was frequently either nowhere to be found or directly contradicted by its own sources, if I could find them.



Moreover, it’s simply embarrassing for Microsoft that they spent billions on this valueless acquisition. The paid spam went completely off the rails as soon as the budget ran out, and now there are actually very few people talking about Bing, and largely in a negative context when you do find something.



I think it’s good that Brave is building an actual index rather than turning around and paying Microsoft for results. I was briefly excited about DuckDuckGo. Then I found out it was simply a scam where they paid Microsoft for the Bing API and slapped a picture of a duck and their own ads on it, and they got caught spying on people numerous times (including “Improving DuckDuckGo”, and allowing Microsoft trackers through their “Privacy” browser and then blaming a “contract with Microsoft”). My patience with DDG quickly ran out.



DuckDuckGo took advantage, mainly, of the fact that people are creeped out by Google and want alternatives.



The problems with Google and Bing are largely that they both spy on you and that their indexes are like Coke and Pepsi.



Google Search has been going downhill, and it’s gotten to the point where the results for technical queries are almost completely useless.



The problem I’ve noted with Brave Search is that they’re trying to be too much like Google, putting irrelevant crap on top of your search results, such as those “questions”, and they have another feature (which can, thankfully, be turned off) that floats Reddit and Quora discussions to the top.



They also index spam farms, like MakeUseOf, which has turned into another ZDNet, and sometimes these pollute the first page of results. There’s rarely anything interesting to read on these sites. They used to be good, but now it’s just Microsoft paying them to write spam about Windows.



Overall, I think Searx is still the way to go on Brave, or any other browser.



I have Brave, SeaMonkey, LibreWolf, and GNOME Web set up to use Searx instances, and in many cases, you can get at them using a Tor Hidden Service.



Tor Hidden Services are good for search because at this point you don’t need to worry about your VPN being the only thing protecting your IP address from the server logs.



While simply accessing a site over Tor is usually enough, skipping the Web entirely and remaining inside the Tor Network with Hidden Services is always safer, as there is then no Exit Node that could potentially spy on you. Without that piece of the puzzle, the traffic becomes more difficult to de-anonymize with things like timing attacks, or through a catastrophic coincidence of the attackers controlling the Entry Node too.



I think that Large Language Models are an “interesting” addition to search, but they’re a side dish, not the main course.



The amusing thing about Brave Search is that it’s so small, and the default in only one relatively obscure browser, and yet with only minimal effort it managed to make an LLM add-on that works better than something Microsoft frittered away billions of dollars acquiring, plus who knows how much on an empty ad campaign that amounted to little more than one of those “butter cows” at the state fair planted in every newspaper.



Seriously, even after you pay to read the New York Times, Microsoft plants this trash there too.



Brave at least seems to see the problem they’re actually trying to solve with this thing.



Opera, which is not the “good” Opera from the Presto Engine days, but rather a Chinese spyware company, now uses GPT to “summarize” the page you’re reading.



While it may or may not handle this okay, the disturbing part is the privacy implications.



It means sending the entire text of every page you load to a company that has guaranteed you that they will misuse your data. Of course, since Opera already comes preloaded with TikTok, Facebook, Instagram, and Twitter, you already know that user privacy is not a goal with their product.



This whole GPT thing is some laughable mission creep for companies that have run out of steam and gone off the rails. It helps them appear relevant and get some headlines.



Fortunately, the model is so lousy that people realize what it is now.


