Integration of SearX

Red card for the search giant

Egg, September 14, 2020: Our Linux distribution AVMultimedia, which runs entirely in main memory (RAM), has already been reported on extensively. Besides many advantages (e.g. programs start quickly), this design means that a restart of the system deletes all stored cookies. Unfortunately, this does not suit everyone, and that is what this blog post is about. If the text is too long for you, the following video (German with English subtitles), which runs about two minutes, covers the essentials. Have fun!

When Google refuses to work

For a long time now, the search giant has been asking for consent to cookies when its homepage is opened for the first time. Cookies are small pieces of data stored in your own browser, usually used to give the web browser, and thus the user, a unique identity. The cookie itself serves only as a tracking number. The information that matters for evaluating search queries and surfing behaviour is stored on the provider's side, in this case at Google.

Up to now, this consent was only demanded after a certain amount of use. When working with AVMultimedia, this allowance was usually enough to use the search engine sporadically and very selectively for a few days, or to open linked YouTube videos. The default search engine in AVMultimedia is DuckDuckGo. No cookies are stored there. And yes, other providers can of course be used as well. These include the French search engine qwant.com and the Swiss solution swisscows.ch. Both deliver good hits.

Unfortunately, for a few days now the search giant has been requiring consent just to access the site at all. Anyone who does not want to be shown personalized advertising, for example, has to click laboriously through many pages, as the introductory video illustrates very well. In short, for a few occasional YouTube videos, which can hardly be avoided while surfing, the click effort is becoming unbearable.

SearX, the decentralized alternative

In the search for a way to avoid these “click orgies”, the decentralized open source meta search engine SearX came into play. Decentralized means that there are many instances of SearX. A list can be found at searx.me, where you can choose from over 100 servers to work with the search engine.

SearX is open source, i.e. the source code is disclosed, so we tried to install SearX locally. Under Linux you can use the command ‘pip3 install searx’. Afterwards ‘searx-run’ must be started. Wow, it is that easy to get a local instance of SearX on your own Linux computer.

At this point the term meta search engine needs to be clarified. If SearX built its own search indexes, a lot of storage and time would have to be spent maintaining them. Meta search engines like SearX do not build their own indexes. Instead, they pass the query on to third parties (e.g. Bing, DuckDuckGo and many more, even Google is among them) and present the combined results.
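The idea of a meta search engine can be sketched in a few lines of Python. The following is only an illustration and not the way SearX actually ranks its results: the function name `merge_results`, the scoring by reciprocal rank and the sample addresses on example.org are all assumptions made for this sketch.

```python
from collections import defaultdict


def merge_results(results_by_engine):
    """Merge ranked URL lists from several engines into one ranked list.

    Assumption for this sketch: a hit at position p contributes 1/p to the
    score of its URL, so pages found by several engines move up.
    """
    scores = defaultdict(float)
    sources = defaultdict(list)
    for engine, urls in results_by_engine.items():
        for position, url in enumerate(urls, start=1):
            scores[url] += 1.0 / position
            sources[url].append(engine)
    # Highest score first; ties are broken alphabetically by URL.
    ranked = sorted(scores, key=lambda url: (-scores[url], url))
    return [(url, sources[url]) for url in ranked]


# Hypothetical result lists, not real search results.
sample = {
    "duckduckgo": ["https://example.org/a", "https://example.org/b"],
    "bing": ["https://example.org/b", "https://example.org/c"],
}
merged = merge_results(sample)
for url, engines in merged:
    print(url, engines)
```

In this sample the page found by both engines ends up first, although neither engine needs a search index of its own on the local machine.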

SearX locally on AVMultimedia and the ArchivistaBox

At first glance, the benefits of SearX seem marginal. What is the point of a service that merely searches other search engines and presents their results? However, once you work with SearX, you will quickly take it to your heart. This is mainly because SearX is very simple and presents the results very clearly, two qualities that seem rather foreign to many search engines today.

SearX does the job so well that the decision to give AVMultimedia a local SearX instance was made on the spot. No sooner said than done: from today’s release onwards, SearX can be called up on AVMultimedia, as on the ArchivistaBox, with ‘localhost:8’. SearX can also be used by entering a search query directly in the address bar of the Vivaldi browser.

With the ArchivistaBox there is also the option of operating SearX in the intranet or deactivating it. For this there is a new entry ‘Configure SearX’ on the desktop under ‘ArchivistaSetup’. If SearX is operated as a web service on the ArchivistaBox, queries can be made directly via the IP address or the DNS name (if available), followed by /search or /searx. Example: http://192.168.0.177/search

Comparing search engines with SearX

For AVMultimedia and the ArchivistaBox the default engines are DuckDuckGo and Wikipedia. In the settings, under search engines, you can determine very efficiently which services should be used. Several providers can be combined. If you want to work with Bing and DuckDuckGo at the same time, for example, that is no problem.
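Engines can also be chosen per query. The following Python sketch builds such a query address for the ArchivistaBox example http://192.168.0.177/search. It assumes that this path accepts the same `q` and `engines` query parameters as the search endpoint of upstream SearX, which has not been verified for the ArchivistaBox here. The function name `build_search_url` is made up for this example.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, engines=None):
    """Return a query URL for a SearX instance.

    Assumption: the instance accepts 'q' for the search terms and
    'engines' as a comma-separated list of engine names.
    """
    params = {"q": query}
    if engines:
        params["engines"] = ",".join(engines)
    return f"{base_url}?{urlencode(params)}"


url = build_search_url(
    "http://192.168.0.177/search",
    "Island einmal um die Insel",
    engines=["duckduckgo", "bing"],
)
print(url)
# http://192.168.0.177/search?q=Island+einmal+um+die+Insel&engines=duckduckgo%2Cbing
```

The sketch only builds the address and does not send a request, so it can be tried without a running SearX instance.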

It is just as easy to find out how well the different search engines work. And there were surprises. The market leader does not always come off best. Ultimately, the giant probably surfaces primarily those pages that have optimized their keywords best.
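Such a comparison can also be written down as a small Python sketch: for a page you expect to find, look up its position in the result list of each engine. The engine names, the addresses and the function name `rank_of` are invented for illustration and do not represent real measurements.

```python
def rank_of(target, results_by_engine):
    """Return, per engine, the 1-based position of target, or None if absent."""
    ranks = {}
    for engine, urls in results_by_engine.items():
        ranks[engine] = urls.index(target) + 1 if target in urls else None
    return ranks


# Hypothetical result lists for one and the same query.
results = {
    "engine_a": ["https://example.org/film", "https://example.org/travel"],
    "engine_b": ["https://example.org/travel", "https://example.org/ads"],
}
ranks = rank_of("https://example.org/film", results)
print(ranks)
```

An engine that returns None for the expected page did not list it at all among the results that were compared.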

Why does Google fail with ‘Island einmal um die Insel’ (Iceland, once around the island)?

A concrete example illustrates the problem. In summer, the managing director’s family spent three weeks travelling around Iceland. The result is a film of considerable length. Of course, one may argue about the content and quality of the work, but if you want to form your own opinion, you can find the page at https://azurgo.ch/aktuell/island-einmal-um-die-insel (film of 72 minutes, with English subtitles).

It got interesting when Google listed a third-party site that recommended the Iceland film, but not the page with the film itself. Other providers have long since listed that page on their first results page for the same query.

It is also enlightening to refine the query at the search giant with the term video. Many videos appear that often have little in common with ‘Island einmal um die Insel’ (Iceland, once around the island). Most of them promote (mostly expensive) travel. Often the offers are about Iceland, but not infrequently they cover destinations all over the world. It is possible to find interesting content, but there is a lot of advertising and a high click effort.

It is remarkable that small providers like DuckDuckGo, swisscows.ch or qwant.com deliver such good results. Why is this less often the case with the search giant? Could it be that, with a 90% market share, it is simply too lucrative to keep searchers busy a little longer than necessary? What is the benefit for Google if good hits are followed by fast clicks?

Breaking the power of conformity

Far more drastic is the power of the giant over those who offer content. The undisclosed mechanisms of a single provider decide, with a weight of 90%, whether a site is found or not. The search giant does offer ways to prepare content specifically for it, but this is not without problems.

If the only goal is to make something search engine friendly for one provider, the content falls by the wayside. By now this has gone so far that website tools remind authors to write a few more sentences, because otherwise the content would not be listed.

The more specifically content is prepared for one provider, the less diversity can arise. It is therefore time to show the search giant the red card. For this homepage, this means that the contents are no longer prepared to be “Google-compliant”. For AVMultimedia and the ArchivistaBox, it means that a new age of searching the web has begun with the new local search engine SearX.
