2023/XI: Small and nice

Evolution instead of revolution at ArchivistaBox

Egg, November 14, 2023: The world is changing rapidly. This refers not to the global political situation but to the IT landscape. Software that is not completely overhauled within two months is currently almost the exception rather than the norm. ArchivistaBox, by contrast, is about evolution, not revolution. And even if this does not represent a global political view, ArchivistaBox model maintenance is based on a 5-year “planned economy”. This does not mean that there will be no updates during this time, but the substructure will remain the same throughout. The current 2023/XI update is therefore about model maintenance. The most important new features are presented here.

Working with the clipboard

Who is not familiar with the shortcut keys Ctrl+a (select all), Ctrl+c (copy) and Ctrl+v (paste)? For some time now, ArchivistaBox has offered the option of using Ctrl+c on the ‘View’ tab of the main view to generate a unique link to the current file, which can then be used as a reference. Here is an example:

60933: Birds do it - Mr. Bird geht in die Luft

This makes it very easy to pass on links to refer to documents in the ArchivistaBox. Unfortunately, this previously only worked if there were no special characters in the description. In the new version, descriptions such as the following now also work:

60839: Der Millionenbauer 12 "Schöne Bescherung"

Copying meta keys of documents

If a document is “temporarily copied” in the ‘Edit’ tab of the main view with Shift+F7, the meta information can then be pasted as required with ‘F8’. Here, too, certain special characters previously did not work or were inserted incorrectly into the target document.

New: QR code recognition in page view

The previous QR code recognition requires the desired values to be entered in the fields provided for this purpose. A QR code can now be read directly in the ‘View’ tab and in the page view with Ctrl+c. The value is displayed on the screen and copied to the clipboard at the same time. This makes QR codes convenient to use, for example when the data of a QR code needs to be passed on.

This allows payment information to be inserted directly in e-banking, for example, without having to upload the receipt first or mark and copy the payment data via the page text.

Important: The barcode module is required for QR recognition to work.

ArchivistaDOM now with RAID10 or RAID1/cluster

With the previous ArchivistaMediaVM solutions, RAID clusters with 2 hard disks each could be ordered for the DOM variant. Now 4 SSDs are supplied as standard. This allows either one RAID10 or two RAID1 arrays (cluster solution) to be implemented, which increases reliability accordingly.

Import subtitle text from video files

Numerous videos are delivered with subtitle files. These texts are now available for the full text search. All language variants are included.

Optimization of videos

Since the end of 2019, it has been possible to manage videos with the ArchivistaBox. This allows extensive video databases to be managed very efficiently and conveniently. However, videos are currently being published in ever higher resolution and quality. This sometimes leads to videos being archived in far too high a resolution or taking up too much space. For example, if you download a feature-length movie from a streaming service (e.g. teleboy.ch), you will receive approx. 7 to 10 GBytes. And that is only at HD quality (1280×720 pixels). Full HD (1920×1080) or 4K (3840×2160) can result in files of well over 20 GB.

Apart from the fact that the current limit for files that can be managed with the ArchivistaBox is 64 GByte, files of 1 or 2 GByte and more are not entirely unproblematic for archiving, even when the ArchivistaBox models K2 and Everest are used, which can be expanded to up to 200 TByte.

If, for example, the data from teleboy.ch is used directly, a movie comprises approx. 7 GB. With a hard disk capacity of 14 TB, a total of 2000 videos can be managed (14*1000/7). If the files are optimized, the number of films that can be managed increases by a factor of 10 or more. Previously, the necessary parameters had to be defined precisely and entered in WebConfig. Now there are utility programs that do the job automatically. To use them, open a terminal on the Archivista desktop. The optimization can be carried out there as follows:
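The capacity estimate above can be reproduced with a few lines of Python. This is only a sketch: the function name is made up for illustration, decimal units (1 TB = 1000 GB) are assumed as in the calculation in the text, and the factor of 10 is the rough figure quoted above, not a measured value.

```python
def videos_per_disk(disk_tb: int, gb_per_video: int) -> int:
    """Return how many videos of the given size fit on the disk (1 TB = 1000 GB)."""
    return disk_tb * 1000 // gb_per_video


# Figures from the text: 14 TB of disk space, approx. 7 GB per downloaded film.
baseline = videos_per_disk(14, 7)

# Assumption: optimization shrinks files by about a factor of 10,
# so roughly ten times as many films fit on the same disk.
optimized = baseline * 10

print(f"unoptimized: {baseline} films, optimized: about {optimized} films")
```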

vidopt1 pfadin
vidopt2 pfadin
vidopt3 pfadin

Here, ‘vidopt1’, ‘vidopt2’ and ‘vidopt3’ are the utilities and ‘pfadin’ stands for the path in which the videos are located. In the first step (‘vidopt1’), a presumably optimal compression is calculated. All videos are compressed to HD quality with a maximum of 30 frames/second. Video files that are delivered at 50 frames/second, for example, are then available at 25 frames/second, as this results in the lowest possible losses during compression, and the rendering (recalculation) of the data is significantly faster than if, for example, the conversion were made from 50 to 30 frames.
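The frame rate rule described above (at most 30 frames/second, with 50 becoming 25 rather than 30) can be illustrated with a small sketch. How ‘vidopt1’ actually chooses the rate is not documented here; the function below is a hypothetical reading that divides the source rate by the smallest whole number that brings it to 30 or below.

```python
def target_fps(source_fps: float, max_fps: float = 30.0) -> float:
    """Pick a target frame rate by dividing the source rate by a whole number.

    Hypothetical rule, not taken from the vidopt1 sources: dropping every
    n-th frame avoids interpolating new frames, as a 50 -> 30 conversion would.
    """
    if source_fps <= 0 or max_fps <= 0:
        raise ValueError("frame rates must be positive")
    divisor = 1
    # Increase the divisor until the resulting rate is within the limit.
    while source_fps / divisor > max_fps:
        divisor += 1
    return source_fps / divisor


for fps in (25, 30, 50, 60):
    print(fps, "->", target_fps(fps))
```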

For about two thirds of the videos, the result is a well-optimized and significantly smaller file. Unfortunately, for the remaining third the videos have been compressed either too much or too little. This is why the ‘vidopt2’ utility is available. It tests the videos created to see whether they have been “shrunk” too little or too much. If so, they are readjusted accordingly.

With ‘vidopt3’ the original videos are moved to the folder ‘/home/archivista/data/vidbackup’. The compressed videos are available under the same name in the corresponding folder and can be archived. The Archivista log file (WebConfig) provides information on how well this worked.

The algorithm used was tested on tens of thousands of videos. For less than one per thousand, the result missed the optimum by a few percent. A large proportion of the processed videos had already been optimized with the old version (WebConfig). The data could nevertheless be “shrunk” again by approx. 2 TB (from 18 to 16) without the already optimized videos suffering any noticeable loss of quality, because the program automatically calculates the optimum compression instead of relying on manual estimation (which requires a lot of experience).

In addition to the utilities ‘vidopt1’, ‘vidopt2’ and ‘vidopt3’, other little helpers are now available. The first video track is extracted with ‘videoonly in.mp4 out.mp4’. With ‘audioonly in.mp4 out.mp4’ the first audio track is extracted. ‘videocombine vidtrack.mp4 audiotrack.mp4 output.mp4’ can be used to combine separate video and audio files. And ‘subtitle video.mp4 subtitle.vtt output.mp4’ adds a subtitle file to a movie. All these utilities work with the ‘ffmpeg’ utility, whose parameters are not always easy to understand. The helpers above should make the work easier.
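To give an idea of what such helpers have to pass to ‘ffmpeg’, here is a sketch that builds comparable command lines. The exact parameters used by the ArchivistaBox helpers are not given in this text; the function names are hypothetical and the options shown (‘-map’, ‘-c copy’, ‘-c:s mov_text’) are simply standard ffmpeg options that achieve the described effect without re-encoding video or audio.

```python
import shlex


def videoonly_cmd(src: str, dst: str) -> list[str]:
    # Copy only the first video stream, without re-encoding.
    return ["ffmpeg", "-i", src, "-map", "0:v:0", "-c", "copy", dst]


def audioonly_cmd(src: str, dst: str) -> list[str]:
    # Copy only the first audio stream, without re-encoding.
    return ["ffmpeg", "-i", src, "-map", "0:a:0", "-c", "copy", dst]


def videocombine_cmd(video: str, audio: str, dst: str) -> list[str]:
    # Take video from the first input and audio from the second input.
    return ["ffmpeg", "-i", video, "-i", audio,
            "-map", "0:v:0", "-map", "1:a:0", "-c", "copy", dst]


def subtitle_cmd(video: str, subs: str, dst: str) -> list[str]:
    # Keep all streams of the movie and add the subtitle file;
    # MP4 stores text subtitles in the mov_text format.
    return ["ffmpeg", "-i", video, "-i", subs,
            "-map", "0", "-map", "1:0", "-c", "copy", "-c:s", "mov_text", dst]


# Print the command lines instead of running them.
print(shlex.join(videocombine_cmd("vidtrack.mp4", "audiotrack.mp4", "output.mp4")))
print(shlex.join(subtitle_cmd("video.mp4", "subtitle.vtt", "output.mp4")))
```

The commands are only printed here; to execute one, the list could be handed to ‘subprocess.run’ on a system where ffmpeg is installed.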

The question remains: why are these little helpers necessary or useful? If you download videos, you will occasionally find separate video and audio tracks. Unless the files are merged, these videos cannot be played with the usual tools (not even with the standard HTML player of web browsers). The same applies to subtitles, which are usually stored in a separate file. Here too, a standard video player can only show them directly if the subtitle file is embedded in the video.

FreeTube instead of YouTube

The industry leader recently tightened the rules for playing videos on YouTube. The background to this move is that the monopolist (the market leader has a de facto monopoly on video playback on the web) wants either to play more advertising or to fleece those who want to watch YouTube videos without advertising interruptions.

A small aside here: the family subscription for Switzerland currently costs 23.90. This seems a little “brutal”, as an estimated 99.x% of videos are uploaded by private individuals. The hosting costs cannot possibly be that high.

Even though, with the ad blocker supplied with the Archivista desktop, it has not yet happened that videos can only be played with advertising, alternatives have been evaluated. With FreeTube, an alternative is available which (apart from the fact that it can be started as a desktop application) has many advantages. Firstly, FreeTube requires neither registration nor acceptance of terms and conditions, nor is any personal data passed on directly to the search giant. The relevant information is all stored locally.

Another advantage of FreeTube is that channels can be accessed without registration and the corresponding videos can be downloaded without any problems. Information on why this is the case can be found at https://freetubeapp.io/about.php.

Little helper for Teleboy.ch

The current subscription to Teleboy.ch costs 11.90 (five devices at the same time, i.e. for the whole family). This means that the subscription costs less than half the price of a YouTube Premium subscription. In return, there is unlimited access to around 300 TV channels with rewind of the last 7 days and download option.

So that there are no misunderstandings: the mention here was in no way sponsored, and the offer is limited to Switzerland. There are other offers in other countries, some of which are even cheaper, but these often lack the typical Swiss channels. And in general, the country-specific offers can only be used from elsewhere by circumventing the restriction with a VPN (e.g. ProtonVPN, also not sponsored here). Of course, the connection information (m3u files) for the relevant channels could be found somewhere, but the time required bears no relation to the price (11.90 per month).

Of course, this begs the question: what is the point of linear television at the moment? Without replay and download, of course, little to nothing. With both, however, you can easily watch many current feature films that are otherwise only available in streaming.

On teleboy.ch there is the section ‘Tipps aus der Redaktion’ (editors’ tips) for this. It does not contain only feature films, but the curation (selection) is quite good. There is also ‘Beliebteste Filme’ (most popular films), which lists the “chunks” that users “watch”. What both sections have in common is that the lists are often excessively long because the same film is broadcast two or more times. There are also films that fail to impress even on the tenth viewing, yet seem to be shown every two or three weeks.

For this reason, there is the little helper ‘teleboy filme.html’. For the script to work, the Teleboy sections must first be scrolled through to the end and then saved on the ArchivistaBox desktop. The quickest way is Ctrl+s, entering the name ‘filme.html’. Then start the script with ‘teleboy filme.html’. The script works with or without an Archivista database. If an alternative database is desired, the script can be started with ‘teleboy filme.html dbname’.

The script evaluates all films in the listing and uses the previous decisions to check whether a closer look is warranted or not:

Dickste Freunde -- 2011 -- (310/342)
Komödie • USA

Dickste Freunde - The Dilemma -- 2011-01-13 -- 1578275
US -- Drama; Komödie; Hollywood-Film; Witzig; Romantisch; Eigenwillig

Neue Liebe, neues Glück -- 2005 -- (311/342)
Romantik • USA
Add it? (j/n)

The following rules apply. If a film with the same name and year has already been declined, it is no longer presented for an active decision. The same is true if the film is already in your own collection. If nothing was found, the script asks whether the film should be added. With ‘j’ or ‘y’, the current hit is opened in the browser and you can decide manually whether a recording is warranted or not.
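The decision rule described above can be sketched in a few lines of Python. This is not the ‘teleboy’ script itself: the function name and the use of in-memory sets are assumptions made for illustration, whereas the real script works with an Archivista database.

```python
def needs_prompt(title: str, year: int,
                 declined: set[tuple[str, int]],
                 collection: set[tuple[str, int]]) -> bool:
    """Return True if the user should be asked about this film.

    Assumption: a film is identified by its name and year, compared
    case-insensitively; both sets hold such (name, year) pairs.
    """
    key = (title.casefold(), year)
    if key in declined:
        return False  # already answered with "no" earlier
    if key in collection:
        return False  # already part of the own collection
    return True


# Hypothetical earlier decisions and collection contents.
declined = {("neue liebe, neues glück", 2005)}
collection = {("dickste freunde", 2011)}

for title, year in [("Dickste Freunde", 2011),
                    ("Neue Liebe, neues Glück", 2005),
                    ("Some other film", 2020)]:
    print(title, year, "ask" if needs_prompt(title, year, declined, collection) else "skip")
```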

Remove advertising with LosslessCut

The previous version of LosslessCut could only cut MP4 files exactly at keyframes (main images). As keyframes occur only approximately every two to three seconds, a short “snippet” too much or too little can end up in the result when material (e.g. advertising) is cut out.
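The effect of cutting only at keyframes can be illustrated with a short sketch. It is not LosslessCut code; the function name is hypothetical and a regular keyframe interval of 2.5 seconds is assumed purely for the example.

```python
from bisect import bisect_left


def nearest_keyframe(cut: float, keyframes: list[float]) -> float:
    """Return the keyframe time closest to the desired cut position.

    `keyframes` must be a non-empty list sorted in ascending order.
    """
    if not keyframes:
        raise ValueError("at least one keyframe is required")
    i = bisect_left(keyframes, cut)
    # Only the keyframe before and the one at or after the cut can be closest.
    candidates = keyframes[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda t: abs(t - cut))


# Assumed: one keyframe every 2.5 seconds over a little more than four minutes.
keyframes = [i * 2.5 for i in range(100)]

wanted = 61.0  # the advertising block really starts here
actual = nearest_keyframe(wanted, keyframes)
print(f"wanted {wanted}s, cut lands at {actual}s, off by {abs(actual - wanted)}s")
```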

The current version of LosslessCut offers an experimental mode that makes the cut at the exact position. The affected parts are taken directly from the keyframes and re-rendered with ‘ffmpeg’ during the cut (approx. 3 seconds per cut point). This allows films to be recreated in such a way that the end product is cut at exactly the chosen frame.

The current programs that cut out advertising are based on the format used by satellite receivers. The MP4 files offered for download by Internet television providers lack the meta information needed to remove the advertising based on the encoding. Therefore, the only way to remove advertising from Internet TV is manually. Thanks to LosslessCut, this can still be done relatively quickly, because small preview images are displayed which make it very easy to recognize where an advertising block begins or ends.

Updated versions of JDownloader2 + MediathekView

Both programs are used to download video files of almost any type. However, both require updates from time to time, because otherwise downloads will stop working on many sites. That is why updated versions of both programs are included in Release 2023/XI.

What’s new with AVMultimedia?

As always, the programs that are available on the ArchivistaBox desktop can also be found in AVMultimedia. Exceptions are ‘vidopt1’, ‘vidopt2’, ‘vidopt3’ and ‘teleboy’, which are based on the ArchivistaBox sources and therefore cannot be run on the AVMultimedia desktop.

Addendum: The current AVMultimedia version can currently only be obtained from https://sourceforge.net/projects/archivista. osdn.net seems to be having problems at the moment; version 2023/XI could not be uploaded today, for example. If something changes at osdn.net, a small note will be posted here. The same applies to the optional programs for ArchivistaBox (LosslessCut, FreeTube, MediathekView and JDownloader): the required opt.os (to be stored under /home/data) can currently only be obtained via https://sourceforge.net/projects/archivista.