Video Archiving

New version 2022/XI — Tips for preserving movies

Egg, 23 November 2022: The new version 2022/XI first of all brings support for subtitles in the integrated video player. Furthermore, films with multiple audio tracks can be temporarily reduced to the desired audio track in order to play a film with the desired language directly in ArchivistaDMS. In addition to these two innovations, however, this blog is primarily concerned with how video files can or should be archived.

Subtitles in the integrated video player of ArchivistaDMS

Regardless of whether we are talking about films that you have created yourself or videos from the net, these often contain subtitles. Naturally, these files have been or will be archived correctly in ArchivistaDMS. This is because multimedia files are not changed when they are added to ArchivistaDMS.

Until now, playing video files with subtitles meant transferring them to the local computer via the ‘File’ link and opening them in a player that supports subtitles. MPV and VLC are available for this on the ArchivistaBox desktop and in AVMultimedia.

As of version 2022/XI, subtitles (VTT/SRT formats) can be managed directly in the internal video player. If subtitles exist, the corresponding option [cc] is displayed at the bottom right of the video player in the control elements. This option can be used to activate or deactivate the appropriate subtitles.

Multiple audio tracks in videos

If movies contain multiple audio tracks (usually language versions), the ‘Audio’ option is displayed below the video player. This allows you to select the desired language. With ‘Submit’ a temporary video file with the desired language track is created on the ArchivistaBox and activated directly in the video player for playback.

This process takes a few seconds, depending on the size of the video file, because a temporary copy of the film file must be created on the ArchivistaBox. In return, ArchivistaDMS needs no additional video player, and playback remains fully compliant with the HTML5 standard.
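
A temporary copy of this kind can be approximated on the command line. The following is a minimal sketch and not necessarily what ArchivistaDMS does internally; the function name `select_audio`, the `FFMPEG` variable and the file names are assumptions made for illustration. It copies the first video stream and one chosen audio stream without re-encoding.

```shell
#!/bin/sh
# Sketch only: select_audio and the FFMPEG variable are hypothetical names.
# Usage: select_audio INPUT AUDIO_NUMBER OUTPUT
# AUDIO_NUMBER counts audio streams only, starting at 0.
select_audio() {
    # Set FFMPEG=echo for a dry run that prints the arguments instead.
    "${FFMPEG:-ffmpeg}" -i "$1" -map 0:v:0 -map "0:a:$2" -c copy "$3"
}
```

A call such as `select_audio film.mp4 1 /tmp/film_track1.mp4` would keep the second audio track. Because of `-c copy` nothing is re-encoded, which is consistent with the process taking only a few seconds.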

A quibbler might note that the HTML5 standard is somewhat limited if the HTML5 video player alone cannot handle multiple audio tracks within a video. On the other hand, video files with multiple audio tracks become correspondingly larger, which is why streaming hosts often deliver the video and audio tracks separately.

“Optimized” video files for archiving

In everyday work, separate files for video and audio have not become established, but it is quite common to save several language versions in one file. The same applies to subtitles, which are usually supplied inside the video file as well. In current files the subtitles are usually stored as text and can be shown or hidden with the integrated video player of the ArchivistaBox (see above).

With older films, however, the subtitles are available as image data. The ArchivistaBox HTML5 video player is still unable to display these. Consequently, in order to call up the subtitles for these videos, these films must be opened with a desktop video player (file link in the main view of the ArchivistaDMS table).
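
Whether a film carries text or image subtitles can be checked before archiving. The following is a minimal sketch, assuming the codec names that ffprobe reports for common subtitle formats; the function name `subtitle_kind` and the file name `film.mp4` are hypothetical, and the lists are not exhaustive.

```shell
# List subtitle streams with their codec (one "index,codec_name" per line):
#   ffprobe -v error -select_streams s \
#       -show_entries stream=index,codec_name -of csv=p=0 film.mp4

# Hypothetical helper: classify a subtitle codec name as text or image.
subtitle_kind() {
    case "$1" in
        subrip|mov_text|webvtt|ass|text) echo text ;;
        dvd_subtitle|hdmv_pgs_subtitle|dvb_subtitle|xsub) echo image ;;
        *) echo unknown ;;
    esac
}
```

Films taken from DVDs typically report `dvd_subtitle`, which falls into the image group and therefore needs a desktop player.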

In MPV, the tracks for language version and subtitles can be found at the bottom right.

The above example shows the Hebrew subtitles of the movie ‘East of Eden’. This version, taken from the DVD intended for the European market, contains four audio tracks and 22 subtitle tracks. The console program ‘ffprobe’ retrieves this information in structured form:

ffprobe -v error -show_streams East_of_Eden.mp4 | grep "type\|index\|language"
index=0
codec_type=video
TAG:language=eng
index=1
codec_type=subtitle
TAG:language=eng
index=2
codec_type=subtitle
TAG:language=ger
index=3
codec_type=subtitle
TAG:language=ger
index=4
codec_type=subtitle
TAG:language=spa
index=5
codec_type=subtitle
TAG:language=por
index=6
codec_type=subtitle
TAG:language=fre
index=7
codec_type=subtitle
TAG:language=ita
index=8
codec_type=subtitle
TAG:language=dan
index=9
codec_type=subtitle
TAG:language=fin
index=10
codec_type=subtitle
TAG:language=nor
index=11
codec_type=subtitle
TAG:language=swe
index=12
codec_type=subtitle
TAG:language=heb
index=13
codec_type=subtitle
TAG:language=pol
index=14
codec_type=subtitle
TAG:language=cze
index=15
codec_type=subtitle
TAG:language=hrv
index=16
codec_type=subtitle
TAG:language=slv
index=17
codec_type=subtitle
TAG:language=gre
index=18
codec_type=subtitle
TAG:language=hun
index=19
codec_type=subtitle
TAG:language=tur
index=20
codec_type=subtitle
TAG:language=ice
index=21
codec_type=subtitle
TAG:language=eng
index=22
codec_type=subtitle
TAG:language=ger
index=23
codec_type=audio
TAG:language=eng
index=24
codec_type=audio
TAG:language=ger
index=25
codec_type=audio
TAG:language=spa
index=26
codec_type=audio
TAG:language=eng
index=27
codec_type=data
TAG:language=eng
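
With three lines per stream, the listing is long. A minimal sketch that condenses it to one line per stream follows; it assumes the `index=`, `codec_type=` and `TAG:language=` lines appear in the order shown and that every stream carries a language tag. The name `summarize_streams` is hypothetical.

```shell
# Hypothetical helper: read ffprobe key=value lines on stdin and print
# "index type language" for each stream that has a language tag.
summarize_streams() {
    awk -F= '
        $1 == "index"        { idx = $2 }
        $1 == "codec_type"   { type = $2 }
        $1 == "TAG:language" { printf "%s %s %s\n", idx, type, $2 }
    '
}
```

It could be used as `ffprobe -v error -show_streams East_of_Eden.mp4 | summarize_streams`. The grep step is not needed here because all other lines are ignored.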

If, for example, only the English and German audio tracks and the German subtitles are required, the video stream (0), the two audio streams (23 and 24) and the subtitle stream (2) can be copied into a new file as follows:

ffmpeg -i East_of_Eden.mp4 -map 0:0 -map 0:23 -map 0:24 -map 0:2 -c copy Eden.mp4
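
Instead of absolute stream numbers, ffmpeg stream specifiers can also select streams by their language tag. The following is a minimal sketch, assuming the tags are set as in the listing above; `keep_languages` and the `FFMPEG` variable are hypothetical names. A specifier matches every stream carrying that tag, so for this file it would keep both English audio tracks and all three German subtitle tracks, which is more than the index-based command keeps.

```shell
# Sketch only: keep_languages and the FFMPEG variable are hypothetical names.
# Usage: keep_languages INPUT OUTPUT
keep_languages() {
    # Set FFMPEG=echo for a dry run that prints the arguments instead.
    "${FFMPEG:-ffmpeg}" -i "$1" \
        -map 0:v:0 \
        -map 0:a:m:language:eng \
        -map 0:a:m:language:ger \
        -map 0:s:m:language:ger \
        -c copy "$2"
}
```

The advantage of this form is that it does not depend on the order of the streams in the source file.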

Often, however, there are two versions of a video file: the first contains the German audio track, the second the English one. As an example, consider the files Eden1.mp4 and Eden2.mp4:

923756 -rw-r--r-- 1 archivista archivista 945921143 Nov 23 18:28 Eden1.mp4
760936 -rw-r--r-- 1 archivista archivista 779196461 Nov 23 18:28 Eden2.mp4

Each file contains one video track and one audio track. Consequently, the video data is needed only once, from either Eden1.mp4 or Eden2.mp4, while the audio tracks are needed from both files:

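# Extract the video track (stream 0) without re-encoding
ffmpeg -i Eden1.mp4 -map 0:0 -c copy Eden1v.mp4
# Extract the audio track (stream 1) from each file
ffmpeg -i Eden1.mp4 -map 0:1 -c copy Eden1a.mp4
ffmpeg -i Eden2.mp4 -map 0:1 -c copy Eden2a.mp4
# Merge; without -map, ffmpeg would keep only one audio track
ffmpeg -i Eden1v.mp4 -i Eden1a.mp4 -i Eden2a.mp4 -map 0 -map 1 -map 2 -c copy EdenED.mp4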
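
The intermediate files are not strictly necessary, since ffmpeg accepts several inputs in one call. The following is a minimal sketch, assuming the first file carries the German and the second the English audio track; `merge_languages` and the `FFMPEG` variable are hypothetical names. The `-metadata:s:a:N` options label the two audio tracks with their language. The result is only usable if both files have the same cut and running time.

```shell
# Sketch only: merge_languages and the FFMPEG variable are hypothetical names.
# Usage: merge_languages GERMAN_FILE ENGLISH_FILE OUTPUT
merge_languages() {
    # Set FFMPEG=echo for a dry run that prints the arguments instead.
    "${FFMPEG:-ffmpeg}" -i "$1" -i "$2" \
        -map 0:v:0 -map 0:a:0 -map 1:a:0 \
        -c copy \
        -metadata:s:a:0 language=ger \
        -metadata:s:a:1 language=eng \
        "$3"
}
```

For the example files the call would be `merge_languages Eden1.mp4 Eden2.mp4 EdenED.mp4`.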

This file now contains the video only once, together with “only” the English and German audio tracks, which is pleasantly noticeable in its size:

923756 -rw-r--r-- 1 archivista archivista 945921143 Nov 23 18:32 EdenED.mp4

Instead of archiving two files with the same video track (together they need about 1.7 GByte), the combined file needs less than one GByte.
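
The figure of about 1.7 GByte follows from the two directory listings: 945921143 + 779196461 = 1725117604 bytes. A small sketch for adding up file sizes before and after merging is shown here; `total_bytes` is a hypothetical helper name.

```shell
# Hypothetical helper: print the combined size in bytes of the given files.
total_bytes() {
    total=0
    for f in "$@"; do
        size=$(wc -c < "$f")       # size of one file in bytes
        total=$((total + size))
    done
    echo "$total"
}
```

Calling `total_bytes Eden1.mp4 Eden2.mp4` and `total_bytes EdenED.mp4` gives the two numbers to compare.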

The same procedure can be applied to subtitle files. This concludes the short excursion into the world of console programs. Those who prefer to work with programs with a graphical user interface will find good support in the program ‘HandBrake’ or other tools to prepare video files for archiving.