2023/I as a PDF station

Forty lines of script turn the ArchivistaBox into a network scanner

Egg, 11 January 2023: A few days ago, a Fujitsu fi-8150 network-capable scanner was ordered with great anticipation. On the positive side, the device works without any problems with the ArchivistaBox. On the less positive side, although the device has a network connection, scanning to a shared folder is only possible with Windows. Other operating systems such as Mac or Linux are left out. Reason enough to implement such a job with the ArchivistaBox.

The feeder is the decisive factor for scanning

Some time ago, a Brother scanner was presented here. That device can be conveniently configured via a web interface and effortlessly sends its scans to a network share. Unfortunately, the paper feed of the Brother devices is not convincing in demanding everyday office use. In terms of paper feed, the Fujitsu devices simply play in a different league. That is why a Fujitsu fi-8150 was ordered with great anticipation.

The device has a USB and a network connection. It can be connected to any ArchivistaBox via USB. Unfortunately, the network connection turned out to be, to put it mildly, a disappointment. The IP settings can be configured via the web interface (the password is “password”). However, it is not possible to define scan jobs in this interface, for example scanning to a network path.

When network scanning only works with Windows

If you want to scan over the network with the Fujitsu fi-8150, you need Windows, and it takes quite a while to install the software before you can scan. This is extremely disappointing for a network-capable device, all the more so since the device is advertised for Mac and Linux. That claim is true, but only when working via USB.

A script on the ArchivistaBox gives the scanner a “helping hand”

The ArchivistaBox is first and foremost a DMS (document management system). However, every ArchivistaBox can be extended with a script, and so the fi-8150 was taught what should have worked “out of the box”: scanning to a shared folder. First, the script:

#!/usr/bin/perl
# Called after OCR; copies the PDF of a scanned document to an SMB share.
use lib qw(/home/cvs/archivista/jobs);
use AVJobs;
my ($host,$db,$user,$pw,$lnr) = @ARGV; # database access and document number
my $net = "/mnt/net"; # local mount point for the share
my $smbhost = "//192.168.0.161/pdfs"; # share path
my $smbuser = "avbox"; # user on the share
my $smbpwd = "archivista"; # password on the share
logit("open $db at $host with $user for $lnr");
my $dbh=MySQLOpen($host,$db,$user,$pw);
if ($dbh) {
  if (HostIsSlave($dbh)==0) {
    logit("login to $db was successful");
    my $lnr2 = ($lnr*1000)+1; # Seite key: document number * 1000 + 1
    my $sql = "select Quelle from archivbilder where Seite=$lnr2";
    my @row = $dbh->selectrow_array($sql); # $row[0] holds the PDF data
    if (length($row[0])>0) {
      logit("we got a pdf file for $lnr");
      mkdir $net if !-d $net;
      my $err=0;
      my $mounted = `df | grep $net`; # empty if the share is not mounted yet
      if ($mounted eq "") {
        my $cmd = "mount -t cifs $smbhost $net ".
          "-o username=$smbuser,password=$smbpwd";
        $err=system($cmd);
        logit("$err=>$cmd");
      } else {
        logit("$smbhost at $net already mounted");
      }
      if ($err==0) {
        my $fname = "$net/$db-$lnr.pdf";
        logit("try to write $fname");
        writeFile($fname,\$row[0]) if !-e "$fname"; # never overwrite a file
      }
    }
  }
  $dbh->disconnect();
}

Note that the values of smbhost, smbuser and smbpwd must match the share path (for example on the Windows computer or the Mac) and the user name and password configured there.
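How the three settings end up in the mount call can be sketched in isolation. The following is only an illustration, not part of the ArchivistaBox: the helper name cifs_mount_args is hypothetical, and the values are the example values from the script. It builds the same "mount -t cifs" call as a list of arguments, which would let Perl's system() run the command without a shell, so that special characters in the password are not interpreted by the shell.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of AVJobs): builds the argument list for
# the same "mount -t cifs ... -o username=...,password=..." call that the
# script issues as a single string.
sub cifs_mount_args {
    my ($share, $mountpoint, $user, $pwd) = @_;
    return ('mount', '-t', 'cifs', $share, $mountpoint,
            '-o', "username=$user,password=$pwd");
}

# Example values copied from the script; replace them with your own share.
my @cmd = cifs_mount_args('//192.168.0.161/pdfs', '/mnt/net',
                          'avbox', 'archivista');

# Only print the command here; mounting itself requires root rights.
print join(' ', @cmd), "\n";

# To mount for real, the list could be passed directly to system():
# system(@cmd) == 0 or warn "mount failed\n";
```

The sketch only prints the command, so it can be tried on any machine with Perl before the real values are entered into the script.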

Furthermore, the script must be stored in a location from which it is called directly after text recognition (OCR). The following path must be used on the ArchivistaBox:

/home/data/archivista/cust/autofields

Within this path, the file name of the script is derived from the database used for scanning: it consists of the database name followed by ‘ocr.pl’. For the database ‘archivista’, the script must therefore be named ‘archivistaocr.pl’.
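The naming rule can be written down in a few lines. This is a minimal sketch: the function name ocr_hook_path is made up for illustration, while the directory and the ‘ocr.pl’ suffix are the ones given above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the full path under which the script has to
# be stored for a given database (database name + "ocr.pl").
sub ocr_hook_path {
    my ($db) = @_;
    my $dir = '/home/data/archivista/cust/autofields';
    return "$dir/${db}ocr.pl";
}

print ocr_hook_path('archivista'), "\n";
# prints /home/data/archivista/cust/autofields/archivistaocr.pl
```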

If you now scan either via WebDMS or the keypad, the scanned pages are copied directly into the desired directory after text recognition. Because the text recognition itself can run across several CPU cores or even several computers, enormous speeds are possible here. On the ArchivistaBox Everest, for example, about 200 pages can be processed per second.
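Which files appear in the target directory follows from two lines of the script. The sketch below repeats them as stand-alone functions; the function names and the document number 42 are assumptions chosen purely for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Key used in the script's database query: document number * 1000 + 1.
sub first_page_key {
    my ($lnr) = @_;
    return $lnr * 1000 + 1;
}

# Name of the PDF written to the share:
# <mount point>/<database>-<document number>.pdf
sub target_pdf_name {
    my ($net, $db, $lnr) = @_;
    return "$net/$db-$lnr.pdf";
}

# Hypothetical document number 42 in the database 'archivista'.
print first_page_key(42), "\n";                              # prints 42001
print target_pdf_name('/mnt/net', 'archivista', 42), "\n";   # prints /mnt/net/archivista-42.pdf
```

Since the script writes a file only if no file of that name exists yet, a document that has already been exported is not overwritten.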

Quiz question: 40 lines or 600 MByte of software?

The above script comprises approx. 40 lines. Of course, the process can be adapted to any requirements (e.g. scanning directly to the executive’s PC). In contrast, the software supplied by Fujitsu requires over 600 MByte under Windows. The 600 MByte would probably be bearable, but the fact that no software at all is available for Mac or Linux severely limits its use.

In conclusion, the usefulness of the network interface of the 8000 series is severely limited because scanning is restricted to Windows. The bundle of an ArchivistaBox Dolder with a Fujitsu fi-7140 costs a little more than the Fujitsu fi-8150 scanner. In return, the combination of fi-7140 and ArchivistaBox can be integrated much more flexibly into any computing environment, even if the only goal is to create PDF files as efficiently as possible.

Postscript: Version 2023/I is only required if devices from Fujitsu’s 8000 series are to be used. For creating PDF files with devices of the 7000 series, the script can also be used on older ArchivistaBoxes.
