GRAB -- GNU Recognition Apparatus for Barcodes
----------------------------------------------
version 0.0.4

This program reads a (scanned) image containing a barcode, locates and
decodes the barcode, and writes the decoded string to stdout.

Currently, only Code-128 is supported.

Only rawbits-PGM files are supported, but you may use Netpbm or ImageMagick
to convert images from other formats.

The program is designed to work well with the pbmscan program, so
you can pipe an image directly from your scanner to decode2, which does
the actual work. The result can be piped to the xclip program, which puts
it on the X clipboard. You may then paste the string into any X program.
The included Bash script xbarcode demonstrates this.

To compile, just type "make". To install, copy decode2 (and xbarcode, if
you like) somewhere in your PATH.

Some Matlab scripts are also included, if you want to play with them.

Internal workings:
------------------
The grayscale image is processed one scanline at a time. Each scanline is
first interpolated to a higher resolution, then thresholded into a binary
image, and finally the run widths of the alternating black and white
stripes are counted. All of this is done in a single step, so no memory is
needed for a high-resolution temporary image. The threshold may either be
the mean of the whole scanline, or it may be adjusted adaptively for each
pixel. The adaptive code makes it possible to recognize barcodes whose
lightness varies across the image, but unfortunately it makes recognition
of evenly lit barcodes more difficult, so it is not used by default.
(I ran some tests and, surprisingly, interpolation makes the program
recognize barcodes slightly worse, so it is also disabled by default.)
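
The mean-threshold variant of this step can be sketched roughly as
follows. This is only an illustration, not the code in decode2: the
function name scanline_to_runs and its parameters are made up for this
example, interpolation and adaptive thresholding are left out, and a
pixel is assumed to be black when it is darker than the scanline mean.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: threshold one grayscale scanline at its mean and
 * count the widths of the alternating black/white runs.
 * Returns the number of runs stored in `runs` (at most max_runs; the
 * remaining runs are dropped if the buffer fills up).
 * *first_is_black is set to 1 if runs[0] is a black run, else 0. */
size_t scanline_to_runs(const unsigned char *px, size_t n,
                        int *runs, size_t max_runs, int *first_is_black)
{
    assert(n > 0 && max_runs > 0);

    /* Threshold = mean gray value of the whole scanline. */
    unsigned long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += px[i];
    double threshold = (double)sum / (double)n;

    int prev = px[0] < threshold;   /* 1 = black, 0 = white */
    *first_is_black = prev;

    size_t count = 0;
    int width = 1;
    for (size_t i = 1; i < n; i++) {
        int cur = px[i] < threshold;
        if (cur == prev) {
            width++;                /* same colour: run continues */
        } else {
            if (count == max_runs)
                return count;       /* buffer full: stop early */
            runs[count++] = width;  /* colour changed: close the run */
            width = 1;
            prev = cur;
        }
    }
    if (count < max_runs)
        runs[count++] = width;      /* close the final run */
    return count;
}
```

Thresholding and run counting happen in the same loop, so no binary
image is ever stored, which matches the single-step idea described above.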

compute_mse() is a general function that computes the normalized mean
squared error (MSE) between a given code of the Code-128 code set and the
run widths, starting from a specified location. It is used in two places:
for finding the start of the barcode and for recognizing each code. The
function also returns the module width (the width of the narrowest possible
bar) that the run widths imply if the code really is present at that
location.
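
One plausible way to compute such an error is sketched below. The real
compute_mse() may normalize differently; the name pattern_mse, its
signature, and the choice to estimate the module width as (sum of run
widths) / (sum of pattern modules) are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: compare n observed run widths (in pixels) with a
 * bar/space pattern given in modules.
 * The module width is estimated from the totals, the runs are scaled
 * into module units, and the mean squared difference is returned.
 * The estimated module width is stored in *module_width. */
double pattern_mse(const int *runs, const int *pattern, size_t n,
                   double *module_width)
{
    assert(n > 0);

    int run_sum = 0, mod_sum = 0;
    for (size_t i = 0; i < n; i++) {
        run_sum += runs[i];
        mod_sum += pattern[i];
    }
    assert(run_sum > 0 && mod_sum > 0);

    /* Pixels per module, assuming the pattern really is present here. */
    double m = (double)run_sum / (double)mod_sum;

    double err = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = runs[i] / m - pattern[i];  /* error in module units */
        err += d * d;
    }

    *module_width = m;
    return err / (double)n;
}
```

For example, the run widths 6 3 3 6 3 12 match the Start B pattern
2 1 1 2 1 4 exactly (module width 3, MSE 0), while the same runs scored
against Start A (2 1 1 4 1 2) give a clearly larger error.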

The beginning of the barcode is searched for in the run widths. It may be
either one of the start codes 103--105 or the reversed end code 106, in
which case the barcode is assumed to be horizontally flipped. The MSE is
computed for all four possible codes at each position in the run widths.
The code and position that give the lowest MSE are selected, and the
associated module width is saved. (Reverse recognition is not yet
implemented.)
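
The search can be sketched as a sliding window over the run widths.
The names find_start, window_mse and struct start_match are invented for
this illustration, the error measure is an assumed one (runs scaled to
11 modules per symbol), and only the three forward start codes are
tried, since the reversed end code is not handled yet.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Bar/space widths (in modules) of the Code 128 start symbols. */
static const int START_PATTERNS[3][6] = {
    {2, 1, 1, 4, 1, 2},   /* 103: Start A */
    {2, 1, 1, 2, 1, 4},   /* 104: Start B */
    {2, 1, 1, 2, 3, 2},   /* 105: Start C */
};

/* Assumed error measure: scale 6 runs to 11 modules, then take the
 * mean squared difference against the pattern. */
static double window_mse(const int *runs, const int *pattern)
{
    int run_sum = 0;
    for (int i = 0; i < 6; i++)
        run_sum += runs[i];
    assert(run_sum > 0);
    double module = run_sum / 11.0;   /* start symbols are 11 modules wide */

    double err = 0.0;
    for (int i = 0; i < 6; i++) {
        double d = runs[i] / module - pattern[i];
        err += d * d;
    }
    return err / 6.0;
}

struct start_match {
    size_t pos;    /* index into runs where the start symbol begins */
    int code;      /* 103, 104 or 105 */
    double mse;    /* error of the best match */
};

/* Try every start pattern at every position and keep the lowest MSE. */
struct start_match find_start(const int *runs, size_t n)
{
    assert(n >= 6);
    struct start_match best = { 0, -1, DBL_MAX };
    for (size_t pos = 0; pos + 6 <= n; pos++) {
        for (int c = 0; c < 3; c++) {
            double mse = window_mse(runs + pos, START_PATTERNS[c]);
            if (mse < best.mse) {
                best.pos = pos;
                best.code = 103 + c;
                best.mse = mse;
            }
        }
    }
    return best;
}
```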

At this point, by default, some sanity checks are made. First, the selected
MSE is compared to a fixed limit (0.1). If the MSE is too big, we assume that
it isn't a real match and declare that no barcode was found. Second, the
ratio of the best MSE to the second best MSE is computed. If the two are too
close to each other, the confidence is low and the barcode is rejected in
this case as well.
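
The two checks can be sketched like this. The function select_match and
its parameters are hypothetical; the 0.1 limit comes from the text above,
but the limit for the best/second-best ratio is not stated here, so the
caller has to supply a value of its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch of the sanity checks.
 * mse[]     : MSE of each candidate match (n >= 2)
 * max_mse   : reject if the best MSE is above this (0.1 in the text)
 * max_ratio : reject if best/second-best is above this (assumed value)
 * Returns the index of the accepted candidate, or -1 if rejected. */
int select_match(const double *mse, size_t n,
                 double max_mse, double max_ratio)
{
    assert(n >= 2);

    /* Find the best and the second best candidate in one pass. */
    size_t best = 0, second = 1;
    if (mse[1] < mse[0]) {
        best = 1;
        second = 0;
    }
    for (size_t i = 2; i < n; i++) {
        if (mse[i] < mse[best]) {
            second = best;
            best = i;
        } else if (mse[i] < mse[second]) {
            second = i;
        }
    }

    /* Check 1: the best match must be good in absolute terms. */
    if (mse[best] > max_mse)
        return -1;

    /* Check 2: the best match must be clearly better than the runner-up.
     * Written as a product to avoid dividing by a zero MSE. */
    if (mse[best] > max_ratio * mse[second])
        return -1;

    return (int)best;
}
```

With max_mse = 0.1 and max_ratio = 0.5, the scores {0.5, 0.02, 0.4}
accept candidate 1, while {0.05, 0.04, 0.9} are rejected because the
two best scores are too close to each other.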

Now that the start of the barcode has been found, the rest of the barcode
is decoded. The MSE of each of the codes in the Code-128 set is computed
and the code with the lowest MSE is selected. Sanity checks are again done,
if desired, covering not only the MSE of the detected code but also its
module width compared to the module width of the start code.

When the end-of-the-barcode code is detected, the next phase verifies
the checksum and translates the codes into an ASCII string, which is
then printed.
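
As an illustration of this last phase: the Code 128 check symbol is the
start code plus the sum of each data code multiplied by its position
(counting from 1), taken modulo 103. The sketch below verifies that and
translates code set B only, where code values 0..95 map to ASCII 32..127.
The function names are invented for this example, and code sets A and C
as well as shift and switch codes are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* codes[] = start, data..., check, stop(106).
 * Returns 1 if the check symbol matches, 0 otherwise. */
int code128_checksum_ok(const int *codes, size_t n)
{
    assert(codes != NULL);
    if (n < 3)
        return 0;                     /* need start, check and stop */

    long sum = codes[0];              /* start code has weight 1 */
    for (size_t i = 1; i + 2 < n; i++)
        sum += (long)i * codes[i];    /* data code i has weight i */

    return sum % 103 == codes[n - 2];
}

/* Hypothetical decoder for code set B only.
 * Writes a NUL-terminated string to out and returns its length,
 * or -1 on an unsupported code or a too small buffer. */
int code128b_decode(const int *codes, size_t n, char *out, size_t out_size)
{
    assert(codes != NULL && out != NULL);
    if (n < 3 || codes[0] != 104)     /* 104 = Start B */
        return -1;

    size_t len = n - 3;               /* drop start, check and stop */
    if (out_size < len + 1)
        return -1;

    for (size_t i = 0; i < len; i++) {
        int v = codes[i + 1];
        if (v < 0 || v > 95)
            return -1;                /* function/shift codes: not handled */
        out[i] = (char)(v + 32);
    }
    out[len] = '\0';
    return (int)len;
}
```

For example, the code sequence 104 40 73 84 106 passes the check
(104 + 1*40 + 2*73 = 290, and 290 mod 103 = 84) and decodes to "Hi".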

Tricky part: my scanner is exactly 105 mm wide. The Finnish bank barcode
standard defines the maximum width of a barcode to be exactly 105 mm.
What a coincidence. Unfortunately, many barcodes violate the standard and
are 106 mm wide, which makes them impossible to scan in one pass. So there's
code that allows you to scan them in two passes: when only the left part of
a barcode is detected, it is compared to saved right parts of barcodes to
find the matching part, which allows building the whole barcode.
(This section is a to-do; not yet implemented.)

Benchmarking:
-------------
Define BENCHMARK in the C code, and run the program with bartest-0.1.pgm.
The program tries to recognize each scanline in the test image, and
displays how many were successful. A result image is written to
result.pgm: the original image plus a black mark at the right edge of
each scanline that was successfully recognized.

References:
-----------
xclip:		http://www.mercuryit.com.au/~kims/xclip/
GNU Barcode:	ftp://ftp.gnu.org/pub/gnu/barcode/
Netpbm:		???
Pbmscan:	???
Scanner driver:	http://www.ee.oulu.fi/~tuukkat/releases.html

- tuukkat@ee.oulu.fi 2001-11-04
