Training Image
The training image generation is the precursor to training. It creates an image in a specific format in an arbitrary font to allow it to read and collect data without specifying an individually labeled character image.
Generation
For reference, the following is an example of a training image with a font family of Comic Sans MS with otherwise default settings:
Due to the font string being hardcoded into the OCR, it can not currently be changed, though there is an open issue to add arbitrary alphabet support.
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~W W
Special Formatting Cases
The training string is meant to provide the most common ASCII characters that may be tested for, though there is an anomaly that one may notice at the end. W
's separated by a space are meant to give an estimate of the average spacing between characters. During training, when it hits the second W
in the string, it keeps track of the farthest right X coordinate. Then when it hits the last W
, it subtracts the beginning of that character with the stored X value to get the space width.