RecAPI
The module can recognize particular 1D and 2D barcodes.
The module BAR requires the Recognition Add-on. See the topic on Licensing in the General Information help system.
The BAR barcode recognition module consists of two parts: 1D barcode recognition and 2D barcode recognition. This topic presents only information common to the 1D and 2D BAR modules. The following two topics present information unique to each module.
Module name: | BAR |
Module identifier: | RM_BAR |
Filling methods supported: | FM_BARCODE, FM_BARCODE2D |
Filters supported: | ignores all filter settings |
Knowledge base files: | none |
Training file supported: | no |
In LETTER structure output, the character size (width, height), the character position (top, left) and the confidence data (err) of each recognized character carry barcode-level information, i.e. these values are identical for all characters deriving from the same barcode.
Coordinates (top, left, width, height) of all characters correspond to those of the barcode that contains them.
Barcode type is stored in the info member of the LETTER structure for each character.
Barcode orientation is stored in the makeup field of the LETTER structure for each character. See R_TEXT_ORIENT_MASK and related makeup bits.
The general rules also apply to this module: the LETTER array contains zero pixel wide end-line spaces (see handling of spaces), and the character codes read from the barcode are converted to Unicode (see Barcode code page handling). EOL control codes (0x0d, 0x0a) are converted to R_ENDOFLINE makeup flags (see end-position letters). Other control codes (0x00 .. 0x1F and 0x7F .. 0x9F) are shifted to another Unicode range as defined by the setting Kernel.OcrMgr.Codes.CtrlOffset. In rare cases, even a 0x20 space code may be shifted. (This shift is necessary when the recognized text is saved in an XML file, because control codes, including zero bytes, are not allowed there.)
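The control-code shift described above can be sketched as follows. This is only an illustration under stated assumptions: it assumes the shift is a plain addition of the offset, DEMO_CTRL_OFFSET is a made-up value standing in for whatever Kernel.OcrMgr.Codes.CtrlOffset is set to, and the demo_ helpers are hypothetical names, not SDK functions. The rare shifting of the 0x20 space and the conversion of EOL codes to R_ENDOFLINE are not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Made-up offset for illustration only; the real value is whatever the
   setting Kernel.OcrMgr.Codes.CtrlOffset holds. */
#define DEMO_CTRL_OFFSET 0xE000u

/* Returns 1 if the code lies in one of the control ranges
   0x00 .. 0x1F or 0x7F .. 0x9F. */
static int demo_is_control(uint32_t code)
{
    return code <= 0x1Fu || (code >= 0x7Fu && code <= 0x9Fu);
}

/* Shift a control code out of the control range (assumed: plain addition).
   Non-control codes are returned unchanged. */
static uint32_t demo_shift(uint32_t code, uint32_t offset)
{
    assert(offset > 0x9Fu); /* shifted codes must not land on control codes */
    return demo_is_control(code) ? code + offset : code;
}

/* Map a shifted code back to the original control code. Codes that are
   not shifted control codes are returned unchanged. */
static uint32_t demo_unshift(uint32_t code, uint32_t offset)
{
    if (code >= offset && demo_is_control(code - offset))
        return code - offset;
    return code;
}
```

An application that needs the original control bytes could apply something like demo_unshift to each character code it reads, using the offset that is actually configured.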
There is a special format for avoiding conversions: This is the binary output, see below.
The end of the barcode is marked with the flag R_ENDOFPARA on the last 0-width dummy space of the barcode. This way even multi-line barcodes (mainly 2D barcodes) can be processed: each text line is marked with R_ENDOFLINE, while the last 0-width space of the barcode is marked with both flags.
If the barcode is empty (it could happen with some 2D barcodes), a single UNICODE_MISSING code (0xfffc) is returned, followed by the dummy space.
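A minimal sketch of how these end markers could be used to count text lines and barcodes in a LETTER array. DemoLetter is a simplified stand-in for the SDK's LETTER structure (only the fields used here), and DEMO_ENDOFLINE and DEMO_ENDOFPARA are placeholder bit values; real code would use the LETTER type and the R_ENDOFLINE and R_ENDOFPARA constants from the SDK headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the SDK's LETTER structure. */
typedef struct {
    uint16_t code;    /* character code */
    uint16_t width;   /* 0 for the dummy end-line space */
    uint32_t makeup;  /* flag bits */
} DemoLetter;

/* Placeholder flag values, NOT the SDK's real constants. */
#define DEMO_ENDOFLINE 0x1u
#define DEMO_ENDOFPARA 0x2u

typedef struct {
    size_t lines;     /* number of text lines seen */
    size_t barcodes;  /* number of barcodes seen */
} DemoCounts;

/* Walk the array once: every end-of-line flag closes a text line,
   every end-of-paragraph flag closes a barcode. */
static DemoCounts demo_count(const DemoLetter *letters, size_t n)
{
    DemoCounts c = {0, 0};
    assert(letters != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (letters[i].makeup & DEMO_ENDOFLINE)
            c.lines++;
        if (letters[i].makeup & DEMO_ENDOFPARA)
            c.barcodes++;
    }
    return c;
}
```

The same loop structure can be used to split the array at each flagged dummy space instead of only counting.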
One-dimensional barcodes return digits or ASCII codes only, but 2D ones support more. They all have direct support for encoding accented characters in the ISO 8859-1 code page, which covers most Western languages. QR Code supports non-Latin scripts like Japanese by design, and in practice all 2D barcodes are used (sometimes unofficially) to encode any Unicode characters.
In its default configuration OmniPage tries to detect the code page in use automatically. This detection can sometimes fail, especially if a country-specific 8-bit code page is used. The user can force the use of a specific code page in the following way:
The above-mentioned 3 CodePage settings work similarly:
"Auto": automatic code page detection is to be performed even if kRecSetCodePage is used.
"Byte": no code page conversion is to be done. All control codes and spaces are shifted with Kernel.OcrMgr.Codes.CtrlOffset, other byte values are left unchanged.
The different default value handling for PDF417 is to maintain backwards compatibility. In new applications we suggest you use kRecSetStringSetting(sid, "Kernel.Ocr.BAR.bar2D.PDF417.CodePage", "Auto") to enable automatic code page detection in all cases.
The binary output can be forced globally by changing the setting Kernel.OcrMgr.BarBinary to true, or can be specified zone by zone with the kRecSetZoneBarTypes function. In binary mode the module assumes that the barcode contains binary data and skips the conversion of the result, i.e. it does not convert the codes to Unicode (e.g. a 2-byte long UTF-8 sequence will occupy 2 LETTERs), and control codes are also kept unchanged (e.g. even a binary 0 can be returned in a LETTER). Note that the LETTER array contains a dummy zero pixel wide end-line space even in this case. This output is recommended to be used together with the DTXT output method DTXT_BINARY or the kRecGetOCRZoneText function (which remove these end-line spaces).
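To illustrate the binary layout, the sketch below gathers the raw bytes of a binary-mode result while skipping the dummy end-line spaces. DemoLetter is a simplified stand-in for the SDK's LETTER structure, demo_collect_bytes is a hypothetical helper, and identifying the dummy space by its zero width is an assumption based on the description above. In practice DTXT_BINARY or kRecGetOCRZoneText already remove these spaces, so this only shows what that removal amounts to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the SDK's LETTER structure. */
typedef struct {
    uint16_t code;   /* in binary mode: one raw byte per letter */
    uint16_t width;  /* 0 for the dummy end-line space */
} DemoLetter;

/* Copy the raw bytes into out (at most cap bytes), skipping zero-width
   dummy spaces. Returns the number of bytes written. */
static size_t demo_collect_bytes(const DemoLetter *letters, size_t n,
                                 uint8_t *out, size_t cap)
{
    size_t used = 0;
    assert(out != NULL || cap == 0);
    for (size_t i = 0; i < n; i++) {
        if (letters[i].width == 0)
            continue;             /* dummy end-line space, not data */
        if (used == cap)
            break;                /* output buffer is full */
        out[used++] = (uint8_t)(letters[i].code & 0xFFu);
    }
    return used;
}
```

With this layout a 2-byte UTF-8 sequence such as 0xC3 0xA9 arrives as two letters, and a binary 0 is kept as an ordinary data byte.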
OPSDK provides a function (kRecCheckBarTypes) to check whether the elements of the enabled barcode type set are compatible with each other. Details about incompatible barcode types can be found in the notes of this function.
OmniPage SDK 19 introduced the possibility of enabling a barcode type set not only globally, but also zone by zone. This can be done with the function kRecSetZoneBarTypes for the specified user zone. The enabled barcode type selection can be retrieved for both user and OCR zones (kRecGetZoneBarTypes, kRecGetOCRZoneBarTypes).
In document processing workflows, special pages can be used for batch separation of different page flows. If the workflow uses barcodes for separation on such a special page, OPSDK provides a faster operation mode of the BAR module. To be fast and accurate enough, this mode requires stricter dimension and quality conditions (if the image resolution is 300 DPI, the bar height should be between 75 and 600 pixels). This batch separation mode can be switched on using the setting Kernel.OcrMgr.BarFastMode.
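A pre-check of the bar height before enabling the fast mode might look like the sketch below. The text only states the limits for 300 DPI; scaling them linearly with the resolution (so that the limits correspond to a fixed physical height) is an assumption of this example, and demo_fast_mode_height_ok is a hypothetical helper, not an SDK function.

```c
#include <assert.h>

/* Limits stated for 300 DPI images. */
#define DEMO_REF_DPI 300L
#define DEMO_MIN_H    75L
#define DEMO_MAX_H   600L

/* Returns 1 if the bar height fits the fast-mode range, 0 otherwise.
   ASSUMPTION: for resolutions other than 300 DPI the limits scale
   linearly with the DPI value. */
static int demo_fast_mode_height_ok(long height_px, long dpi)
{
    assert(dpi > 0);
    long scaled = height_px * DEMO_REF_DPI;  /* compare without division */
    return scaled >= DEMO_MIN_H * dpi && scaled <= DEMO_MAX_H * dpi;
}
```

Under this assumption a 150 pixel high bar on a 600 DPI image sits exactly on the lower limit, just like a 75 pixel high bar at 300 DPI.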
[...] FALSE.
[...] BAR_UPC_A type is enabled and the real barcode type is BAR_BOOKLAND, the result of the recognition will be a BAR_EAN type barcode (instead of NO_TXT_WARN) with BAR_FAMILY_FORCED_WARN. If BAR_EAN is also enabled the return value will be REC_OK.
[...] Bookland), but only BAR_UPC_A is enabled (which always starts with '0'), then the recognition result will be "9787442521744" with type BAR_EAN and with return value BAR_FAMILY_FORCED_WARN.
[...] "FNC1" function code, it has special meaning: it indicates that the barcode is a BAR_UCC128 and the following 2, 3 or 4 digits are an application identifier assigned by the Uniform Code Council. The "FNC1" string is not visible in the recognition result, but it is encoded in the barcode.
[...] BAR_UCC128 and the OCR result is "800309412345123452". (The '(' and ')' characters are not encoded in the barcode lines). The engine detects the "FNC1" prefix. If neither BAR_UCC128 nor BAR_C128 types are enabled (but any other family member is), the result depends on the setting BarEnableFamily. If it is set to TRUE, the recognition result is "FNC1800309412345123452" with type BAR_C128 and with return value BAR_FAMILY_FORCED_WARN. If it is FALSE, the return value is NO_TXT_WARN and there is no recognition result. If BAR_UCC128 is enabled the engine retrieves "800309412345123452" with the return value REC_OK.
BAR_C39_EXT can contain special characters with special meaning. For instance '+' character followed by a capital letter indicates the lower case pair of that letter if BAR_C39_EXT is enabled. But the '+' character followed by a number is not a valid combination for BAR_C39_EXT, but is valid for BAR_C39. If BAR_C39_EXT is enabled but BAR_C39 is not enabled, the result contains both the '+' and the following character with the type BAR_C39 and with the return value BAR_FAMILY_FORCED_WARN. However if the setting BarEnableFamily is set to FALSE, the return value is NO_TXT_WARN and there is no recognition result for the invalid character combinations.
Barcode type | EAN | UPC_A | UPC_E | ITF | C39 | C39_EXT | C128 | CB | POSTNET | A2of5 | UCC128 | 2of5 | C93 | PATCH | PDF417 | PLANET | Code32 | DMATRIX | C39_NSS | QR | MAT25 | CODE11 | ITAPOST25 | MSI | BOOKLAND | ITF14 | EAN14 | SSCC18 | DB_LTD | DB_EXP | 4S_USPS | 4S_AUSPOST |
EAN | + | + | + | + | + | + | + | + | - | + | + | + | + | - | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
UPC_A | + | + | + | + | + | + | + | + | - | + | + | + | + | - | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
UPC_E | + | + | + | + | + | + | + | + | - | + | + | + | + | - | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
ITF | + | + | + | + | + | + | + | + | - | - | + | - | + | - | + | - | + | + | - | + | - | + | - | - | + | - | + | + | - | - | - | - |
C39 | + | + | + | + | + | + | + | + | - | + | + | + | + | - | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
C39_EXT | + | + | + | + | + | + | + | + | - | + | + | + | + | - | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
C128 | + | + | + | + | + | + | + | + | - | + | + | + | + | - | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
CB | + | + | + | + | + | + | + | + | - | + | + | + | + | - | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
POSTNET | - | - | - | - | - | - | - | - | + | - | - | - | - | - | + | - | - | + | - | + | - | - | - | - | - | - | - | - | - | - | + | + |
A2of5 | + | + | + | - | + | + | + | + | - | + | + | - | + | - | + | - | + | + | - | + | - | + | - | - | + | - | + | + | - | - | - | - |
UCC128 | + | + | + | + | + | + | + | + | - | + | + | + | + | - | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
2of5 | + | + | + | - | + | + | + | + | - | - | + | + | + | - | + | - | + | + | - | + | - | + | - | - | + | - | + | + | - | - | - | - |
C93 | + | + | + | + | + | + | + | + | - | + | + | + | + | - | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
PATCH | - | - | - | - | - | - | - | - | - | - | - | - | - | + | + | - | + | + | - | + | - | - | - | - | - | - | - | - | - | - | - | - |
PDF417 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | - | + | + | - | + | + | + | + | + | + | - | - |
PLANET | - | - | - | - | - | - | - | - | - | - | - | - | - | - | + | + | - | + | - | + | - | - | - | - | - | - | - | - | - | - | + | + |
Code 32 | + | + | + | + | + | + | + | + | - | + | + | + | + | + | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
DMATRIX | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | - | + | + | - | + | + | + | + | + | + | - | - |
C39_NSS | - | - | - | - | - | - | - | - | - | - | - | - | - | - | + | - | - | + | + | + | - | - | - | - | - | - | - | - | - | - | - | - |
QR | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | - | + | + | - | + | + | + | + | + | + | - | - |
MAT25 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | + | - | - | - | - | - | - | - | - | - | - | - |
CODE11 | + | + | + | + | + | + | + | + | - | + | + | + | + | - | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
ITAPOST25 | + | + | + | - | + | + | + | + | - | - | + | - | + | - | + | - | + | + | - | + | - | + | + | - | + | - | + | + | - | - | - | - |
MSI | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | + | - | - | - | - | - | - | - | - |
BOOKLAND | + | + | + | + | + | + | + | + | - | + | + | + | + | - | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
ITF14 | + | + | + | - | + | + | + | + | - | - | + | - | + | - | + | - | + | + | - | + | - | + | - | - | + | + | + | + | - | - | - | - |
EAN14 | + | + | + | + | + | + | + | + | - | + | + | + | + | - | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
SSCC18 | + | + | + | + | + | + | + | + | - | + | + | + | + | - | + | - | + | + | - | + | - | + | + | - | + | + | + | + | - | - | - | - |
DATABAR_LTD | - | - | - | - | - | - | - | - | - | - | - | - | - | - | + | - | - | + | - | + | - | - | - | - | - | - | - | - | + | + | - | - |
DATABAR_EXP | - | - | - | - | - | - | - | - | - | - | - | - | - | - | + | - | - | + | - | + | - | - | - | - | - | - | - | - | + | + | - | - |
4STATE_USPS | - | - | - | - | - | - | - | - | + | - | - | - | - | - | - | + | - | - | - | - | - | - | - | - | - | - | - | - | - | - | + | - |
4STATE_AUSPOST | - | - | - | - | - | - | - | - | + | - | - | - | - | - | - | + | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | + |