RecAPI
BAR barcode recognition module

The module can recognize particular 1D and 2D barcodes.

The module BAR requires the Recognition Add-on. See the topic on Licensing in the General Information help system.

The BAR barcode recognition module consists of two parts: 1D barcode recognition and 2D barcode recognition. This topic presents only the information common to the 1D and 2D BAR modules. The following two topics present information unique to each module.

Module name: BAR
Module identifier: RM_BAR
Filling methods supported: FM_BARCODE, FM_BARCODE2D
Filters supported: ignores all filter settings
Knowledge base files: none
Training file supported: no

Output

In the LETTER structure output, the character size (width, height), the character position (top, left) and the confidence data (err) of each recognized character carry barcode-level information, i.e. these values are identical for all characters deriving from the same barcode.

Coordinates (top, left, width, height) of all characters correspond to those of the barcode that contains them.

Barcode type is stored in the info member of the LETTER structure for each character.

Barcode orientation is stored in the makeup field of the LETTER structure for each character. See R_TEXT_ORIENT_MASK and related makeup bits.

The general rules also apply to this module: the LETTER array contains zero-pixel-wide end-line spaces (see handling of spaces), and the character codes read from the barcode are converted to Unicode (see Barcode code page handling). EOL control codes (0x0d, 0x0a) are converted to R_ENDOFLINE makeup flags (see end-position letters). Other control codes (0x00 .. 0x1F and 0x7F .. 0x9F) are shifted to another Unicode range as defined by the setting Kernel.OcrMgr.Codes.CtrlOffset. In rare cases, even a 0x20 space code may be shifted. (This shift is necessary when the recognized text is saved in an XML file, because control codes, including zero bytes, are not allowed there.)
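The control-code shift can be illustrated with a minimal sketch. It assumes that the shift is a plain addition of the offset, and it uses a made-up offset value: the real value of Kernel.OcrMgr.Codes.CtrlOffset is not stated in this topic. The function names are hypothetical and are not part of the API. The rare 0x20 case is left out.

```python
# Hypothetical offset, for illustration only; the real value comes from
# the setting Kernel.OcrMgr.Codes.CtrlOffset.
CTRL_OFFSET = 0xE000


def is_shifted_control(code: int) -> bool:
    """True for control codes in 0x00..0x1F and 0x7F..0x9F, except EOL codes.

    0x0D and 0x0A are excluded because they are turned into R_ENDOFLINE
    makeup flags instead of being shifted.
    """
    if code in (0x0A, 0x0D):
        return False
    return 0x00 <= code <= 0x1F or 0x7F <= code <= 0x9F


def shift_controls(codes, offset=CTRL_OFFSET):
    """Add `offset` to every control code; leave all other codes unchanged."""
    return [c + offset if is_shifted_control(c) else c for c in codes]


def unshift_controls(codes, offset=CTRL_OFFSET):
    """Undo shift_controls, assuming no genuine text uses the shifted range."""
    return [c - offset if is_shifted_control(c - offset) else c for c in codes]
```

An application that needs the original control codes back (for example a GS separator, 0x1D) could apply the reverse mapping to the code values it reads from the LETTER array.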

A special format, the binary output, avoids these conversions; see Binary output below.

The end of the barcode is marked with the flag R_ENDOFPARA on the last 0-width dummy space of the barcode. This way even multi-line barcodes (mainly 2D barcodes) can be processed: each text line is marked with R_ENDOFLINE, while the last 0-width space of the barcode is marked with both flags.

If the barcode is empty (this can happen with some 2D barcodes), a single UNICODE_MISSING code (0xfffc) is returned, followed by the dummy space.
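The marker rules above can be sketched as a small splitter that groups characters into barcodes and text lines. The Letter class below is a simplified stand-in for the LETTER structure (only code, width and makeup are modelled), and the two flag values are hypothetical; the real bit values come from the SDK headers. The sketch also assumes that the R_ENDOFLINE flag sits on the zero-width dummy space.

```python
from dataclasses import dataclass

R_ENDOFLINE = 0x01       # hypothetical bit value
R_ENDOFPARA = 0x02       # hypothetical bit value
UNICODE_MISSING = 0xFFFC  # value given in the text above


@dataclass
class Letter:
    """Simplified stand-in for LETTER: character code, width, makeup bits."""
    code: int
    width: int = 10
    makeup: int = 0


def split_barcodes(letters):
    """Return a list of barcodes, each a list of text lines (strings)."""
    barcodes, lines, current = [], [], []
    for lt in letters:
        is_dummy_space = lt.code == 0x20 and lt.width == 0
        if not is_dummy_space:
            # The placeholder of an empty barcode contributes no text.
            if lt.code != UNICODE_MISSING:
                current.append(chr(lt.code))
            continue
        if lt.makeup & (R_ENDOFLINE | R_ENDOFPARA):
            lines.append("".join(current))  # close the current text line
            current = []
        if lt.makeup & R_ENDOFPARA:
            barcodes.append(lines)          # close the current barcode
            lines = []
    return barcodes
```

For a two-line barcode "AB" / "C" followed by an empty barcode, this yields one entry with two lines and one entry with a single empty line.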

Barcode code page handling

One-dimensional barcodes return digits or ASCII codes only, but 2D ones support more. They all have direct support for encoding accented characters in the ISO 8859-1 code page covering most Western languages. QR Code supports non-Latin scripts like Japanese by design, and in practice all 2D barcodes are used (sometimes unofficially) to encode any Unicode characters.

In its default configuration OmniPage tries to detect the code page used automatically. This detection can sometimes fail, especially if a country-specific 8-bit code page is used. The user can force the use of a specific code page in the following way:

The three CodePage settings mentioned above work similarly:

  • Their default value is an empty string, meaning that automatic code page detection is to be performed. Note that PDF417 does auto detection only when kRecSetCodePage is not called.
  • If the value is "Auto", automatic code page detection is to be performed even if kRecSetCodePage is used.
  • If the value is "Byte", no code page conversion is to be done. All control codes and spaces are shifted with Kernel.OcrMgr.Codes.CtrlOffset, other byte values are left unchanged.
  • The value can also be any code page name known by CSDK; see Code Pages in the Engine. Either hard-coded or derived code pages can be used. A derived code page can also be a user-customized one.

The different default value handling for PDF417 exists to maintain backwards compatibility. In new applications we suggest using kRecSetStringSetting(sid, "Kernel.Ocr.BAR.bar2D.PDF417.CodePage", "Auto") to enable automatic code page detection in all cases.
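The rules for a CodePage setting value can be summarized in a small decision function. This is only a sketch of the description above, not of the engine's code: the function name and the returned labels are hypothetical, the comparison is assumed to be exact (case-sensitive), and the "global" branch reflects the reading that PDF417 follows the code page set with kRecSetCodePage when that function was called.

```python
def resolve_code_page(setting: str, is_pdf417: bool,
                      set_code_page_called: bool) -> str:
    """Map a CodePage setting value to the conversion mode it selects.

    Returns "auto" (automatic detection), "byte" (no conversion),
    "global" (code page chosen with kRecSetCodePage; an interpretation)
    or "named:<name>" for an explicitly named code page.
    """
    if setting == "":
        # Default value: auto detection, except that PDF417 auto-detects
        # only when kRecSetCodePage was not called.
        if is_pdf417 and set_code_page_called:
            return "global"
        return "auto"
    if setting == "Auto":
        return "auto"   # auto detection even if kRecSetCodePage was used
    if setting == "Byte":
        return "byte"   # no code page conversion
    return "named:" + setting
```

With this model, setting the PDF417 value to "Auto" makes all barcode types behave the same way, which is the motivation for the recommendation above.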

Binary output

The binary output can be forced globally by changing the setting Kernel.OcrMgr.BarBinary to true, or can be specified zone by zone with the kRecSetZoneBarTypes function. In binary mode the module assumes that the barcode contains binary data and skips the conversion of the result, i.e. it does not convert the codes to Unicode (e.g. a 2-byte long UTF-8 sequence will occupy 2 LETTERs), and control codes are also kept unchanged (e.g. even a binary 0 can be returned in a LETTER). Note that the LETTER array contains a dummy zero-pixel-wide end-line space even in this case. We recommend using this output together with the DTXT output method DTXT_BINARY or the kRecGetOCRZoneText function (which remove these end-line spaces).
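As an illustration of the binary mode, the sketch below rebuilds the raw payload from a list of (code, width) pairs standing in for LETTERs. The helper name is hypothetical, and the sketch assumes that the dummy end-line space can be recognized as a 0x20 code with zero width, while a real space byte in the data keeps the barcode's non-zero width.

```python
def binary_payload(letters):
    """Rebuild the raw bytes of a barcode read in binary mode.

    `letters` is an iterable of (code, width) pairs; every code is one
    byte of the barcode data. Zero-width 0x20 entries are treated as the
    dummy end-line spaces and dropped.
    """
    return bytes(code for code, width in letters
                 if not (code == 0x20 and width == 0))
```

For example, the UTF-8 sequence 0xC3 0xA9 arrives as two separate entries, and the application decodes it itself:

```python
raw = binary_payload([(0xC3, 12), (0xA9, 12), (0x00, 12), (0x20, 12), (0x20, 0)])
text = raw[:2].decode("utf-8")  # the two bytes form the single character "é"
```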

Different barcode types in one zone

OPSDK provides a function (kRecCheckBarTypes) to check whether the elements of the enabled barcode type set are compatible with each other. Details about the incompatible barcode types can be found in the notes for this function.

Zone-by-zone barcode type enabling

OmniPage SDK 19 introduced the possibility of enabling a barcode type set not only globally, but also zone by zone. This can be done for a specified user zone with the function kRecSetZoneBarTypes. The enabled barcode type selection can be retrieved for both user and OCR zones (kRecGetZoneBarTypes, kRecGetOCRZoneBarTypes).

Batch separation support

In document processing workflows, special pages can be used for batch separation of different page flows. If the workflow uses barcodes for separation on such a special page, OPSDK provides a faster operation mode of the BAR module. To be fast and accurate enough, this mode requires stricter dimension and quality conditions (if the image resolution is 300 DPI, the bar height should be between 75 and 600 pixels). This batch separation mode can be switched on using the setting Kernel.OcrMgr.BarFastMode.
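A pre-check of the bar height could look like the sketch below. The limits of 75 and 600 pixels are stated above for 300 DPI only; scaling them linearly to other resolutions (i.e. treating them as 0.25 to 2 inches) is an assumption of this sketch, and the function name is hypothetical.

```python
def bar_height_in_range(height_px: int, dpi: int = 300) -> bool:
    """Check a bar height against the fast-mode limits.

    75..600 pixels at 300 DPI; for other resolutions the limits are
    scaled proportionally, which is an assumption, not a documented rule.
    Integer arithmetic avoids rounding issues at the boundaries.
    """
    return 75 * dpi <= height_px * 300 <= 600 * dpi
```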

Tips for easier use of BAR module

  1. 1D and 2D barcode types can be recognized in one step in a zone: it is no longer necessary to use the filling methods FM_BARCODE and FM_BARCODE2D separately. The FM_BARCODE filling method makes it possible to search for and recognize 1D and 2D barcodes at the same time in the same zone. See the setting Kernel.OcrMgr.BarEnable1D2D.
  2. If you do not know your barcode type, we recommend using the kRecGetAutoBarTypes function. This function gets the largest set of barcode types that can be used simultaneously for automatic detection.
  3. Full-page barcode zones: If the default recognition module is RM_BAR and there is no user zone or OCR zone at all, the engine puts a full-page barcode zone on the page and searches for barcodes within it.
  4. Some non-trivial barcode combinations are also possible:
    1. 1D and 2D barcode types can be used together
    2. Some less typical barcode types can be used together with 2D barcodes. E.g. a BAR_PATCH barcode can be used together with any 2D barcode.
    3. BAR_EAN, BAR_UPC_A and BAR_BOOKLAND can be used together. An automatic detection specifies the real type.
    4. BAR_C39 and BAR_C39_EXT can be used together. An automatic detection specifies the real type. (Can be disabled with the setting Kernel.OcrMgr.BarCheckBarTypes.Enable.C39_C39EXT.)
    5. BAR_C39 and BAR_C32 can be used together if the setting Kernel.OcrMgr.BarCheckBarTypes.Enable.C39_C32 is enabled. In that case an automatic detection specifies the real type. Note that some 6-character long C39 barcodes can be misdetected as C32 due to the nature of C32 encoding.
    6. BAR_C128, BAR_UCC128, BAR_EAN14 and BAR_SSCC18 can be used together. An automatic detection specifies the real type.
  5. If the real barcode type is not enabled but another member of its family is, the engine can still recognize the barcode. In such a case it is more useful to return a result with a type from the family than to return nothing, so the "main" member of the family is reported as the recognized type. If this "main" member was not enabled, the return value of the recognition is BAR_FAMILY_FORCED_WARN. To disable this behavior, set Kernel.OcrMgr.BarEnableFamily to FALSE.
    For example:
    1. EAN Family - BAR_EAN ("main" member), BAR_UPC_A, BAR_BOOKLAND:
      If only the BAR_UPC_A type is enabled and the real barcode type is BAR_BOOKLAND, the result of the recognition will be a BAR_EAN type barcode (instead of NO_TXT_WARN) with BAR_FAMILY_FORCED_WARN. If BAR_EAN is also enabled, the return value will be REC_OK.
      For instance, if the barcode on the page is "9787442521744" (which is a Bookland) but only BAR_UPC_A is enabled (which always starts with '0'), the recognition result will be "9787442521744" with type BAR_EAN and with return value BAR_FAMILY_FORCED_WARN.
    2. Code-128 Family - BAR_C128 ("main" member), BAR_UCC128, BAR_SSCC18, BAR_EAN14:
      If the barcode starts with "FNC1" function code, it has special meaning: it indicates that the barcode is a BAR_UCC128 and the following 2, 3 or 4 digits are an application identifier assigned by the Uniform Code Council. The "FNC1" string is not visible in the recognition result, but it is encoded in the barcode.
      For instance, the string "(8003)09412345123452" is written below the barcode, and the barcode is encoded as "FNC1800309412345123452". This indicates that the barcode is a BAR_UCC128 and the OCR result is "800309412345123452". (The '(' and ')' characters are not encoded in the barcode lines.) The engine detects the "FNC1" prefix. If neither BAR_UCC128 nor BAR_C128 is enabled (but another family member is), the result depends on the setting BarEnableFamily. If it is set to TRUE, the recognition result is "FNC1800309412345123452" with type BAR_C128 and with return value BAR_FAMILY_FORCED_WARN. If it is FALSE, the return value is NO_TXT_WARN and there is no recognition result. If BAR_UCC128 is enabled, the engine retrieves "800309412345123452" with the return value REC_OK.
    3. Code-39 Family - BAR_C39 ("main" member), BAR_C39_EXT, BAR_C32: BAR_C39_EXT can contain special characters with special meaning. For instance, a '+' character followed by a capital letter indicates the lower case pair of that letter if BAR_C39_EXT is enabled. However, a '+' character followed by a digit is not a valid combination for BAR_C39_EXT, although it is valid for BAR_C39. If BAR_C39_EXT is enabled but BAR_C39 is not, the result contains both the '+' and the following character, with the type BAR_C39 and with the return value BAR_FAMILY_FORCED_WARN. However, if the setting BarEnableFamily is set to FALSE, the return value is NO_TXT_WARN and there is no recognition result for the invalid character combinations.
  6. kRecCheckBarTypes works according to the table of barcode type combinations below. Yellow items depend on different settings, see above.
EAN UPC_A UPC_E ITF C39 C39_EXT C128 CB POSTNET A2of5 UCC128 2of5 C93 PATCH PDF417 PLANET Code32 DMATRIX C39_NSS QR MAT25 CODE11 ITAPOST25 MSI BOOKLAND ITF14 EAN14 SSCC18 DB_LTD DB_EXP 4S_USPS 4S_AUSPOST
EAN + + + + + + + + - + + + + - + - + + - + - + + - + + + + - - - -
UPC_A + + + + + + + + - + + + + - + - + + - + - + + - + + + + - - - -
UPC_E + + + + + + + + - + + + + - + - + + - + - + + - + + + + - - - -
ITF + + + + + + + + - - + - + - + - + + - + - + - - + - + + - - - -
C39 + + + + + + + + - + + + + - + - + + - + - + + - + + + + - - - -
C39_EXT + + + + + + + + - + + + + - + - + + - + - + + - + + + + - - - -
C128 + + + + + + + + - + + + + - + - + + - + - + + - + + + + - - - -
CB + + + + + + + + - + + + + - + - + + - + - + + - + + + + - - - -
POSTNET - - - - - - - - + - - - - - + - - + - + - - - - - - - - - - + +
A2of5 + + + - + + + + - + + - + - + - + + - + - + - - + - + + - - - -
UCC128 + + + + + + + + - + + + + - + - + + - + - + + - + + + + - - - -
2of5 + + + - + + + + - - + + + - + - + + - + - + - - + - + + - - - -
C93 + + + + + + + + - + + + + - + - + + - + - + + - + + + + - - - -
PATCH - - - - - - - - - - - - - + + - + + - + - - - - - - - - - - - -
PDF417 + + + + + + + + + + + + + + + + + + + + - + + - + + + + + + - -
PLANET - - - - - - - - - - - - - - + + - + - + - - - - - - - - - - + +
Code32 + + + + + + + + - + + + + + + - + + - + - + + - + + + + - - - -
DMATRIX + + + + + + + + + + + + + + + + + + + + - + + - + + + + + + - -
C39_NSS - - - - - - - - - - - - - - + - - + + + - - - - - - - - - - - -
QR + + + + + + + + + + + + + + + + + + + + - + + - + + + + + + - -
MAT25 - - - - - - - - - - - - - - - - - - - - + - - - - - - - - - - -
CODE11 + + + + + + + + - + + + + - + - + + - + - + + - + + + + - - - -
ITAPOST25 + + + - + + + + - - + - + - + - + + - + - + + - + - + + - - - -
MSI - - - - - - - - - - - - - - - - - - - - - - - + - - - - - - - -
BOOKLAND + + + + + + + + - + + + + - + - + + - + - + + - + + + + - - - -
ITF14 + + + - + + + + - - + - + - + - + + - + - + - - + + + + - - - -
EAN14 + + + + + + + + - + + + + - + - + + - + - + + - + + + + - - - -
SSCC18 + + + + + + + + - + + + + - + - + + - + - + + - + + + + - - - -
DATABAR_LTD - - - - - - - - - - - - - - + - - + - + - - - - - - - - + + - -
DATABAR_EXP - - - - - - - - - - - - - - + - - + - + - - - - - - - - + + - -
4STATE_USPS - - - - - - - - + - - - - - - + - - - - - - - - - - - - - - + -
4STATE_AUSPOST - - - - - - - - + - - - - - - + - - - - - - - - - - - - - - - +
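A pairwise lookup against the table can be sketched as follows. Only a small excerpt of the table is encoded (EAN, ITF, A2of5, POSTNET, PDF417 and MSI, with the '+' entries copied from the rows above), and the function name is hypothetical. The sketch checks pairs only; kRecCheckBarTypes may apply further rules, such as the setting-dependent entries, that are not modelled here.

```python
from itertools import combinations

# Types covered by this excerpt of the table.
KNOWN = {"EAN", "ITF", "A2of5", "POSTNET", "PDF417", "MSI"}

# Unordered pairs marked '+' in the table for the types above.
COMPATIBLE = {frozenset(pair) for pair in [
    ("EAN", "ITF"), ("EAN", "A2of5"), ("EAN", "PDF417"),
    ("ITF", "PDF417"), ("A2of5", "PDF417"), ("POSTNET", "PDF417"),
]}


def incompatible_pairs(enabled):
    """Return the sorted list of type pairs that the table marks with '-'."""
    unknown = set(enabled) - KNOWN
    if unknown:
        raise KeyError("types outside this excerpt: %s" % sorted(unknown))
    return [(a, b) for a, b in combinations(sorted(enabled), 2)
            if frozenset((a, b)) not in COMPATIBLE]
```

An empty result means that every pair in the set is marked compatible. For instance, EAN with ITF and PDF417 passes, while ITF together with A2of5 is reported.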
Note
See BAR Recognition Engine Module.
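The family fallback described in tip 5 can be modelled with a short sketch. It covers only the choice of reported type and return value, based on the three families listed above. The content-dependent cases (the "FNC1" prefix, invalid '+' combinations in Code 39) are not modelled. The function name is hypothetical, and the return values are shown as strings, not as the SDK's constants.

```python
# Families as listed in tip 5; the key is the "main" member.
FAMILIES = {
    "BAR_EAN": {"BAR_EAN", "BAR_UPC_A", "BAR_BOOKLAND"},
    "BAR_C128": {"BAR_C128", "BAR_UCC128", "BAR_SSCC18", "BAR_EAN14"},
    "BAR_C39": {"BAR_C39", "BAR_C39_EXT", "BAR_C32"},
}


def reported_result(real_type, enabled, enable_family=True):
    """Return (reported type or None, return value) for a barcode.

    `enable_family` stands for the setting Kernel.OcrMgr.BarEnableFamily.
    """
    enabled = set(enabled)
    if real_type in enabled:
        return real_type, "REC_OK"
    if enable_family:
        for main, members in FAMILIES.items():
            # The real type is disabled, but a relative of it is enabled.
            if real_type in members and enabled & members:
                rc = "REC_OK" if main in enabled else "BAR_FAMILY_FORCED_WARN"
                return main, rc
    return None, "NO_TXT_WARN"
```

With only BAR_UPC_A enabled, a Bookland barcode is reported as BAR_EAN with BAR_FAMILY_FORCED_WARN, matching the EAN family example in tip 5.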