The OCR mapping dialog

To access this dialog:

  1. Using the XBOUND Management Center, open a console containing the Process Designer plug-in. (For information about working with consoles and with the Process Designer, see XBOUND Help.)

  2. Under Process Navigation, click the desired process.

  3. In the Process area, click Process Design.

  4. In the Process area, create or open an ABBYY FlexiCapture Extraction or RecoStar Professional Extraction process step.

  5. Click the OCR mapping button.

Use these settings to configure how documents are to be classified and extracted. In addition, these settings are used to map the read results to XBOUND fields.

Left pane

The Configuration column lists all document templates that are defined in ABBYY/RecoStar, together with their fields. The XBOUND object name column lists the XBOUND document types and fields to which the ABBYY/RecoStar objects are mapped.

Click a line to change the XBOUND mapping. The Choose XBOUND object dialog is displayed, where you can select the XBOUND objects to be mapped, add a new XBOUND object, or delete the XBOUND mapping. Objects to be created in XBOUND are displayed in blue italics; these are created after this dialog is closed.

Additional information about mapping for the RecoStar Professional Extraction activity:

  • RecoStar's BoxReadField objects are supported. BoxReadField is a recognition operator that searches for a (possibly subdivided) box within a specified geometrical zone and reads its contents. The result is stored in XBOUND as OCRResult and as the value of the field that you map to BoxReadField.

  • PixelCountField objects are also supported. This operator counts the black pixels in the region defined by the Zone property.

    Note

    • The value of the field that you map to PixelCountField is X if the value is between PixelMinRatio and PixelMaxRatio; otherwise it is O (the letter "O").

    • Confidence Rank is saved to FieldValue as "OK" if PixelCountField.IsInRange = true, otherwise "Reject".

    • The value of the PixelCountField field itself is written as OMRResult in EngineResult as follows:

      • OMRResult.Blackness = PixelCount.Percentage

      • OMRResult.Checked = PixelCount.IsInRange

      • OMRResult.Confidence = 1 if PixelCount.IsInRange = true, otherwise 0.
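
The PixelCountField rules in the note above can be summarized in a short sketch. This is only an illustration in Python, not XBOUND or RecoStar code: the classes `PixelCount` and `OMRResult` and the function `map_pixel_count` are hypothetical stand-ins, and the sketch assumes that IsInRange is true exactly when the value lies between PixelMinRatio and PixelMaxRatio.

```python
from dataclasses import dataclass


@dataclass
class PixelCount:
    """Hypothetical stand-in for a RecoStar PixelCountField result."""
    percentage: float   # share of black pixels in the zone
    is_in_range: bool   # assumed: value between PixelMinRatio and PixelMaxRatio


@dataclass
class OMRResult:
    """Hypothetical stand-in for the OMRResult written to EngineResult."""
    blackness: float
    checked: bool
    confidence: int


def map_pixel_count(pc: PixelCount) -> dict:
    """Apply the mapping rules from the note to one PixelCountField result."""
    return {
        # "X" if in range, otherwise the letter "O" (not the digit zero)
        "field_value": "X" if pc.is_in_range else "O",
        "confidence_rank": "OK" if pc.is_in_range else "Reject",
        "omr_result": OMRResult(
            blackness=pc.percentage,
            checked=pc.is_in_range,
            confidence=1 if pc.is_in_range else 0,
        ),
    }
```

For example, a result that is in range yields the field value X, the confidence rank "OK", and an OMRResult with Checked set to true and Confidence set to 1.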

Right pane

Classification

  • None: ABBYY/RecoStar is not used to classify the document/medium. For extraction, ABBYY uses the document template that is assigned to the current document type.
    (Sequence if you select this option when using ABBYY FlexiCapture Extraction.)

    1. XBOUND transfers all document media as one batch to ABBYY.

    2. If Check if number of images is same as in document definition is selected in the main activity settings, the activity calculates how many images are expected by the ABBYY document definition that is mapped to the current XBOUND document type (this equals the total number of pages of all document sections).

      • If the number of images of the current document is the same as the expected number, processing can start.

      • If the number is not the same, an error is thrown ("Number of pages [99] of document template is not the same as number of images at document [99]").

    3. The batch (current document) is processed in ABBYY.

    4. Classification results are applied as the document type unless Save classification result in field is selected. The XBOUND document type that is mapped to the classified ABBYY document definition is applied to the current document.

    5. The results of all document sections, field areas, fields, and tables are evaluated. If a field/table is mapped to an XBOUND field/table, the results are applied to this mapped field or table.

    6. All field and table cell values are applied as follows. For each field or cell, the image and the position on the image are applied. For text fields, the recognized characters and their confidence levels are applied as well.

      Field type           Applied value
      FT_Checkmark         AsBoolean
      FT_CheckmarkGroup    AsString
      FT_CurrencyField     AsString
      FT_DateTimeField     AsDateTime as LongDateString
      FT_Document          Not implemented (not applied to an XBOUND field)
      FT_Group             Not implemented (not applied to an XBOUND field)
      FT_NumberField       AsDouble
      FT_PageGroup         Not implemented (not applied to an XBOUND field)
      FT_PictureField      Not implemented (not applied to an XBOUND field)
      FT_TextField         AsString

      OMR field values are mapped to 0 or 1.

  • Classify document: ABBYY/RecoStar is used for classifying and extracting the document. For RecoStar processing of multi-page documents, see the description of the Process document as multi-page form setting in RecoStar Professional Extraction activity.
    (Sequence if you select this option when using ABBYY FlexiCapture Extraction.)

    1. XBOUND transfers all document media as one batch to ABBYY.

    2. The batch is processed in ABBYY, including classification.

    3. Classification results are applied as the document type unless Save classification result in field is selected. The XBOUND document type that is mapped to the classified ABBYY document definition is applied to the current document.

    4. In case of successful classification, all fields and tables are applied as described above under None.

  • Classify media: Only a document's media are classified. In this case, extraction is not possible.
    (Sequence if you select this option when using ABBYY FlexiCapture Extraction.)

    1. Each medium is transferred to ABBYY individually and processed on its own.

    2. The current medium is processed in ABBYY, including classification.

    3. The XBOUND document type that is mapped to the classified ABBYY document definition is applied to the current medium.

    No read results are applied.

  • Save classification result in field: The result of the ABBYY/RecoStar classification is saved in a field, and the document type is not changed. If you select this option, select a target field in the drop-down list. (The list is created automatically for each document type.)
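
Two parts of the None sequence for ABBYY FlexiCapture Extraction lend themselves to a short illustration: the image count check in step 2 and the way field values are applied in steps 5 and 6. The following Python sketch is not XBOUND or ABBYY code. All function names, the `CONVERTERS` table, and the tuple layout of `results` are hypothetical, and the long date format is only a rough stand-in because this topic does not specify the exact format.

```python
from datetime import datetime


def check_image_count(expected: int, actual: int) -> None:
    """Step 2: compare the expected page count with the number of images."""
    if expected != actual:
        raise ValueError(
            f"Number of pages [{expected}] of document template is not "
            f"the same as number of images at document [{actual}]"
        )


def long_date(value: datetime) -> str:
    # Rough stand-in for a long date string (assumption, format not specified).
    return value.strftime("%A, %B %d, %Y")


# Field types from the table; None marks types that are not implemented
# and therefore not applied to an XBOUND field.
CONVERTERS = {
    "FT_Checkmark": bool,          # expects an actual boolean, not a string
    "FT_CheckmarkGroup": str,
    "FT_CurrencyField": str,
    "FT_DateTimeField": long_date,
    "FT_Document": None,
    "FT_Group": None,
    "FT_NumberField": float,
    "FT_PageGroup": None,
    "FT_PictureField": None,
    "FT_TextField": str,
}


def apply_fields(results, mapping):
    """Steps 5 and 6: apply converted values to mapped XBOUND fields.

    results: iterable of (abbyy_field, field_type, raw_value) tuples.
    mapping: dict from ABBYY field name to XBOUND field name.
    """
    applied = {}
    for name, field_type, raw in results:
        convert = CONVERTERS.get(field_type)
        target = mapping.get(name)
        if convert is None or target is None:
            continue  # type not implemented, or field not mapped
        applied[target] = convert(raw)
    return applied
```

In this sketch, a picture field or an unmapped text field is simply skipped, which mirrors the rule that only mapped fields of implemented types receive a value.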

Non-printable characters

Specifies how non-displayable characters in the extraction results are handled. You can replace them with a question mark or delete them from the results.

This setting is not used in RecoStar.
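
The two options can be illustrated with a minimal Python sketch. It is not the product's implementation: the function name `clean_non_printable` and the mode names are hypothetical, and the sketch assumes that "non-displayable" corresponds to what Python's `str.isprintable()` reports as not printable (for example, control characters).

```python
def clean_non_printable(text: str, mode: str = "replace") -> str:
    """Replace non-printable characters with '?' or delete them."""
    if mode not in ("replace", "delete"):
        raise ValueError("mode must be 'replace' or 'delete'")
    out = []
    for ch in text:
        if ch.isprintable():
            out.append(ch)       # keep displayable characters unchanged
        elif mode == "replace":
            out.append("?")      # replace option
        # delete option: drop the character
    return "".join(out)
```

With the replace option, a control character inside a recognized string becomes a question mark; with the delete option, it is removed and the remaining characters close up.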

Filter

If the Hide information without XBOUND context option is selected, all entries that are not assigned to XBOUND objects are hidden.
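
As a rough illustration of the filter, the following Python sketch treats the left pane as a list of pairs. The function name `visible_entries` and the data layout are hypothetical; an entry without an XBOUND object is represented here by `None`.

```python
def visible_entries(entries, hide_unmapped: bool):
    """Return the entries to display in the left pane.

    entries: list of (configuration_name, xbound_object_or_None) pairs.
    """
    if not hide_unmapped:
        return list(entries)
    # Keep only entries that are assigned to an XBOUND object.
    return [entry for entry in entries if entry[1] is not None]
```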

ABBYY-specific information

The FC10 processor processing model has replaced the FC10 project processing model. As a result, the document definitions are extracted from the project file and transferred to the FC10 processor. Therefore, the project settings are not relevant anymore.

Deskewing and rotating images in ABBYY are not supported. If these image-processing functions are needed, perform them beforehand in separate XBOUND process steps. Images changed during processing are not applied to XBOUND; only read and classification results are applied. Deskewing or rotating images in ABBYY would therefore cause Verification to show the wrong field positions.

ABBYY FlexiCapture Extraction activity

RecoStar Professional Extraction activity

© 2018 Kofax, Inc. All rights reserved.