RecoStar HOCR for Linux (Beta)
The RECOSTAR_HOCR plugin improves OCR accuracy for non-English languages. It also provides more pre-OCR image processing flexibility for non-English languages.
This is the first part of a multi-phased approach to provide additional RecoStar-based plugins for ingestion, classification, extraction, and export. These will include, but may not be limited to, the following:
- Document import
- Barcode extraction
- Fixed form extraction
- Native ICR/OMR configuration
- Batch export
The RECOSTAR_HOCR plugin for Linux will remain in beta until it is certified for production or Ephesoft determines that it is feature complete based on customer feedback. The plugin is provided out of the box and is optional to use. Do not use the RECOSTAR_HOCR plugin in place of the OMNIPAGE_HOCR plugin unless a specific use case requires it.
Limitations
The RECOSTAR_HOCR plugin for Linux has the following known limitations:
- No EText support
- Performance is reduced by up to 30% compared to RECOSTAR_HOCR for Windows
- No barcode support; the Barcode Switch should remain OFF
Other RECOSTAR_HOCR plugin features that are available in the Windows version of the plugin are not yet supported in the Linux version. These include, but are not limited to, the following:
- Fixed form extraction
- Barcode extraction
- Native key-value snippet ICR extraction
- WebServices import or export capabilities
Prerequisites
To configure and use the RECOSTAR_HOCR plugin, the following configurations must be in place:
- You need Transact installed.
- You need a batch class with a document type configured. For detailed steps, see Add new document type.
- You need to add the RECOSTAR_HOCR plugin to the Page Process module for the batch class. For more detailed steps, see Modules and plugins. Remove any other HOCR plugins from the batch class Page Process module.
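The last prerequisite can be expressed as a simple check. The following is a minimal sketch, not a Transact API: it assumes you have written down the plugin names in your Page Process module as a plain Python list, and it assumes that HOCR plugin names end in `_HOCR`, as the two names used in this document do. The function name and `EXAMPLE_OTHER_PLUGIN` are hypothetical.

```python
def conflicting_hocr_plugins(page_process_plugins):
    """Return the HOCR plugins, other than RECOSTAR_HOCR, found in the list.

    Assumption: HOCR plugin names end with "_HOCR" (true of the two
    names mentioned in this document; not verified for other plugins).
    """
    return [
        name
        for name in page_process_plugins
        if name.endswith("_HOCR") and name != "RECOSTAR_HOCR"
    ]


# Hand-written list of plugin names; EXAMPLE_OTHER_PLUGIN is a placeholder.
plugins = ["OMNIPAGE_HOCR", "RECOSTAR_HOCR", "EXAMPLE_OTHER_PLUGIN"]

# Any name returned here should be removed from the Page Process module.
print(conflicting_hocr_plugins(plugins))
```

An empty result means that RECOSTAR_HOCR is the only HOCR plugin in the list.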
Configure the RECOSTAR_HOCR plugin
This section provides information on how to configure the RECOSTAR_HOCR plugin. This plugin only needs to be configured once per batch class.
To configure the plugin, do the following:
- From the Batch Class Management page, select and open your batch class.
- Go to Modules and select the Page Process module folder. The Plugin Configuration screen appears.
- From the Plugin Configuration screen, locate the RECOSTAR_HOCR plugin in the Associated Plugins pane.
- Select the plugin and click the Add Selected icon to move it to the Selected Plugins pane.
- Click Deploy.
- Expand the Page Process module folder and select the RECOSTAR_HOCR plugin. The Plugin Configuration screen appears.
The following table lists the configurable properties for this plugin.
| Configurable property | Options | Description |
| --- | --- | --- |
| Image OCR Recostar Project File Name | Fpr.rsp, Fpr_MultiLanguage.rsp, Fpr_Barcode.rsp | Specifies the RecoStar project file used to perform OCR. |
| Recostar Auto Rotate switch | ON, OFF | Auto-rotates the input images based on the orientation computed by the RecoStar project. |
| Recostar Switch | ON, OFF | Enables or disables the plugin. |
| Barcode Switch | ON, OFF | Reads the barcode from the input images using the barcode-enabled RecoStar project file Fpr_Barcode.rsp. Keep this switch set to OFF because of limitations in this beta. |
| Recostar Deskew Switch | ON, OFF | Determines whether input images are deskewed. |
| Recostar Font Switch | ON, OFF | Allows the user to detect any data that has been manually altered or added to the documents. The default is OFF. |
| OCR/Country/Language | Multiple countries and languages | Type the country, countries, language, or languages that need to be supported during OCR operations. When adding multiple values, separate each value with a semicolon (;) and no space. The system populates a drop-down menu when you start typing a value in the field. |
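To illustrate the OCR/Country/Language format, the following minimal sketch joins several values with a semicolon and no spaces. The helper name `build_language_value` and the sample names are assumptions made for this example; the values the field actually accepts are the ones offered in the drop-down menu.

```python
def build_language_value(names):
    """Join country/language names with ';' and no surrounding spaces."""
    # Drop empty entries and trim stray whitespace around each name.
    cleaned = [name.strip() for name in names if name.strip()]
    for name in cleaned:
        if ";" in name:
            # A semicolon inside a name would be read as a separator.
            raise ValueError(f"value must not contain ';': {name!r}")
    return ";".join(cleaned)


# Sample names are illustrative only; use the values from the drop-down menu.
value = build_language_value(["Germany", " France ", "Spanish"])
print(value)  # Germany;France;Spanish
```

The resulting string can be pasted into the field as a single value.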