Use the OmniPage HOCR plugin to produce better OCR results.
Image enhancement support is based on the following assumptions:
Because image enhancements such as deskew are performed in the
Page Process module with Recostar, image enhancement with OmniPage
is likewise performed only in the Page Process module.
OmniPage despeckles the image by default during pre-processing. Despeckling detects and removes halftone
or dithering noise.
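To make the idea of despeckling concrete, here is a minimal sketch in Python. It is not OmniPage's algorithm, which is not described here. It assumes a binary image stored as a list of rows, where 1 is a black pixel and 0 is white, and it treats any black pixel with no black neighbours as noise. The function name despeckle is hypothetical.

```python
def despeckle(image):
    """Return a copy of image with isolated black pixels (1) cleared.

    Assumption for illustration: a "speck" is a black pixel that has
    no black pixel among its 8 neighbours. Real despeckling filters
    are more sophisticated.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            # Look at the 3x3 window around (y, x), clipped to the image.
            has_neighbour = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbour:
                result[y][x] = 0
    return result
```

In this sketch, a lone pixel in a corner of the page is cleared, while pixels that touch other black pixels, such as parts of a character stroke, are kept.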
Line removal removes horizontal and vertical rule lines from the image. If this feature is enabled, the lines are removed
before HOCR runs.
Line removal is governed by the OmniPage Auto Rotate/Deskew Switch option. This setting is specific to each
batch class.
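As an illustration of what removing rule lines involves, the following Python sketch clears long straight runs of black pixels. It is an assumption-based example, not the method OmniPage uses. The image is a list of rows with 1 for black and 0 for white, and the function name remove_rule_lines and the min_length parameter are hypothetical.

```python
def remove_rule_lines(image, min_length):
    """Return a copy of image with long horizontal and vertical runs cleared.

    Assumption for illustration: a rule line is an unbroken run of at
    least min_length black pixels (1) along a row or a column.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    to_clear = set()

    def mark_runs(lines):
        # Each line is a list of (y, x) coordinates along a row or column.
        for line in lines:
            run = []
            for y, x in line:
                if image[y][x] == 1:
                    run.append((y, x))
                else:
                    if len(run) >= min_length:
                        to_clear.update(run)
                    run = []
            if len(run) >= min_length:
                to_clear.update(run)

    mark_runs([[(y, x) for x in range(width)] for y in range(height)])  # rows
    mark_runs([[(y, x) for y in range(height)] for x in range(width)])  # columns

    result = [row[:] for row in image]
    for y, x in to_clear:
        result[y][x] = 0
    return result
```

With a suitable min_length, short strokes that belong to characters stay in place while table borders and underlines are cleared. A production implementation would also need to preserve character strokes that cross a rule line, which this sketch does not attempt.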