OmniPage Zone Profile Settings window
Use this window to set the properties for OmniPage zone recognition. The settings are organized into groups, which are described after the list of buttons.
The following buttons are available at the bottom of this window:
Button | Description |
---|---|
OK | Closes the window and saves your changes. |
Cancel | Closes the window without saving your changes. |
Help (icon) | Displays the help for the open window. |
Languages
This setting enables you to select one or more languages. You can combine any languages. Only the English language is selected by default.
As you select additional languages, the language count in the header increases. Because not all languages are visible in the list at once, the count keeps you aware that languages further down the list are selected.
If you have purchased the Asian Plus OCR add-on, additional languages are supported.
General Settings
This group has the following settings:
- Character Type
You can filter the type of extraction used for a zone by selecting one of the following values.
- All. (Default: Selected)
- Numeric.
- Alphabetic.
- Recognition Mode
Select one of the following recognition modes.
- Fast.
Choose this value to prioritize recognition speed over recognition accuracy. The recognition process takes less time, but the overall recognition accuracy may decrease.
- Balanced. (Default: Selected)
Choose this value to balance recognition accuracy against recognition speed.
- Accurate.
Choose this value to prioritize recognition accuracy over recognition speed. The recognition process takes longer.
- MRZ Mode
When selected, a specialized recognition engine processes the zone for machine-readable zone (MRZ) text, and the Print Type is automatically set to OCR-B, because OCR-B is the font used for MRZ text.
Any non-MRZ text in this zone is likely to have low recognition accuracy. To recognize non-MRZ text on the same document, create a separate zone and assign it a different zone profile.
- Case Conversion
This setting is available when the Character Type setting has a value of All or Alphabetic. If the value is set to Numeric, this setting is disabled.
Case conversion of recognized text is not supported for Asian languages.
Select one of the following values.
- Auto Case. (Default: Selected)
Select this value when the text in a zone is a mix of upper and lower case. No case conversion is applied.
- Small Case.
Select this value to convert the recognized text in a zone to lower case.
- Capital Case.
Select this value to convert the recognized text in a zone to upper case (capital letters).
- Prohibited Characters
This is a non-delimited list of characters that should not be found in a zone. Prohibited characters are applied after case conversion, and they are case sensitive. If one of these characters is encountered, a mistake is assumed in the recognition results and the character is replaced with a ^ character. If an OCR Substitution is defined for the mistaken character, its alternative is used instead. All characters entered are literal, so ranges are not supported.
For example, if you enter "Apf" for the Prohibited Characters setting, the characters "A", "p", and "f" are prohibited.
If you enter "A-F", the characters "A" and "F" are prohibited, as well as the "-" hyphen character. To prohibit an entire range, enter each character of that range.
- Additional Characters
Add a list of characters that are accepted in addition to the selected language. This is a comma-separated list.
For example, if you have content printed in the MICR E-13B font and are using the OmniPage recognition engine, add '⑆', '⑈', '⑇' and '⑉' as additional characters. If these symbols are not added as additional characters, they are not included in the recognition result.
Similarly, if your language is set to English and some characters with accents or diacritical marks are not recognized as expected, add them to the list of additional characters.
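The Prohibited Characters rules described earlier in this group can be illustrated with a short Python sketch. It only models the documented behaviour (literal characters, case sensitivity, application after case conversion, replacement with ^ unless a substitution exists) and is not the product's implementation. The function names, the `case_mode` values, and the `substitutions` mapping that stands in for OCR Substitution are all hypothetical.

```python
from typing import Dict, Optional


def apply_case(text: str, case_mode: str = "auto") -> str:
    """Model the Case Conversion setting (mode names are illustrative)."""
    if case_mode == "small":
        return text.lower()
    if case_mode == "capital":
        return text.upper()
    return text  # "auto": no case conversion is applied


def apply_prohibited(
    text: str,
    prohibited: str,
    substitutions: Optional[Dict[str, str]] = None,
) -> str:
    """Replace each prohibited character with '^', or with its substitution."""
    substitutions = substitutions or {}
    # Every entered character is literal, so "A-F" bans 'A', '-', and 'F',
    # not the range A through F.
    banned = set(prohibited)
    return "".join(
        substitutions.get(ch, "^") if ch in banned else ch
        for ch in text
    )


def postprocess(
    text: str,
    case_mode: str = "auto",
    prohibited: str = "",
    substitutions: Optional[Dict[str, str]] = None,
) -> str:
    """Apply case conversion first, then the prohibited-character check."""
    return apply_prohibited(apply_case(text, case_mode), prohibited, substitutions)


print(postprocess("CAFE-B", prohibited="A-F"))
print(postprocess("Apple", case_mode="small", prohibited="A"))
print(postprocess("Apple", case_mode="capital", prohibited="A"))
```

In this sketch, the second call leaves the text unchanged because the check runs after conversion to lower case and the prohibited "A" is case sensitive, whereas the third call replaces the leading character.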
Print Type
This group enables you to select the types of print or fonts to expect. Choose one or more of the following values:
- Auto. (Default: Selected)
- Machine Print.
- Handprint.
- OCR-A.
- OCR-B.
- Dot 9 Matrix.
- Dot 24 Matrix.
- MICR CMC-7.
- MICR E-13B.
Dictionaries
This group has the following settings:
- Language dictionary
Select this setting to run spell checking during recognition for one or more of the following languages.
- Brazilian
- Catalan
- Czech
- Danish
- Dutch
- English
- Esperanto
- Finnish
- French
- German
- Greek
- Hungarian
- Italian
- Norwegian
- Polish
- Portuguese
- Russian
- Slovenian
- Spanish
- Swedish
- Turkish
- User dictionary
Select this setting to specify your own dictionary to aid spell checking during recognition. Note that a misspelled word in the recognition results is not always replaced with a corresponding entry from the user dictionary.
After you select this setting, browse to select a dictionary.
User dictionaries are not supported for the Asian Plus OCR add-on languages.
(Default: None)
- Business terms dictionary
Select this setting, and then select a business terms dictionary from the list. The contents of the selected dictionary can improve spell checking during recognition. If none of the dictionaries is suitable, use the User dictionary setting above and specify your own list of terms. (Default: None)
The following business terms dictionaries are available:
- English Financial Dictionary
- Dutch Medical Dictionary
- English Medical Dictionary
- French Medical Dictionary
- German Medical Dictionary
- Dutch Legal Dictionary
- English Legal Dictionary
- French Legal Dictionary
- German Legal Dictionary