Arabic Page Recognition Profile Settings Window

The following options are found in the Arabic page recognition properties window.

General Settings

This group has the following options:

Word separation characters

Use this field to define which characters delimit words. The value for this option is set to /:()-# (forward slash, colon, opening and closing parentheses, hyphen, pound sign) by default.
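
As an illustration only, the following Python sketch shows how a separator set like the default one could split a line of text into words. The function name split_words and the choice to drop the separators from the output are assumptions made for this example; the text above does not describe how the recognition engine handles them internally.

```python
import re

# Default separator set quoted in the option description.
DEFAULT_SEPARATORS = "/:()-#"


def split_words(text, separators=DEFAULT_SEPARATORS):
    """Split text on whitespace and on any configured separator character.

    Hypothetical helper: separators are discarded, which is an assumption.
    """
    # Build a character class from the separators plus whitespace.
    pattern = "[" + re.escape(separators) + r"\s]+"
    # Drop empty strings produced by leading or trailing separators.
    return [token for token in re.split(pattern, text) if token]


if __name__ == "__main__":
    print(split_words("INV-2024/07:A(1)#9"))
```

Passing a different separators string to split_words mimics changing the value of the field.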

Correct separated numbers

Select this option to automatically combine numbers or groups of numbers that are close together. For example, if the engine reads "12," and "00," as two words less than half a space apart, it returns a single combined word: "12,00". This option is selected by default.
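
The merging rule can be sketched roughly as follows. This is not the engine's implementation: the Word class, the is_numeric test, the left-to-right token order, and the space_width parameter are all assumptions introduced for the example. Only the "less than half a space apart" threshold comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical recognized word with horizontal bounds in pixels."""
    text: str
    left: float
    right: float


def is_numeric(text):
    """True if text contains a digit and only digits, commas, or periods."""
    return any(c.isdigit() for c in text) and all(
        c.isdigit() or c in ",." for c in text
    )


def merge_close_numbers(words, space_width):
    """Join adjacent numeric words whose gap is under half a space width."""
    merged = []
    for word in words:
        if merged:
            prev = merged[-1]
            gap = word.left - prev.right
            if (
                is_numeric(prev.text)
                and is_numeric(word.text)
                and gap < space_width / 2
            ):
                # Replace the previous word with the combined one.
                merged[-1] = Word(prev.text + word.text, prev.left, word.right)
                continue
        merged.append(word)
    return merged
```

With a space width of 10 pixels, two numeric words 2 pixels apart are joined, while the same words 8 pixels apart stay separate.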

Secondary language

Select a secondary language from the list for documents that contain digits or other English or French text in addition to the Arabic text. The value for this option is set to <None> by default.

Arabic Recognition Environment

This group has the following options:

Reliability rate

By default, if no font is selected, font-independent recognition is performed. When a font is selected from the Arabic fonts list, an additional font-dependent recognition is executed. The reliability rate defines how the font-independent results are weighted against the font-dependent results so that the result with the best confidence is returned.

Use this option to set the reliability rate. The value for this option is set to 50 by default.
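
The exact weighting formula is not documented here. The sketch below is one plausible reading, given purely for illustration: it assumes the rate is a percentage from 0 to 100, that a higher rate favors the font-dependent result, and that each recognition pass yields a confidence between 0 and 1. The function pick_result and its parameters are hypothetical.

```python
def pick_result(independent, dependent, reliability_rate=50):
    """Return the text of the result with the higher weighted confidence.

    independent, dependent: (text, confidence) tuples, confidence in 0..1.
    reliability_rate: assumed percentage weight of the font-dependent pass.
    """
    if not 0 <= reliability_rate <= 100:
        raise ValueError("reliability_rate must be between 0 and 100")

    dependent_weight = reliability_rate / 100
    independent_weight = 1 - dependent_weight

    independent_score = independent[1] * independent_weight
    dependent_score = dependent[1] * dependent_weight

    # Ties fall back to the font-independent result (an arbitrary choice).
    if dependent_score > independent_score:
        return dependent[0]
    return independent[0]
```

Under these assumptions, a rate of 50 weights both passes equally, so the result with the higher raw confidence is returned.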

Arabic fonts

If your documents contain several fonts, you can enable font-dependent recognition by selecting different fonts from the list. For the best results, enable multiple fonts only if the fonts used on the documents are known, because a random selection of fonts can slow down the recognition process and may reduce the accuracy rate. When you do enable multiple fonts, set the reliability rate to 50 so that the font-dependent and font-independent recognition results are weighted equally.

Image Preprocessing

This group has the following options to improve the integrity of the extracted images:

Deskew

Select this option to automatically correct slightly slanted images. This option is cleared by default.

Despeckle

Select this option to remove blobs from an image. If selected, it automatically removes all groups of connected pixels whose pixel count is below the configured Maximum pixel size value. This option is cleared by default.

Maximum pixel size

If Despeckle is selected, you can define a maximum number of connected pixels. Any group of connected pixels that is smaller than the defined value is removed as noise. The value for this option is set to 1 by default.
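
To make the despeckle rule concrete, here is a minimal Python sketch of size-based speckle removal on a binary image. It is an illustration under stated assumptions, not the product's algorithm: the image is a list of rows with 1 for a black pixel, pixels are grouped with 8-connectivity, and the comparison is strictly "below" the threshold as worded above. Whether the product treats the limit as inclusive is not stated in this section.

```python
from collections import deque


def despeckle(image, max_pixel_size=1):
    """Return a copy of image with small connected pixel groups cleared.

    image: list of rows of 0/1 ints (1 = black pixel). Hypothetical format.
    Groups with fewer than max_pixel_size pixels are set to 0.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search collects one connected group.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (
                            0 <= ny < height
                            and 0 <= nx < width
                            and image[ny][nx] == 1
                            and not seen[ny][nx]
                        ):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Strict comparison, following the "below" wording.
            if len(component) < max_pixel_size:
                for cy, cx in component:
                    result[cy][cx] = 0
    return result
```

With a threshold of 3, an isolated single pixel is cleared while a 2 x 2 block of four pixels is kept.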

Definitions for the buttons at the bottom of this window can be found in Common Transformation Designer Buttons.