Word Separation Characters
The "Field delimiter" option for fuzzy databases and some page recognition profiles lets you specify which characters separate the parts of a compound word. When one of these separation characters is encountered, the current word ends and a new word starts with the following character.
For example, if a document contains a compound word like "Diagon-Alley", and you want the search to treat it as two separate words, you can specify word separation characters. In this example, if a hyphen (-) is specified as a separation character, "diagon" and "alley" are searched and evaluated separately.
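The splitting behavior described above can be illustrated with a minimal Python sketch. This is not the product's implementation: the function name `split_words` is hypothetical, and the sketch assumes that whitespace also ends a word and that words are lowercased before comparison (mirroring the "diagon" and "alley" example).

```python
import re


def split_words(text, separators="-,"):
    """Split text on whitespace and on any of the given separator characters.

    Hypothetical helper for illustration only. The default separators mirror
    the fuzzy database defaults (hyphen and comma).
    """
    # re.escape keeps characters such as '-' or ')' literal inside the class.
    pattern = "[" + re.escape(separators) + r"\s]+"
    # Drop empty strings produced by leading or trailing separators.
    return [word.lower() for word in re.split(pattern, text) if word]


print(split_words("Diagon-Alley"))                 # ['diagon', 'alley']
print(split_words("Diagon-Alley", separators=""))  # ['diagon-alley']
```

With the hyphen configured, the compound word yields two words that can be evaluated separately. With no separation characters, it stays a single word.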
By default, fuzzy databases use a hyphen (-) and a comma (,) as word separation characters. Page recognition profiles, however, have different default values. The following table lists the default word separation characters for each of the page recognition profiles:
Recognition Profile | Default Word Separation Characters |
---|---|
FineReader 11.1 R8 page recognition | /:()-# |
RecoStar 7.5 page recognition | /:()- |
Cursive (A2iA DR 6.0) page recognition * | N/A |
Arabic (iDRS 14.1) page recognition * | /:()-# |
Mixed Print page recognition ** | /:()-# |
* These recognition engines are not installed by default and require additional licensing.
** Applies if no profile is chosen as the input profile for machine print.
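To show how the different defaults affect word boundaries, the following Python sketch encodes the table as a lookup and splits the same string under several profiles. The names `DEFAULT_SEPARATORS` and `words_for_profile` are hypothetical, the sample string is invented, and the sketch assumes that whitespace also ends a word. It is only an illustration of the documented behavior, not the recognition engines' actual logic.

```python
# Default word separation characters per profile, copied from the table.
# None stands for "N/A" (no default separation characters listed).
DEFAULT_SEPARATORS = {
    "FineReader 11.1 R8": "/:()-#",
    "RecoStar 7.5": "/:()-",
    "Cursive (A2iA DR 6.0)": None,
    "Arabic (iDRS 14.1)": "/:()-#",
    "Mixed Print": "/:()-#",
}


def words_for_profile(text, profile):
    """Split text using the default separators of the given profile."""
    separators = DEFAULT_SEPARATORS[profile] or ""
    words, current = [], []
    for ch in text:
        if ch.isspace() or ch in separators:
            # A separator ends the current word; the next character starts a new one.
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


print(words_for_profile("INV#2024/07", "FineReader 11.1 R8"))     # ['INV', '2024', '07']
print(words_for_profile("INV#2024/07", "RecoStar 7.5"))           # ['INV#2024', '07']
print(words_for_profile("INV#2024/07", "Cursive (A2iA DR 6.0)"))  # ['INV#2024/07']
```

The only difference between the FineReader and RecoStar defaults is the number sign (#), so the two profiles split the sample string differently at that character.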