RecAPI
Recognition Data Handling Module
KernelAPI

Letter handling tools.

Classes

struct   LSPC
  Additional information about the space character.
struct   LETTER
  The LETTER structure.

Typedefs

typedef LETTER * LPLETTER
  Pointer to a structure LETTER.
typedef const LETTER * LPCLETTER
  Const pointer to a structure LETTER.

Enumerations

enum   LETTERSTRENGTH {
  LTS_FINAL,
  LTS_STRONG,
  LTS_MEDIUM,
  LTS_WEAK,
  LTS_SIZE
}
  Possible places where letter array is to be copied to.

Functions

RECERR RECAPIKRN  kRecGetLetters (HPAGE hPage, IMAGEINDEX iiImage, LPLETTER *ppLetter, LPLONG pLettersLength)
  Getting recognition result.
RECERR RECAPIKRN  kRecGetLetterPalette (HPAGE hPage, REC_COLOR **ppColours, LPLONG pNum)
  Getting palette of recognition data.
RECERR RECAPIKRN  kRecGetChoiceStr (HPAGE hPage, WCHAR **ppChoices, LPLONG pLength)
  Getting choices.
RECERR RECAPIKRN  kRecGetSuggestionStr (HPAGE hPage, WCHAR **ppSuggestions, LPLONG pLength)
  Getting suggestions.
RECERR RECAPIKRN  kRecGetFontFaceStr (HPAGE hPage, char **ppFontFaces, LPLONG pLength)
  Getting font faces.
RECERR RECAPIKRN  kRecSetLetters (LETTERSTRENGTH towhere, HPAGE hPage, IMAGEINDEX iiImage, LPCLETTER pLetter, LONG LettersLength)
  Putting a letter buffer onto the input of the PLUS2W and PLUS3W engines or the selected output converter.
RECERR RECAPIKRN  kRecFreeRecognitionData (HPAGE hPage)
  Freeing recognition data.

LETTER::fontAttrib field elements

Possible values of LETTER::fontAttrib field.

#define  R_NO_ITALIC   0x0001
  Not-Italic character. It is not possible for both R_ITALIC and R_NO_ITALIC to be set. If both are unset we do not know whether it is Italic or not.
#define  R_ITALIC   0x0002
  Italic character. See also R_NO_ITALIC.
#define  R_NO_BOLD   0x0004
  Not-Bold character. It is not possible for both R_BOLD and R_NO_BOLD to be set. If both are unset we do not know whether it is Bold or not.
#define  R_BOLD   0x0008
  Bold character. See also R_NO_BOLD.
#define  R_SANSSERIF   0x0010
  Sans Serif character. It is not possible for both R_SANSSERIF and R_SERIF to be set. If both are unset we do not know whether it is Serif or not.
#define  R_SERIF   0x0020
  Serif character. See also R_SANSSERIF.
#define  R_PROPORTIONAL   0x0040
  Proportional character. It is not possible for both R_PROPORTIONAL and R_MONOSPACED to be set. If both are unset we do not know whether it is Monospaced or not.
#define  R_MONOSPACED   0x0080
  Monospaced character. See also R_PROPORTIONAL.
#define  R_SMALLCAPS   0x0100
  Character in a Small Caps word. The code is always upper case! See also RR_SMALLCAPS_TALL in the field info.
#define  R_UNDERLINE   0x0200
  Underlined character.
#define  R_STRIKETHROUGH   0x0400
  Struck through character. It is not used. It is only for future versions.
#define  R_SUBSCRIPT   0x0800
  Subscript character.
#define  R_SUPERSCRIPT   0x1000
  Superscript character.
#define  R_DROPCAP   0x2000
  Dropcap character.
#define  R_POPCAP   0x4000
  Popcap character.
#define  R_INVERTED   0x8000
  Inverted character.
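
The paired flags above encode three states (yes, no, unknown), so testing a single bit is not enough. The following is a minimal sketch of decoding the italic and bold pairs. The `AttrState` type and the helper names are hypothetical, and treating fontAttrib as a 16-bit value is an assumption based on the flag range shown above.

```c
#include <stdint.h>

/* Flag values copied from the list above. */
#define R_NO_ITALIC 0x0001
#define R_ITALIC    0x0002
#define R_NO_BOLD   0x0004
#define R_BOLD      0x0008

/* Hypothetical helper type, not part of the SDK. */
typedef enum { ATTR_NO = 0, ATTR_YES = 1, ATTR_UNKNOWN = 2 } AttrState;

/* Decodes one yes/no flag pair; neither bit set means "unknown". */
static AttrState decode_pair(uint16_t fontAttrib, uint16_t yesFlag, uint16_t noFlag)
{
    if (fontAttrib & yesFlag) return ATTR_YES;
    if (fontAttrib & noFlag)  return ATTR_NO;
    return ATTR_UNKNOWN;
}

static AttrState italic_state(uint16_t fontAttrib)
{
    return decode_pair(fontAttrib, R_ITALIC, R_NO_ITALIC);
}

static AttrState bold_state(uint16_t fontAttrib)
{
    return decode_pair(fontAttrib, R_BOLD, R_NO_BOLD);
}
```

The same `decode_pair` helper would also cover the R_SERIF / R_SANSSERIF and R_MONOSPACED / R_PROPORTIONAL pairs.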

LETTER::info field macros

Macros can be used with LETTER::info field.

#define  RH_OCRENGINE(info)   ((RECOGNITIONMODULE)(((info) & RH_OCRENGINE_MASK) >> 5))
  Getting the RECOGNITIONMODULE from the field info. This is the module ID of the engine that actually recognized the given character. With the PLUS engines this is usually RM_RESERVED_M.
#define  RH_OCRENGINE_SET(oeng)   (((UINT)(oeng)) << 5)
  Setting the RECOGNITIONMODULE into the field info.
#define  RH_OCRTYPE(info)   ((FILLINGMETHOD)(((info) & RH_OCRTYPE_MASK) >> 10))
  Getting the FILLINGMETHOD from the field info.
#define  RH_OCRTYPE_SET(otype)   (((UINT)(otype)) << 10)
  Setting the FILLINGMETHOD into the field info.
#define  RH_BARTYPE(info)   ((BAR_TYPE)(((info) & RH_BARTYPE_MASK) >> 24))
  Getting the BAR_TYPE from the field info.
#define  RH_BARTYPE_SET(btype)   (((UINT)(btype)) << 24)
  Setting the BAR_TYPE into the field info.

Info field bits

Possible flags of LETTER::info field.

#define  RR_BULLET   0x00000001
  Bullet character at bullet position.
#define  RR_SOFTHYPHEN   0x00000004
  Soft hyphen.
#define  RH_OCRENGINE_MASK   0x000003E0
  Mask of RECOGNITIONMODULE.
#define  RH_OCRTYPE_MASK   0x00007C00
  Mask of FILLINGMETHOD.
#define  RH_GTMTCH   0x00008000
  Internal use only.
#define  RR_CONFIDENT_CHAR   0x00010000
  Internal use only.
#define  RR_DISABLED_CHAR   0x00020000
  Internal use only.
#define  RR_VOTED_CHAR   0x00040000
  Internal use only.
#define  RR_NOISY_CHAR   0x00080000
  Internal use only.
#define  RR_EXPANDED   0x00100000
  Internal use only.
#define  RH_MANGO_ISOLATED_CH   0x00200000
  Internal use only.
#define  RH_LA_INTERNAL   0x00400000
  Internal use only.
#define  RH_LA_EXTERNAL   0x00800000
  NO LONGER USED.
#define  RR_PDMERGE_CHAR   0x00800000
  Internal use only.
#define  RH_BARTYPE_MASK   0x3F000000
  Mask of BAR_TYPE.
#define  RR_DICTIONARY_WORD   0x40000000
  Dictionary word. It is set when the word is in at least one dictionary of the currently used ones. See Language of a word.
#define  RR_SMALLCAPS_TALL   0x80000000
  A tall character among the small capitals. See also R_SMALLCAPS in the field fontAttrib.
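
The info field packs three small numeric sub-fields next to the single-bit flags. The sketch below mirrors the RH_OCRENGINE, RH_OCRTYPE and RH_BARTYPE macros with plain functions so that it compiles without the SDK headers. The function names are hypothetical, plain `unsigned` values stand in for RECOGNITIONMODULE, FILLINGMETHOD and BAR_TYPE, and a 32-bit info field is assumed.

```c
#include <stdint.h>

/* Masks and flag copied from the list above (a 'u' suffix is added). */
#define RH_OCRENGINE_MASK  0x000003E0u
#define RH_OCRTYPE_MASK    0x00007C00u
#define RH_BARTYPE_MASK    0x3F000000u
#define RR_DICTIONARY_WORD 0x40000000u

/* Extract each sub-field: mask first, then shift down to bit 0. */
static unsigned info_engine(uint32_t info)   { return (info & RH_OCRENGINE_MASK) >> 5; }
static unsigned info_filltype(uint32_t info) { return (info & RH_OCRTYPE_MASK) >> 10; }
static unsigned info_bartype(uint32_t info)  { return (info & RH_BARTYPE_MASK) >> 24; }

/* Builds an info value from the three sub-fields.
   Values too large for a field are truncated to the field width. */
static uint32_t info_pack(unsigned engine, unsigned fill, unsigned bar)
{
    return (((uint32_t)engine << 5)  & RH_OCRENGINE_MASK)
         | (((uint32_t)fill   << 10) & RH_OCRTYPE_MASK)
         | (((uint32_t)bar    << 24) & RH_BARTYPE_MASK);
}
```

Because every getter masks before shifting, flag bits such as RR_DICTIONARY_WORD do not disturb the decoded values.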

LETTER::makeup field elements

Flags of end-position LETTERs (see their usage in the table under End-position letters) and direction/orientation flags (see also the section about vertical text support).

#define  R_ENDOFLINE   0x0001
  End of line. In a table zone, the end of all the lines of a cell is marked by this flag.
#define  R_ENDOFPARA   0x0002
  End of paragraph. This flag is used by BAR module only.
#define  R_ENDOFWORD   0x0004
  End of word.
#define  R_ENDOFZONE   0x0008
  End of zone.
#define  R_ENDOFPAGE   0x0010
  End of page.
#define  R_ENDOFCELL   0x0020
  End of table cell.
#define  R_ENDOFROW   0x0040
  End of the last line of the last filled cell in a table row.
#define  R_INTABLE   0x0080
  Letter is in a table cell.
#define  R_TEXT_DIR_MASK   0x0700
  Mask of text direction in makeup field.
#define  R_TEXT_ORIENT_MASK   0x0300
  Mask of text orientation in makeup field.
#define  R_NORMTEXT   0
  Horizontal text.
#define  R_VERTTEXT   0x0100
  Vertical text (CCJK) or neon text (Latin) or upside-down barcode.
#define  R_LEFTTEXT   0x0200
  Left rotated / orientation is upward.
#define  R_RIGHTTEXT   0x0300
  Right rotated / orientation is downward.
#define  R_RTLTEXT   0x0400
  Character from a right-to-left direction word.
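
Note that R_RIGHTTEXT (0x0300) has the same bits as R_VERTTEXT and R_LEFTTEXT combined, so the orientation has to be read by masking and comparing, not by testing individual bits. A minimal sketch follows; the function names are hypothetical.

```c
/* Values copied from the list above. */
#define R_TEXT_ORIENT_MASK 0x0300
#define R_NORMTEXT         0
#define R_VERTTEXT         0x0100
#define R_LEFTTEXT         0x0200
#define R_RIGHTTEXT        0x0300
#define R_RTLTEXT          0x0400

/* Returns the orientation value (one of the four R_*TEXT constants). */
static unsigned text_orientation(unsigned makeup)
{
    return makeup & R_TEXT_ORIENT_MASK;
}

/* Returns nonzero when the letter belongs to a right-to-left word. */
static int is_rtl(unsigned makeup)
{
    return (makeup & R_RTLTEXT) != 0;
}

/* Human-readable label for the orientation part of makeup. */
static const char *orientation_name(unsigned makeup)
{
    switch (text_orientation(makeup)) {
    case R_NORMTEXT: return "horizontal";
    case R_VERTTEXT: return "vertical";
    case R_LEFTTEXT: return "left-rotated";
    default:         return "right-rotated"; /* R_RIGHTTEXT is the only value left */
    }
}
```

A test such as `makeup & R_VERTTEXT` would wrongly report right-rotated text as vertical, which is why the comparison is done on the masked value.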

Space type values

Possible space types (LSPC).

#define  SPC_SPACE   0
  Real space.
#define  SPC_TAB   1
  Tab character.
#define  SPC_LEADERDOT   2
  Dot leader.
#define  SPC_LEADERLINE   3
  Line leader.
#define  SPC_LEADERHYPHEN   4
  Hyphen leader.

Macros of alternatives of the LETTER

Macros can be used for processing the alternatives of each LETTER. See also usage of alternatives.

#define  GETFIRSTALTERN(stringstart, ndx)   (((const WCHAR*)((stringstart)+(ndx)))+1)
  Getting the first alternative.
#define  GETALTERNLENGTH(str)   ((str)[-1])
  Getting the length of the alternative.
#define  GETNEXTALTERN(str)   ((str)+GETALTERNLENGTH(str)+2)
  Getting the next alternative.

Defines of confidence handling of the LETTER

See confidence handling and LETTER::err.

#define  RE_SUSPECT_WORD   0x80
  The word is declared suspicious by the recognition engine if the dictionary (if any) does not contain it. This flag does not necessarily reflect whether the word is a dictionary word or not.
#define  RE_SUSPECT_THR   64
  Suspect threshold: if the lower 7 bits of LETTER::err represent a value at or above this (up to 100) it means low confidence.
#define  RE_ERROR_LEVEL_MASK   ~RE_SUSPECT_WORD
  Mask for getting the error level of the current letter.
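
As a sketch of how these defines fit together, the helpers below split a LETTER::err value into its word-level flag and its 7-bit error level. The function names are hypothetical, and passing err as an `unsigned char` is an assumption about the field's type.

```c
/* Values copied from the list above (the mask is parenthesized here). */
#define RE_SUSPECT_WORD     0x80
#define RE_SUSPECT_THR      64
#define RE_ERROR_LEVEL_MASK (~RE_SUSPECT_WORD)

/* Lower 7 bits: the error level of the letter itself. */
static int error_level(unsigned char err)
{
    return err & RE_ERROR_LEVEL_MASK;
}

/* Low confidence: error level at or above the suspect threshold. */
static int is_suspect_char(unsigned char err)
{
    return error_level(err) >= RE_SUSPECT_THR;
}

/* Top bit: the engine declared the containing word suspicious. */
static int is_suspect_word(unsigned char err)
{
    return (err & RE_SUSPECT_WORD) != 0;
}
```

The two tests are independent: a letter can sit in a suspect word while its own error level is low, and the reverse.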

Detailed Description

Letter handling tools.

Recognized data is stored in the current HPAGE and it is available as an array of LETTER structures providing significantly more information than the character code itself. This type of output offers the most detailed information on recognition. The information stored in a LETTER structure may belong to the character itself (character code, position, size, confidence level, font attributes, font face, choices, color) or to the word containing the character (suggestions, languages). Word-level information is set in the first LETTER of the word.

NOTE: In both the SDK and its documentation, coordinates refer to grid coordinates, i.e. the top or left borders of pixels. Thus a rectangle does not contain the pixels at its right and bottom coordinates.

Handling of spaces

Spaces have a special role in the text, thus their handling is also special. There are two kinds of spaces in the recognition result.

One of them is the space-like character. It really appears in the original text and it is represented with a LETTER having a space character in its code field and an LSPC structure containing information about this character. The SPACE and TAB characters and the leaders belong to this type.

The other kind of space is the dummy space. It does not appear in the original text, but it has an individual LETTER object. It indicates the end of the line only when this is also the end of the word (i.e. the last character of the line is not a hyphen). It has a role only when the User writes the recognition result directly from the LETTER array into a pure TXT file without analysing any formatting flags (e.g. font attributes, end of lines, etc.). To handle this case, a space (the dummy space) is inserted between the last word of the line and the first word of the next line.

The LETTER has size information about the represented character. However the width of the dummy space is zero, because it is in fact not in the original text.

The barcode module (BAR) has a special binary recognition mode, in which the recognition result contains binary data rather than text. (See the setting Kernel.OcrMgr.BarBinary for more information.) In this case, the content of the barcode is logically one word in one line, and the result gets a dummy space at the end only for uniformity.

The notion of word in CSDK

The last letter of a word (which may be a punctuation character or a digit) is the LETTER having the R_ENDOFWORD flag. The beginning of a word is the first non-space character after the previous word (or the very first item of the LETTER array). The flag R_ENDOFLINE does not play a role in determining word boundaries (e.g. in hyphenation).

Special cases:

Word-related information (like the language of a word or RE_SUSPECT_WORD, etc.) is specified on all the characters of the word. The only exception is suggestion handling where suggestions are attached to the first character of the word only. (Note that suggestion handling uses a different word notion: space-separated words.)
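
The word notion described above can be sketched as a small scanner. The `MiniLetter` struct is a hypothetical stand-in that keeps only a character code and the makeup flags; the real LETTER structure has more fields, and the field names used here are assumptions.

```c
#include <stddef.h>

#define R_ENDOFWORD 0x0004  /* value copied from the makeup flag list */

/* Hypothetical, reduced stand-in for LETTER. */
typedef struct {
    unsigned short code;    /* character code */
    unsigned short makeup;  /* end-position flags */
} MiniLetter;

/* Finds the next word at or after index 'from'.
   On success stores the inclusive bounds and returns 1. */
static int next_word(const MiniLetter *l, size_t n, size_t from,
                     size_t *first, size_t *last)
{
    size_t start, end;

    start = from;
    while (start < n && l[start].code == ' ')
        start++;                          /* skip spaces before the word */
    end = start;
    while (end < n && !(l[end].makeup & R_ENDOFWORD))
        end++;                            /* R_ENDOFLINE is deliberately ignored */
    if (end >= n)
        return 0;                         /* no letter carries R_ENDOFWORD */
    *first = start;
    *last = end;
    return 1;
}

static size_t count_words(const MiniLetter *l, size_t n)
{
    size_t from = 0, first, last, words = 0;

    while (next_word(l, n, from, &first, &last)) {
        words++;
        from = last + 1;
    }
    return words;
}

/* Sample data for "ab cd": 'b' and 'd' close their words. */
static size_t demo_word_count(void)
{
    static const MiniLetter text[] = {
        {'a', 0}, {'b', R_ENDOFWORD}, {' ', 0}, {'c', 0}, {'d', R_ENDOFWORD}
    };
    return count_words(text, sizeof text / sizeof text[0]);
}
```

Since only R_ENDOFWORD ends a word, a hyphenated word that continues on the next line is reported as one word.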

End-position letters

The letters in ending positions are marked with particular flags. See the section above for details about the end of a word. The end of line flag in a flowing text is generally on the dummy space mentioned above. However, if the last character of a line is a hyphen in a hyphenated word, the flag R_ENDOFLINE is put on the hyphen and the dummy space is missing from this line.

In a table the situation of end-position flags is more difficult. The next figure shows all the possible situations of the R_ENDOFLINE (L), R_ENDOFCELL (C), R_ENDOFROW (R) and R_ENDOFZONE (Z) flags in a table.

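Row 1:  text L,C  |  text L,C  |  text L,C,R
Row 2:  text L,C  |  more L / lines in L / a cell L,C,R
Row 3:  two-line L / text L,C  |  text L,C,R
Row 4:  last filled L / cell L,C,R,Z

(Each table row is shown on one line; "|" separates the filled cells and "/" separates the text lines within a cell.)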
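
To make the flag combinations in the figure concrete, the sketch below tallies the end-position flags over a sequence of makeup values and replays the figure as sample data. The `TableTally` type and the function names are hypothetical; the sample array lists only the line-ending letters, in the order in which they appear in the figure.

```c
/* Values copied from the makeup flag list. */
#define R_ENDOFLINE 0x0001
#define R_ENDOFZONE 0x0008
#define R_ENDOFCELL 0x0020
#define R_ENDOFROW  0x0040

/* Hypothetical result type. */
typedef struct { int lines, cells, rows, zones; } TableTally;

/* Counts how many entries carry each end-position flag. */
static TableTally tally_end_flags(const unsigned short *makeup, int n)
{
    TableTally t = {0, 0, 0, 0};
    int i;

    for (i = 0; i < n; i++) {
        if (makeup[i] & R_ENDOFLINE) t.lines++;
        if (makeup[i] & R_ENDOFCELL) t.cells++;
        if (makeup[i] & R_ENDOFROW)  t.rows++;
        if (makeup[i] & R_ENDOFZONE) t.zones++;
    }
    return t;
}

/* The line-ending letters of the figure, one entry per text line. */
static TableTally tally_figure(void)
{
    enum { L = R_ENDOFLINE, C = R_ENDOFCELL, R = R_ENDOFROW, Z = R_ENDOFZONE };
    static const unsigned short ends[] = {
        L | C, L | C, L | C | R,        /* three one-line cells      */
        L | C, L, L, L | C | R,         /* one-line and 3-line cell  */
        L, L | C, L | C | R,            /* two-line and 1-line cell  */
        L, L | C | R | Z                /* last filled cell          */
    };
    return tally_end_flags(ends, (int)(sizeof ends / sizeof ends[0]));
}
```

For the figure this yields 12 lines, 8 filled cells, 4 rows and 1 zone, which matches counting the L, C, R and Z marks by hand.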

Usage of alternatives

The common name for LETTER choices and word suggestions is 'alternatives'. Both kinds are handled in the same way: they are accessed through special WCHAR arrays. Every alternative is stored as a WCHAR holding its length, followed by its characters and a terminating zero WCHAR. Two WCHAR arrays list all the alternatives in the recognition data, one for choices and one for suggestions. Retrieve them with the functions kRecGetChoiceStr and kRecGetSuggestionStr, respectively.

Each LETTER contains an index into the list of alternatives that points to its first alternative, and a counter holding the number of its alternatives. All LETTERs can have choices (LETTER::ndxChoices), but only the first LETTER of a word refers to the suggestions (LETTER::ndxSuggestions). The scope of such a suggestion is the space-terminated word. (Note that it can differ from the end of the word notion used by spelling.)

The alternatives of a LETTER can be enumerated using the macros GETFIRSTALTERN, GETNEXTALTERN and GETALTERNLENGTH. See the following sample code on how to use them:

    RECERR err;
    HPAGE hPage;
    LETTER *pLetters;
    WCHAR *pChoices;
    LONG nLetters, choiceStrLen;

    ...
    err = kRecGetLetters(hPage, II_CURRENT, &pLetters, &nLetters);
    if (err != REC_OK)
        ... // Doing some error handling
    ...
    err = kRecGetChoiceStr(hPage, &pChoices, &choiceStrLen);
    if (err != REC_OK)
        ... // Doing some error handling
    for (LONG lettn=0; lettn<nLetters; lettn++)
    {
        ...
        const WCHAR *choice = GETFIRSTALTERN(pChoices, pLetters[lettn].ndxChoices);
        for (BYTE chon=1; chon<pLetters[lettn].cntChoices; chon++)
        {
            ... // Doing some choice handling
            choice = GETNEXTALTERN(choice);
        }
        ...
    }
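    ...
    // When done, release pLetters and pChoices with kRecFree (see the notes of kRecGetLetters and kRecGetChoiceStr).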
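
The buffer layout that the three macros rely on can be tried without the SDK. The sketch below copies the macros from this page and walks a hand-built buffer. The `WCHAR` typedef (16-bit code units), the `nth_alternative` helper and the sample data are assumptions made for illustration, and `count` is treated simply as the number of entries stored for the letter, which this page does not state explicitly for LETTER::cntChoices.

```c
#include <stddef.h>

typedef unsigned short WCHAR;  /* assumption: 16-bit code units */

/* Macros copied from the list above. */
#define GETFIRSTALTERN(stringstart, ndx) (((const WCHAR*)((stringstart)+(ndx)))+1)
#define GETALTERNLENGTH(str)             ((str)[-1])
#define GETNEXTALTERN(str)               ((str)+GETALTERNLENGTH(str)+2)

/* Returns the k-th (0-based) alternative of the group starting at 'ndx',
   or NULL when k is outside 0..count-1. */
static const WCHAR *nth_alternative(const WCHAR *pool, long ndx, int count, int k)
{
    const WCHAR *alt;
    int i;

    if (k < 0 || k >= count)
        return NULL;
    alt = GETFIRSTALTERN(pool, ndx);   /* skips the length element */
    for (i = 0; i < k; i++)
        alt = GETNEXTALTERN(alt);      /* skips chars, terminator, next length */
    return alt;
}

/* Three entries, each stored as: length, characters, terminating zero. */
static const WCHAR demo_pool[] = {
    1, 'a', 0,
    2, 'r', 'n', 0,
    1, 'm', 0
};
```

With this layout, `nth_alternative(demo_pool, 0, 3, 1)` points at the two-character entry "rn", and GETALTERNLENGTH on that pointer reads the length stored just before it.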

Consecutive words can have the same suggestion indices - that is, the given suggestions are common to the group of the given words. This is the case when the suggestion combines two space-separated words into a single one without the space.

Since the first LETTER of a word cannot be a space, spaces do not have suggestions, but they have space information (LSPC) in the same union type (see Handling of spaces for more information).

Font faces are stored as a sequence of C strings in a single buffer. The index held by a LETTER (LETTER::ndxFontFace) points to the first character of its font face name in this buffer.
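
A bounds-checked lookup into such a buffer might look like the sketch below. The helper name and the sample buffer are hypothetical, and `length` is assumed to be the size of the buffer in bytes.

```c
#include <stddef.h>
#include <string.h>

/* Returns the font face name starting at 'ndx', or NULL when the index is
   out of range or no terminator follows it inside the buffer. */
static const char *font_face_at(const char *faces, long length, long ndx)
{
    if (faces == NULL || ndx < 0 || ndx >= length)
        return NULL;
    /* Require a NUL inside the buffer so the result is a valid C string. */
    if (memchr(faces + ndx, '\0', (size_t)(length - ndx)) == NULL)
        return NULL;
    return faces + ndx;
}

/* Two names back to back; sizeof counts the final terminator too. */
static const char demo_faces[] = "Arial\0Times New Roman";
```

In this sample, index 0 yields "Arial" and index 6 yields "Times New Roman".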


Enumeration Type Documentation

Possible places where letter array is to be copied to.

Enumerator:
LTS_FINAL 

Letters are put directly onto the input of the output conversion step.

LTS_STRONG 

Letters are put onto the strong input of the PLUS2W and PLUS3W engines.

LTS_MEDIUM 

Letters are put onto the medium input of the PLUS2W and PLUS3W engines.

LTS_WEAK 

Letters are put onto the weak input of the PLUS3W engine.

LTS_SIZE 

Number of LETTER indices (for verifying index validity).
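
As a small illustration of the LTS_SIZE note, the sketch below repeats the enumeration as listed on this page and uses LTS_SIZE as an upper bound. The two helper functions are hypothetical.

```c
/* Enumeration as listed above; without initializers the values run 0..4. */
typedef enum {
    LTS_FINAL,
    LTS_STRONG,
    LTS_MEDIUM,
    LTS_WEAK,
    LTS_SIZE
} LETTERSTRENGTH;

/* A value is usable only when it lies below LTS_SIZE. */
static int is_valid_strength(int value)
{
    return value >= 0 && value < LTS_SIZE;
}

/* Every valid level except LTS_FINAL feeds a voting engine input. */
static int is_voting_input(int value)
{
    return is_valid_strength(value) && value != LTS_FINAL;
}
```

An application could run such a check on a level that comes from configuration before passing it to kRecSetLetters.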


Function Documentation

RECERR RECAPIKRN kRecFreeRecognitionData ( HPAGE  hPage )

Freeing recognition data.

The kRecFreeRecognitionData function destroys the recognized data (memory object) belonging to the hPage page.

Parameters:
[in] hPage Handle of the page having the data to be removed.
Return values:
RECERR
Note:
The effect of this call is the same as if the application had not called the kRecRecognize function.
The specification of this function in C# is:
 RECERR kRecFreeRecognitionData(IntPtr hPage); 
The specification of this function in Java is:

RECERR RECAPIKRN kRecGetChoiceStr (HPAGE hPage, WCHAR **ppChoices, LPLONG pLength)

Getting choices.

The kRecGetChoiceStr function makes the alternative letter choices data belonging to the hPage page available to the application by creating a new memory object. This function can be called after a successful kRecRecognize call. The retrieved data is available as an array of WCHAR elements. For more about its internal structure see the usage of alternatives. A LETTER contains the number of its choices and an index into this array pointing to its first choice (LETTER::cntChoices, LETTER::ndxChoices).

Parameters:
[in] hPage Handle of the page whose recognized data should be accessed.
[out] ppChoices Address of a pointer variable to get the array of the recognized alternative characters and ligatures.
[out] pLength Pointer to a variable to hold the length of recognized alternative characters.
Return values:
RECERR
Note:
Since this function creates a new memory object, the application should call the kRecFree function to free this memory area after evaluating the result.
The specification of this function in C# is:
 RECERR kRecGetChoiceStr(IntPtr hPage, out char[] ppChoices); 
The specification of this function in Java is:
 int kRecGetChoiceStr(HPAGE hPage, Choices ppChoices) 

RECERR RECAPIKRN kRecGetFontFaceStr (HPAGE hPage, char **ppFontFaces, LPLONG pLength)

Getting font faces.

The kRecGetFontFaceStr function makes the font face data belonging to the hPage page available to the application by creating a new memory object. This function can be called after a successful kRecRecognize call. The retrieved data is available as an array of char strings. A LETTER contains an index into this array pointing to its font face name (LETTER::ndxFontFace).

Parameters:
[in] hPage Handle of the page whose recognized data should be accessed.
[out] ppFontFaces Address of a pointer variable to get the UTF-8 string of the recognized font faces.
[out] pLength Pointer to a variable to hold the length of recognized font face string.
Return values:
RECERR
Note:
Font face information is available only when processing PDF files that have an accessible text layer.
Since this function creates a new memory object, after evaluating the result, the application should call the kRecFree function to free this memory area.
The specification of this function in C# is:
 RECERR kRecGetFontFaceStr(IntPtr hPage, out char[] ppFontFaces); 
The specification of this function in Java is:
 int kRecGetFontFaceStr(HPAGE hPage, FontFaces ppFontFaces) 

RECERR RECAPIKRN kRecGetLetterPalette (HPAGE hPage, REC_COLOR **ppColours, LPLONG pNum)

Getting palette of recognition data.

This function makes the palette of the recognition data belonging to the hPage page available to the application by creating a new memory object. This function can be called after a successful kRecRecognize call. It contains both the foreground and background colors of the letters. The LETTER structure has indices into this array for foreground and background colors (LETTER::ndxFGColor, LETTER::ndxBGColor).

Parameters:
[in] hPage Handle of the page whose recognized data should be accessed.
[out] ppColours Address of a pointer variable to get the address of the palette array.
[out] pNum Pointer to a variable to hold the number of colors in palette.
Return values:
RECERR
Note:
The palette can contain the special REC_COLOR values REC_DEFAULT_COLOR and REC_UNDEF_COLOR. A background color can be either of them; both mean white. A foreground color can be REC_DEFAULT_COLOR, which means black.
Since this function creates a new memory object, the application should call the kRecFree function to free this memory area after evaluating the result.
The specification of this function in C# is:
 RECERR kRecGetLetterPalette(IntPtr hPage, out uint[] ppColours); 
The specification of this function in Java is:
 int kRecGetLetterPalette(HPAGE hPage, RecColorArray ppColours) 

RECERR RECAPIKRN kRecGetLetters (HPAGE hPage, IMAGEINDEX iiImage, LPLETTER *ppLetter, LPLONG pLettersLength)

Getting recognition result.

The kRecGetLetters function makes the recognition data belonging to the hPage page available to the application by creating a new memory object containing the recognized data. This function can be called after a successful kRecRecognize call. The recognized data is available as an array of LETTER structures.

Parameters:
[in] hPage Handle of the page whose recognized data should be accessed.
[in] iiImage Index of the image in the page whose coordinate system is to be used for the returned coordinates.
[out] ppLetter Address of a pointer variable to get the address of the recognized characters.
[out] pLettersLength Pointer to a variable to hold the number of recognized characters.
Return values:
RECERR
Note:
Since this function creates a new memory object containing the recognized data, the application should call the kRecFree function to free this memory area after evaluating the result.
The specification of this function in C# is:
 RECERR kRecGetLetters(IntPtr hPage, IMAGEINDEX iiImage, out LETTER[] ppLetter); 
The specification of this function in Java is:
 int kRecGetLetters(HPAGE hPage, IMAGEINDEX iiImage, LetterArray ppLetter) 

RECERR RECAPIKRN kRecGetSuggestionStr (HPAGE hPage, WCHAR **ppSuggestions, LPLONG pLength)

Getting suggestions.

The kRecGetSuggestionStr function makes the word suggestions data belonging to the hPage page available to the application by creating a new memory object. This function can be called after a successful kRecRecognize call. The retrieved data is available as an array of WCHAR elements. For more about its internal structure see the usage of alternatives. The first LETTER of a word contains the number of word suggestions and an index into this array pointing to the first suggestion (LETTER::cntSuggestions, LETTER::ndxSuggestions).

Parameters:
[in] hPage Handle of the page whose recognized data should be accessed.
[out] ppSuggestions Address of a pointer variable to get the array of the recognized suggestions.
[out] pLength Pointer to a variable to hold the length of recognized suggestions.
Return values:
RECERR
Note:
Since this function creates a new memory object, the application should call the kRecFree function to free this memory area after evaluating the result.
If the letter is a space, it does not have suggestions, but only space info (see LETTER::spcInfo and LSPC).
The specification of this function in C# is:
 RECERR kRecGetSuggestionStr(IntPtr hPage, out char[] ppSuggestions); 
The specification of this function in Java is:
 int kRecGetSuggestionStr(HPAGE hPage, Suggestions ppSuggestions) 

RECERR RECAPIKRN kRecSetLetters (LETTERSTRENGTH towhere, HPAGE hPage, IMAGEINDEX iiImage, LPCLETTER pLetter, LONG LettersLength)

Putting a letter buffer onto the input of the PLUS2W and PLUS3W engines or the selected output converter.

This function can affect the recognition results and/or the content of the output file. The PLUS modules are voting engines combining results of two or three other OCR engines. The voting method of RM_OMNIFONT_PLUS2W has strong and medium inputs, RM_OMNIFONT_PLUS3W uses an additional weak one as well. You can replace one input (parameter towhere) with your alternative engine result by calling the function kRecSetLetters. The voting method uses your letter buffer as it generates the final OCR result. Stronger input may have greater effect on the recognition result, so you should consider which level you select for your letter buffer.

When the letter buffer is passed at the LTS_FINAL level, the OCR method does not run: at this level kRecSetLetters works as it did in previous versions of CSDK, i.e. the letters are given directly to the input of the selected output converter.

Parameters:
[in] towhere Specifies where the letter buffer is put: one of the three possible inputs of the Voting Engine, or LTS_FINAL for the input of the selected output converter.
[in] hPage Handle of the HPAGE the Voting Engine works on.
[in] iiImage Index of the image in the page whose coordinate system you have used in defining the boundary box for LETTER.
[in] pLetter The letter buffer to be given to the engine.
[in] LettersLength Size of the letter buffer.
Return values:
RECERR
Note:
After putting letters on the selected levels (even on LTS_FINAL), you should call kRecRecognize.
More than one input of the PLUS engines can be replaced by subsequent calls to kRecSetLetters. Even in such a case, kRecRecognize should be called only once.
The following fields of input LETTERs are unused during kRecRecognize: cntChoices, ndxChoices, cntSuggestions, ndxSuggestions, reserved_b, ndxFGColor, ndxBGColor, ndxFontFace, ndxExt and OCRENGINE bits (RH_OCRENGINE_MASK) of info. These fields are cleared and the original order of LETTERs may be altered after using this function.
The specification of this function in C# is:
 RECERR kRecSetLetters(LETTERSTRENGTH towhere, IntPtr hPage, IMAGEINDEX iiImg, LETTER[] lpLetter); 
The specification of this function in Java is:
 int kRecSetLetters(LETTERSTRENGTH towhere, HPAGE hPage, IMAGEINDEX iiImage, LETTER[] pLetter)