RecAPI
Document Classifier Module

Document Classifier API.

Classes

struct  CLASSIFY_INFO
 Structure for information about classification.
 

Typedefs

typedef struct RECDCSTRUCT * DCHANDLE
 Handle of a Document Classifier object.
 

Functions

RECERR RECAPIKRN kRecOpenDCProject (int sid, LPCTSTR pDCProjectFile, DCHANDLE *phDCProject)
 Opening Document Classifier Project File.
 
RECERR RECAPIKRN kRecCloseDCProject (DCHANDLE hDCProject)
 Closing a Document Classifier Project.
 
RECERR RECAPIKRN kRecGetFirstDCClass (DCHANDLE hDCProject, DCHANDLE *phDCClass)
 Starting enumeration of Document Classes.
 
RECERR RECAPIKRN kRecGetNextDCClass (DCHANDLE hDCPrevClass, DCHANDLE *phDCClass)
 Performing enumeration of Document Classes.
 
RECERR RECAPIKRN kRecClassifyPage (int sid, DCHANDLE hDCProject, HPAGE hPage, DCHANDLE *phDCPredictedClass, unsigned *pConfidenceLevel, CLASSIFY_INFO **pClassifyInfo, LPLONG pLength, INTBOOL *pIsConfident)
 Classifying a page.
 
RECERR RECAPIKRN kRecClassifyText (int sid, DCHANDLE hDCProject, LPCTSTR pText, DCHANDLE *phDCPredictedClass, unsigned *pConfidenceLevel, CLASSIFY_INFO **pClassifyInfo, LPLONG pLength, INTBOOL *pIsConfident)
 Classifying text.
 
RECERR RECAPIKRN kRecClassifyDocument (int sid, DCHANDLE hDCProject, LPCTSTR pFileName, int iPage, DCHANDLE *phDCPredictedClass, unsigned *pConfidenceLevel, CLASSIFY_INFO **pClassifyInfo, LPLONG pLength, INTBOOL *pIsConfident)
 Classifying the given page of a document.
 
RECERR RECAPIKRN kRecGetDCClassName (DCHANDLE hDCClass, LPTSTR *ppName)
 Returning the name of a Document Class.
 
RECERR RECAPIKRN kRecSetDCConfidenceThreshold (DCHANDLE hDCProject, int ConfidenceThreshold)
 Set the confidence threshold of a Document Classifier Project.
 
RECERR RECAPIKRN kRecGetDCConfidenceThreshold (DCHANDLE hDCProject, int *pConfidenceThreshold)
 Get the confidence threshold of a Document Classifier Project.
 

Detailed Description

Document Classifier API.

For a detailed description of this module, see its separate documentation: https://docshield.tungstenautomation.com/OmniPageCaptureSDK/en_US/2025.1.0-m7NwYtqyAo/help/OmniPageCapture_SDKdocumentclassificationassistant/c_Welcome.html

Function Documentation

◆ kRecClassifyDocument()

RECERR RECAPIKRN kRecClassifyDocument ( int sid,
DCHANDLE hDCProject,
LPCTSTR pFileName,
int iPage,
DCHANDLE * phDCPredictedClass,
unsigned * pConfidenceLevel,
CLASSIFY_INFO ** pClassifyInfo,
LPLONG pLength,
INTBOOL * pIsConfident )

Classifying the given page of a document.

This function classifies a document or the given page of the document. The document can contain scanned pages, one page from a PDF file or plain text.

Parameters
[in] sid: Settings Collection ID.
[in] hDCProject: Handle of the Document Classifier Project returned by kRecOpenDCProject.
[in] pFileName: Name of the file containing the document. It can be an image file, a PDF, or a text file.
[in] iPage: The page number of the page to be processed. This parameter is not used if the input file is a text file.
[out] phDCPredictedClass: Address of a variable to store the handle of the predicted Document Class. The returned handle can be NULL.
[out] pConfidenceLevel: Address of a variable to store the confidence of the prediction. The returned value is between 0 and 100.
[out] pClassifyInfo: Address of a variable to store the array of CLASSIFY_INFO structures that describe the classification.
[out] pLength: Address of a variable to store the length of the array returned in pClassifyInfo. This is equal to the number of classes.
[out] pIsConfident: Address of a variable that receives whether the classification is confident. The returned value is TRUE if the confidence of the prediction is greater than or equal to the preset confidence threshold.
Return values
RECERR
Note
Use kRecOpenDCProject to open a Document Classifier Project File and obtain a handle.
This function decides if the input is an image file, PDF or text file, based on the filename extension. DC_UNKNOWNEXTENSION_ERR is returned if the extension is unknown.
If the input is an image file or PDF, the function loads and preprocesses it. If text based classification is enabled, the image is recognized as well.
If the input is a text file (i.e. the filename extension is .txt), only text based classification is possible. The program supports the following text encodings: Unicode (both UTF-16 and UTF-8, with or without Byte Order Mark) and non-Unicode text encoded with Windows default codepage (as set in the Control Panel > Region and Language > Administrative pane > Change system locale).
The function returns the handle of the predicted class and the confidence of the prediction. You can query the name of the class with kRecGetDCClassName. The function also returns an array of CLASSIFY_INFO structures (pClassifyInfo). The length of the array equals the number of defined classes and is returned in pLength. The array contains the confidence levels for each class. The confidence threshold can be defined with Document Classifier Assistant. It is stored in the Document Classifier Project File, and can be queried (kRecGetDCConfidenceThreshold) and changed (kRecSetDCConfidenceThreshold) after the project is loaded.
The array returned in pClassifyInfo should be released using kRecFree.
The specification of this function in C# is:
RECERR kRecClassifyDocument(int sid, IntPtr hDCProject, string pFileName, int nPage, out IntPtr phDCBestClass, out UInt32 confidence, out CLASSIFY_INFO[] pClassifyInfo, out bool isConfident);
The specification of this function in Java is:
int kRecClassifyDocument(int sid, DCHANDLE hDCProject, String pFileName, int iPage, DCHANDLE phDCPredictedClass, long[] pConfidenceLevel, ClassifyInfoArray pClassifyInfo, int[] pIsConfident)
The specification of this function in Python is:
def kRecClassifyDocument(sid: int, hDCProject: "DCHANDLE", pFileName: str, iPage: int) -> Tuple[int, "DCHANDLE", int, "ClassifyInfoArray", bool]
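A minimal sketch of calling kRecClassifyDocument through the Python binding, following the tuple layout of the Python specification above. The import name of the binding module is not given here, so it is passed in as `api`. `Prediction` and `classify_file` are hypothetical names, and treating a return code of 0 as success is an assumption.

```python
from typing import Any, NamedTuple


class Prediction(NamedTuple):
    """Hypothetical container for the values kRecClassifyDocument returns."""
    predicted_class: Any   # DCHANDLE; may be empty (NULL) per the parameter notes
    confidence: int        # 0..100
    per_class_info: Any    # ClassifyInfoArray, one entry per defined class
    is_confident: bool     # True when confidence >= the project threshold


def classify_file(api: Any, sid: int, project: Any, file_name: str, page: int) -> Prediction:
    """Classify one page of a file. `api` is the SDK binding module (passed in)."""
    rc, predicted, confidence, infos, is_confident = api.kRecClassifyDocument(
        sid, project, file_name, page
    )
    if rc != 0:  # assumption: 0 is the success code
        raise RuntimeError(f"kRecClassifyDocument failed with RECERR {rc}")
    return Prediction(predicted, confidence, infos, bool(is_confident))
```

In the C API the array returned in pClassifyInfo still has to be released with kRecFree; whether the Python binding does this automatically is not stated here.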

◆ kRecClassifyPage()

RECERR RECAPIKRN kRecClassifyPage ( int sid,
DCHANDLE hDCProject,
HPAGE hPage,
DCHANDLE * phDCPredictedClass,
unsigned * pConfidenceLevel,
CLASSIFY_INFO ** pClassifyInfo,
LPLONG pLength,
INTBOOL * pIsConfident )

Classifying a page.

This function classifies the given HPAGE.

Parameters
[in] sid: Settings Collection ID.
[in] hDCProject: Handle of the Document Classifier Project returned by kRecOpenDCProject.
[in] hPage: Handle of the page to be classified.
[out] phDCPredictedClass: Address of a variable to store the handle of the predicted Document Class. The returned handle can be NULL.
[out] pConfidenceLevel: Address of a variable to store the confidence of the prediction. The returned value is between 0 and 100.
[out] pClassifyInfo: Address of a variable to store the array of CLASSIFY_INFO structures that describe the classification.
[out] pLength: Address of a variable to store the length of the array returned in pClassifyInfo. This is equal to the number of classes.
[out] pIsConfident: Address of a variable that receives whether the classification is confident. The returned value is TRUE if the confidence of the prediction is greater than or equal to the preset confidence threshold.
Return values
RECERR
Note
If the classifier method (defined in the Document Classifier Project) is Text or Combined, the function recognizes the image unless hPage already contains letters on entry. The language of the recognition is defined in the Document Classifier Project. On return, hPage contains the result of recognition (OCR zones, letters).
The function returns the handle of the predicted class and the confidence of the prediction. The function also returns an array of CLASSIFY_INFO structures (pClassifyInfo). The length of the array equals the number of defined classes and is returned in pLength. The array contains the confidence levels for each class. The confidence threshold can be defined with Document Classifier Assistant. It is stored in the Document Classifier Project File, and can be queried (kRecGetDCConfidenceThreshold) and changed (kRecSetDCConfidenceThreshold) after the project is loaded.
The array returned in pClassifyInfo should be released using kRecFree.
The specification of this function in C# is:
RECERR kRecClassifyPage(int sid, IntPtr hDCProject, IntPtr hPage, out IntPtr bestClass, out UInt32 confidence, out CLASSIFY_INFO[] pClassifyInfos, out bool isClassfied);
The specification of this function in Java is:
int kRecClassifyPage(int sid, DCHANDLE hDCProject, HPAGE hPage, DCHANDLE phDCPredictedClass, long[] pConfidenceLevel, ClassifyInfoArray pClassifyInfo, int[] pIsConfident)
The specification of this function in Python is:
def kRecClassifyPage(sid: int, hDCProject: "DCHANDLE", hPage: "HPAGE") -> Tuple[int, "DCHANDLE", int, "ClassifyInfoArray", bool]
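As an illustration of how the pIsConfident flag might be used, the sketch below classifies a batch of page handles and separates confident predictions from pages that need manual review. It relies only on the tuple layout of the Python specification above. `api` stands for the binding module (passed in because its import name is not given here), `split_by_confidence` is a hypothetical helper, and both the 0 success code and a None value for a NULL class handle are assumptions.

```python
from typing import Any, Iterable, List, Tuple

# (page handle, predicted class handle, confidence level)
Result = Tuple[Any, Any, int]


def split_by_confidence(
    api: Any, sid: int, project: Any, pages: Iterable[Any]
) -> Tuple[List[Result], List[Result]]:
    """Classify each page and route it to `accepted` or `needs_review`."""
    accepted: List[Result] = []
    needs_review: List[Result] = []
    for page in pages:
        rc, predicted, confidence, _infos, is_confident = api.kRecClassifyPage(
            sid, project, page
        )
        if rc != 0:  # assumption: 0 is the success code
            raise RuntimeError(f"kRecClassifyPage failed with RECERR {rc}")
        # The predicted handle can be NULL, so a missing class also goes to review.
        confident = bool(is_confident) and predicted is not None
        (accepted if confident else needs_review).append((page, predicted, confidence))
    return accepted, needs_review
```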

◆ kRecClassifyText()

RECERR RECAPIKRN kRecClassifyText ( int sid,
DCHANDLE hDCProject,
LPCTSTR pText,
DCHANDLE * phDCPredictedClass,
unsigned * pConfidenceLevel,
CLASSIFY_INFO ** pClassifyInfo,
LPLONG pLength,
INTBOOL * pIsConfident )

Classifying text.

This function classifies the given text.

Parameters
[in] sid: Settings Collection ID.
[in] hDCProject: Handle of the Document Classifier Project returned by kRecOpenDCProject.
[in] pText: NULL-terminated text to be classified.
[out] phDCPredictedClass: Address of a variable to store the handle of the predicted Document Class. The returned handle can be NULL.
[out] pConfidenceLevel: Address of a variable to store the confidence of the prediction. The returned value is between 0 and 100.
[out] pClassifyInfo: Address of a variable to store the array of CLASSIFY_INFO structures that describe the classification.
[out] pLength: Address of a variable to store the length of the array returned in pClassifyInfo. This is equal to the number of classes.
[out] pIsConfident: Address of a variable that receives whether the classification is confident. The returned value is TRUE if the confidence of the prediction is greater than or equal to the preset confidence threshold.
Return values
RECERR
Note
Use kRecOpenDCProject to open a Document Classifier Project File and obtain a handle.
The function returns the handle of the predicted class and the confidence of the prediction. The function also returns an array of CLASSIFY_INFO structures (pClassifyInfo). The length of the array equals the number of defined classes and is returned in pLength. The array contains the confidence levels for each class. The confidence threshold can be defined with Document Classifier Assistant. It is stored in the Document Classifier Project File, and can be queried (kRecGetDCConfidenceThreshold) and changed (kRecSetDCConfidenceThreshold) after the project is loaded.
The array returned in pClassifyInfo should be released using kRecFree.
The specification of this function in C# is:
RECERR kRecClassifyText(int sid, IntPtr hDCProject, string pText, out IntPtr phDCBestClass, out UInt32 confidence, out CLASSIFY_INFO[] pClassifyInfo, out bool isConfident);
The specification of this function in Java is:
int kRecClassifyText(int sid, DCHANDLE hDCProject, String pText, DCHANDLE phDCPredictedClass, long[] pConfidenceLevel, ClassifyInfoArray pClassifyInfo, int[] pIsConfident)
The specification of this function in Python is:
def kRecClassifyText(sid: int, hDCProject: "DCHANDLE", pText: str) -> Tuple[int, "DCHANDLE", int, "ClassifyInfoArray", bool]
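A short sketch of classifying a string and resolving the predicted handle to a class name with kRecGetDCClassName, based on the Python specifications in this module. `api` is the binding module passed in by the caller, `predicted_class_name` is a hypothetical helper, and the sketch assumes that 0 is the success code and that a NULL handle surfaces as None in Python.

```python
from typing import Any, Optional, Tuple


def predicted_class_name(
    api: Any, sid: int, project: Any, text: str
) -> Tuple[Optional[str], int, bool]:
    """Return (class name or None, confidence 0..100, is_confident) for `text`."""
    rc, predicted, confidence, _infos, is_confident = api.kRecClassifyText(
        sid, project, text
    )
    if rc != 0:  # assumption: 0 is the success code
        raise RuntimeError(f"kRecClassifyText failed with RECERR {rc}")
    if predicted is None:  # the returned handle can be NULL
        return None, confidence, bool(is_confident)
    rc, name = api.kRecGetDCClassName(predicted)
    if rc != 0:
        raise RuntimeError(f"kRecGetDCClassName failed with RECERR {rc}")
    return name, confidence, bool(is_confident)
```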

◆ kRecCloseDCProject()

RECERR RECAPIKRN kRecCloseDCProject ( DCHANDLE hDCProject)

Closing a Document Classifier Project.

This function closes a Document Classifier Project opened by kRecOpenDCProject.

Parameters
[in] hDCProject: Handle of the Document Classifier Project.
Return values
RECERR
Note
The specification of this function in C# is:
RECERR kRecCloseDCProject(IntPtr hDCProject);
The specification of this function in Java is:
int kRecCloseDCProject(DCHANDLE hDCProject)
The specification of this function in Python is:
def kRecCloseDCProject(hDCProject: "DCHANDLE") -> int

◆ kRecGetDCClassName()

RECERR RECAPIKRN kRecGetDCClassName ( DCHANDLE hDCClass,
LPTSTR * ppName )

Returning the name of a Document Class.

This function returns the name of a Document Class.

Parameters
[in] hDCClass: Handle of the Document Class.
[out] ppName: Address of a variable to store the name of the Document Class.
Return values
RECERR
Note
Use this function to obtain the name of the Document Class.
The specification of this function in C# is:
RECERR kRecGetDCClassName(IntPtr hDCClass, out string ppName)
The specification of this function in Java is:
int kRecGetDCClassName(DCHANDLE hDCClass, String[] ppName)
The specification of this function in Python is:
def kRecGetDCClassName(hDCClass: "DCHANDLE") -> Tuple[int, str]

◆ kRecGetDCConfidenceThreshold()

RECERR RECAPIKRN kRecGetDCConfidenceThreshold ( DCHANDLE hDCProject,
int * pConfidenceThreshold )

Get the confidence threshold of a Document Classifier Project.

kRecGetDCConfidenceThreshold returns the confidence threshold of the given Document Classifier Project.

Parameters
[in] hDCProject: Handle of the Document Classifier Project returned by kRecOpenDCProject.
[out] pConfidenceThreshold: Address of an integer variable that receives the confidence threshold.
Note
The confidence threshold is a number between 0 and 100. It can be set with Document Classifier Assistant during the Training and Testing Process, and is stored in the Document Classifier Project File. The threshold can be queried and changed after the Document Classifier Project File is loaded.
The specification of this function in C# is:
RECERR kRecGetDCConfidenceThreshold(IntPtr hDCProject, out int ConfidenceThreshold);
The specification of this function in Java is:
int kRecGetDCConfidenceThreshold(DCHANDLE hDCProject, int[] pConfidenceThreshold)
The specification of this function in Python is:
def kRecGetDCConfidenceThreshold(hDCProject: "DCHANDLE") -> Tuple[int, int]
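The classification functions report a prediction as confident when its confidence is greater than or equal to the project threshold. The sketch below reproduces that comparison on the caller's side using the Python specification above. `api` is the binding module passed in by the caller, `meets_threshold` is a hypothetical helper, and the 0 success code is an assumption.

```python
from typing import Any


def meets_threshold(api: Any, project: Any, confidence: int) -> bool:
    """Apply the documented rule: confident when confidence >= threshold."""
    rc, threshold = api.kRecGetDCConfidenceThreshold(project)
    if rc != 0:  # assumption: 0 is the success code
        raise RuntimeError(f"kRecGetDCConfidenceThreshold failed with RECERR {rc}")
    return confidence >= threshold
```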

◆ kRecGetFirstDCClass()

RECERR RECAPIKRN kRecGetFirstDCClass ( DCHANDLE hDCProject,
DCHANDLE * phDCClass )

Starting enumeration of Document Classes.

This function returns the handle of the first Document Class of the given project.

Parameters
[in] hDCProject: Handle of the Document Classifier Project.
[out] phDCClass: Address of a variable to store the handle of the first Document Class.
Return values
RECERR
Note
The Document Classes can be queried using the kRecGetFirstDCClass and kRecGetNextDCClass function-pair.
The name of the class can be queried by kRecGetDCClassName().
The specification of this function in C# is:
RECERR kRecGetFirstDCClass(IntPtr hDCProject, out IntPtr hDCClass);
The specification of this function in Java is:
int kRecGetFirstDCClass(DCHANDLE hDCProject, DCHANDLE phDCClass)
The specification of this function in Python is:
def kRecGetFirstDCClass(hDCProject: "DCHANDLE") -> Tuple[int, "DCHANDLE"]

◆ kRecGetNextDCClass()

RECERR RECAPIKRN kRecGetNextDCClass ( DCHANDLE hDCPrevClass,
DCHANDLE * phDCClass )

Performing enumeration of Document Classes.

This function returns the handle of the next Document Class of the given project.

Parameters
[in] hDCPrevClass: Handle of the previous Document Class.
[out] phDCClass: Address of a variable to store the handle of the next Document Class.
Return values
RECERR
Note
The Document Classes can be queried using the kRecGetFirstDCClass and kRecGetNextDCClass function-pair.
The name of the class can be queried by kRecGetDCClassName().
The specification of this function in C# is:
RECERR kRecGetNextDCClass(IntPtr hDCPrevClass, out IntPtr hDCClass);
The specification of this function in Java is:
int kRecGetNextDCClass(DCHANDLE hDCPrevClass, DCHANDLE phDCClass)
The specification of this function in Python is:
def kRecGetNextDCClass(hDCPrevClass: "DCHANDLE") -> Tuple[int, "DCHANDLE"]
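The kRecGetFirstDCClass and kRecGetNextDCClass pair can be wrapped in a loop that collects class names, as in the sketch below. How the end of the enumeration is signaled is not stated in this section, so the loop stops on a non-zero return code or an empty (None) handle; both stop conditions, and 0 as the success code, are assumptions. `api` is the binding module passed in by the caller and `list_class_names` is a hypothetical helper.

```python
from typing import Any, List


def list_class_names(api: Any, project: Any) -> List[str]:
    """Collect the names of all Document Classes in a project."""
    names: List[str] = []
    rc, handle = api.kRecGetFirstDCClass(project)
    # Assumed end-of-enumeration signal: non-zero code or an empty handle.
    while rc == 0 and handle is not None:
        name_rc, name = api.kRecGetDCClassName(handle)
        if name_rc != 0:
            raise RuntimeError(f"kRecGetDCClassName failed with RECERR {name_rc}")
        names.append(name)
        rc, handle = api.kRecGetNextDCClass(handle)
    return names
```

Because the stop condition is a guess, a real caller would likely need to distinguish a normal end of the list from a genuine error code.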

◆ kRecOpenDCProject()

RECERR RECAPIKRN kRecOpenDCProject ( int sid,
LPCTSTR pDCProjectFile,
DCHANDLE * phDCProject )

Opening Document Classifier Project File.

kRecOpenDCProject opens a Document Classifier Project File (*.dcp).

Parameters
[in] sid: Settings Collection ID.
[in] pDCProjectFile: Path to the Project File.
[out] phDCProject: Address of a variable to store the handle of the Document Classifier Project.
Return values
RECERR
Note
Use the Document Classifier Assistant to create, train and test a Document Classifier Project. Document Classifier Assistant lets you define classes, add training and test documents to the classes, and train and test the document classifier. After the Training and Testing Process you can export a Document Classifier Project File, which contains all the information necessary to perform classification. CSDK provides an API (the Document Classifier API) for loading the Document Classifier Project File and classifying documents.
When the project is no longer needed, close it by calling the kRecCloseDCProject function.
The specification of this function in C# is:
RECERR kRecOpenDCProject(int sid, string pDCProjectFile, out IntPtr hDCProject);
The specification of this function in Java is:
int kRecOpenDCProject(int sid, String pDCProjectFile, DCHANDLE hDCProject)
The specification of this function in Python is:
def kRecOpenDCProject(sid: int, pDCProjectFile: str) -> Tuple[int, "DCHANDLE"]
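One way to make sure an opened project is always closed is to pair kRecOpenDCProject and kRecCloseDCProject in a context manager, sketched below from the Python specifications in this module. `api` is the binding module passed in by the caller, `open_dc_project` is a hypothetical helper, and the 0 success code is an assumption.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def open_dc_project(api: Any, sid: int, project_file: str) -> Iterator[Any]:
    """Open a .dcp project and close it when the `with` block exits."""
    rc, project = api.kRecOpenDCProject(sid, project_file)
    if rc != 0:  # assumption: 0 is the success code
        raise RuntimeError(f"kRecOpenDCProject failed with RECERR {rc}")
    try:
        yield project
    finally:
        # Runs on normal exit and on exceptions raised inside the block.
        api.kRecCloseDCProject(project)
```

A caller would write `with open_dc_project(api, sid, path) as project:` and run the classification calls inside the block.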

◆ kRecSetDCConfidenceThreshold()

RECERR RECAPIKRN kRecSetDCConfidenceThreshold ( DCHANDLE hDCProject,
int ConfidenceThreshold )

Set the confidence threshold of a Document Classifier Project.

kRecSetDCConfidenceThreshold sets the confidence threshold of the given Document Classifier Project.

Parameters
[in] hDCProject: Handle of the Document Classifier Project returned by kRecOpenDCProject.
[in] ConfidenceThreshold: The new value of the confidence threshold (a number between 0 and 100).
Note
The confidence threshold is a number between 0 and 100. It can be set with Document Classifier Assistant during the Training and Testing Process, and is stored in the Document Classifier Project File. The threshold can be queried and changed after the Document Classifier Project File is loaded.
The specification of this function in C# is:
RECERR kRecSetDCConfidenceThreshold(IntPtr hDCProject, int ConfidenceThreshold);
The specification of this function in Java is:
int kRecSetDCConfidenceThreshold(DCHANDLE hDCProject, int ConfidenceThreshold)
The specification of this function in Python is:
def kRecSetDCConfidenceThreshold(hDCProject: "DCHANDLE", ConfidenceThreshold: int) -> int
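A small sketch of changing the threshold and reading it back, using the Python specifications of kRecSetDCConfidenceThreshold and kRecGetDCConfidenceThreshold. The range check reflects the documented 0 to 100 range; how the SDK itself reacts to out-of-range values is not stated here, so rejecting them up front is a choice made for this example. `api` is the binding module passed in by the caller, `set_threshold` is a hypothetical helper, and the 0 success code is an assumption.

```python
from typing import Any


def set_threshold(api: Any, project: Any, threshold: int) -> int:
    """Set the project's confidence threshold and return the value read back."""
    if not 0 <= threshold <= 100:
        raise ValueError("confidence threshold must be between 0 and 100")
    rc = api.kRecSetDCConfidenceThreshold(project, threshold)
    if rc != 0:  # assumption: 0 is the success code
        raise RuntimeError(f"kRecSetDCConfidenceThreshold failed with RECERR {rc}")
    rc, current = api.kRecGetDCConfidenceThreshold(project)
    if rc != 0:
        raise RuntimeError(f"kRecGetDCConfidenceThreshold failed with RECERR {rc}")
    return current
```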