Initialize and configure the layout classifier
Use the procedure in this topic to initialize the layout classifier so that you can use layout classification in your project.
The layout classifier is not initialized if the Reset button is disabled.
When you first create a project, the project has no layout classifier. This is true even if some classes have layout classification enabled. Layout classification is used for classes with documents that have a consistent layout. Forms and invoices are examples of documents successfully classified using layout classification.
Each project has one layout classifier. Any class that uses layout classification uses this classifier. Configure the layout classifier so it is suitable for your entire project.
Procedure
- On the Project tab, in the Configuration group, select Project Settings.
- Select the Classification tab to view the classification settings.
- In the Layout Classification group, select the Properties button for the Layout Classifier.
The Layout Classifier Properties window is displayed.
- From the Optimize Classification for setting, select one of the following values:
- Invoices: If this setting is selected, the classifier analyzes only the upper and lower parts of the document. The remainder of the document is not used for classification. This is especially useful for invoices, because they often have a preprinted header and footer area. It may also suit other types of business documents that have a similar structure. (Default: Selected)
- Forms: If this setting is selected, the classifier uses the entire region of the image. This can be used for forms and other types of documents that have a fixed layout over the entire region of the image.
- To access additional settings, select Advanced.
Additional settings are available.
- Configure the following advanced settings:
Enable skew tolerance
This setting is not needed if the processed documents are already deskewed by another application. For example, when you use VRS during scanning, there is no need to select this setting because VRS adjusts skewed images automatically. (Default: Selected)
Max. samples per class
The Layout Classifier supports an unlimited number of samples per class. If the sample images are very different, the Layout Classifier internally learns different patterns for each sample. For performance reasons, you may want to limit the number of sample documents that are used for feature extraction. A value of 0 means no limitation. (Default: 0)
Class homogeneity
This feature controls how sensitive the classifier is to variations in the layout of the images in the training set. If the sample images are very different, the Layout Classifier automatically creates internal patterns for each new type. These types are not visible to the user. (Default: 80.0)
More types improve classification accuracy, but each additional type slows classification. The value set by this control is a threshold that determines when new internal types are created. In most cases the default value works best.
Noise Filter
This feature controls how regions with low contrast are matched, for example in images that have a fine background pattern. (Default: 15.0)
A value closer to "max. precision" does not classify images with low contrast. This means that even documents from the training set do not have 100% confidence. Misclassified documents are less likely, which results in higher accuracy but more rejects.
A value closer to "max. recall" returns higher confidence values for documents with low contrast. However, this may mean high confidence values are determined for other classes with low contrast in the same region of the document, and may lead to a higher error rate. In most cases the default value works best.
- Select OK to save your changes and close the Layout Classifier Properties window.
The Layout Classifier Properties window is closed, the layout classifier is initialized, and the Reset button is enabled.
- Optional. Select OK to close the Project Settings window.
- Save the changes to your project.
- Optional. Test your classification settings.
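The following Python sketch is only an illustration of how a homogeneity threshold and a sample limit can behave. It is not the product's implementation or API. The names LayoutClassifierSettings, limit_samples, group_into_types, and toy_similarity are hypothetical, samples are reduced to single numbers, and the sketch assumes that a sample starts a new internal type when its similarity to every existing type falls below the Class homogeneity value. The defaults are the ones listed in this topic.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class LayoutClassifierSettings:
    """Hypothetical container for the settings described in this topic."""
    optimize_for: str = "Invoices"    # "Invoices" or "Forms"
    skew_tolerance: bool = True       # Default: Selected
    max_samples_per_class: int = 0    # 0 means no limitation
    class_homogeneity: float = 80.0   # threshold for new internal types
    noise_filter: float = 15.0


def limit_samples(samples: Sequence[float], max_samples: int) -> List[float]:
    """Keep at most max_samples samples; a value of 0 keeps all of them."""
    if max_samples == 0:
        return list(samples)
    return list(samples[:max_samples])


def group_into_types(
    samples: Sequence[float],
    similarity: Callable[[float, float], float],
    threshold: float,
) -> List[List[float]]:
    """Group samples into internal types (assumed behavior, not the product's).

    A sample joins the first type whose first member is at least
    `threshold` similar to it; otherwise it starts a new type.
    """
    types: List[List[float]] = []
    for sample in samples:
        for members in types:
            if similarity(sample, members[0]) >= threshold:
                members.append(sample)
                break
        else:
            types.append([sample])
    return types


def toy_similarity(a: float, b: float) -> float:
    """Toy similarity on a 0-100 scale; real layouts are not single numbers."""
    return max(0.0, 100.0 - abs(a - b))


settings = LayoutClassifierSettings()
samples = limit_samples([10, 12, 50, 52, 90], settings.max_samples_per_class)
types = group_into_types(samples, toy_similarity, settings.class_homogeneity)
print(len(types), "internal types:", types)
```

Under these assumptions, raising the threshold splits the same samples into more internal types, and lowering it merges them into fewer. This mirrors the trade-off between accuracy and speed described for Class homogeneity.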