Initialize and configure the layout classifier

Use this procedure to initialize the layout classifier so that you can use layout classification in your project.

The layout classifier is not initialized if the Reset button is disabled.

When you first create a project, the project has no layout classifier. This is true even if some classes have layout classification enabled. Layout classification is used for classes with documents that have a consistent layout. Forms and invoices are examples of documents successfully classified using layout classification.

Each project has one layout classifier. Any class that uses layout classification uses this classifier. Configure the layout classifier so it is suitable for your entire project.

Procedure

  1. On the Project tab, in the Configuration group, select Project Settings.
  2. Select the Classification tab to view the classification settings.
  3. In the Layout Classification group, select the Properties button for the Layout Classifier.

    The Layout Classifier Properties window is displayed.

  4. From the Optimize Classification for setting, select one of the following values:
    Invoices:

    If this setting is selected, the classifier analyzes only the upper and lower parts of the document. The remainder of the document is not used for classification. This is especially useful for invoices, because they often have a preprinted header and footer area. It may also apply to other types of business documents that have a similar structure. (Default: Selected)

    Forms:

    If this setting is selected, the classifier uses the entire region of the image. This can be used for forms and other types of documents that have a fixed layout over the entire region of the image.

  5. To access additional settings, select Advanced.

    The additional settings are displayed.

  6. Configure the following advanced settings:

    Enable skew tolerance

    This setting is not needed if the processed documents are already deskewed by some other application. For example, when using VRS during scanning, there is no need to select this setting because VRS adjusts skewed images automatically. (Default: Selected)

    Max. samples per class

    The Layout Classifier supports an unlimited number of samples per class. If the sample images are very different, the Layout Classifier internally learns different patterns for each sample. For performance reasons, you may want to limit the number of sample documents that are used for feature extraction. A value of 0 means no limitation. (Default: 0)

    Class homogeneity

    This feature controls how sensitive the classifier is to variations in the layout of the images in the training set. If the sample images are very different, the Layout Classifier automatically creates internal patterns for each new type. These types are not visible to the user. (Default: 80.0)

    More types improve classification accuracy, but each additional type slows classification. The value set by this control is a threshold that determines when new internal types are created. In most cases the default value works best.

    Noise Filter

    This feature controls how regions with low contrast are matched, for example, in images that have a fine background pattern. (Default: 15.0)

    With a value closer to "max. precision", images with low contrast are not classified. This means that even documents from the training set do not reach 100% confidence. Misclassified documents are less likely, resulting in higher accuracy, but there are more rejects.

    A value closer to "max. recall" returns higher confidence values for documents with low contrast. However, this may mean high confidence values are determined for other classes with low contrast in the same region of the document, and may lead to a higher error rate. In most cases the default value works best.

  7. Select OK to save your changes and close the Layout Classifier Properties window.

    The Layout Classifier Properties window is closed, the layout classifier is initialized, and the Reset button is enabled.

  8. Optional. Select OK to close the Project Settings window.
  9. Save the changes to your project.
  10. Optional. Test your classification settings.
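
The settings and defaults from this procedure can be summarized in a short Python sketch, which may help when you record or review a project's configuration. This is an illustration only and not part of the product: the class, field, and function names are hypothetical, and only the default values and the rule that 0 means no limitation come from this topic. How the Layout Classifier actually chooses samples when a limit is set is not described here, so keeping the first samples is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Sequence, TypeVar

T = TypeVar("T")


class OptimizeFor(Enum):
    """Values of the 'Optimize Classification for' setting."""
    INVOICES = "Invoices"  # only the upper and lower parts of the document are analyzed
    FORMS = "Forms"        # the entire region of the image is analyzed


@dataclass(frozen=True)
class LayoutClassifierSettings:
    """Hypothetical record of the settings; defaults are taken from this topic."""
    optimize_for: OptimizeFor = OptimizeFor.INVOICES
    enable_skew_tolerance: bool = True
    max_samples_per_class: int = 0   # 0 means no limitation
    class_homogeneity: float = 80.0
    noise_filter: float = 15.0

    def __post_init__(self) -> None:
        # A negative limit has no meaning in this sketch, so reject it early.
        if self.max_samples_per_class < 0:
            raise ValueError("max_samples_per_class must be 0 (no limit) or positive")


def limit_samples(samples: Sequence[T], settings: LayoutClassifierSettings) -> List[T]:
    """Apply the 'Max. samples per class' rule to one class's samples.

    Assumption: when a limit is set, the first samples are kept.
    """
    limit = settings.max_samples_per_class
    if limit == 0:
        return list(samples)
    return list(samples[:limit])
```

With the default settings, `limit_samples` returns every sample. With `LayoutClassifierSettings(max_samples_per_class=2)`, it returns only the first two.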