Test classification

Use the Test Classification feature to verify that each document type is trained well enough to classify and separate multiple pages or multiple documents within a single uploaded file. All files are processed according to the settings in the batch class. All plugins that the DOCUMENT_ASSEMBLER requires to execute a particular configuration must be present in the batch class workflow. Classification defaults to the option selected in the DOCUMENT_ASSEMBLER plugin.

Perform test classification

To test document type classification:

  1. From the Batch Class Management screen, select your batch class and click Open.
  2. Go to Document Types.
  3. Click Upload Test Classification Files.
  4. Select your test classification files and click Open.

    The files are uploaded to Transact.

  5. Click Test Classification.
  6. Enable or disable Workflow according to your requirements.
    • ON: Classification Types are disabled, and test classification results are generated using the batch class configuration. This is the default option.

    • OFF: Results are generated based on the selected Classification Types.

  7. Click Classify.
  8. Review the results to verify that each document type has been separated correctly. Transact highlights the document or page rows where the confidence is lower than the confidence threshold for the document. A highlighted row indicates a document that requires manual validation in the Review or Validate modules.

    The classification results are shown in tabular form with the following details about the input images:

    • Classification Type: The type of classification performed on the set of inputs.

    • Document Type: The document type assigned to this input page.

    • Document Identifier: The identifier of the classified document.

    • Document Confidence: The confidence value generated for this document.

    • Page Name: The name of the page given as input for classification.

    • Page Identifier: The identifier of the page in accordance with the inputs provided.

    • Page Classification: The classification assigned to the page. This value combines the name of the document type the image was classified as with the name of the page type, such as FIRST_PAGE, MIDDLE_PAGE, or LAST_PAGE.

    • Page Confidence: The confidence generated for this page after classification.

    • Classification Sample: The name of the classification (learned) file from the lucene-search-classification-sample folders. This value is populated only for Search Classification. For all other classification methods, it is set to "NA".

  9. If required, click Download to download the resulting batch.xml structured output, which contains all processing details.

    To remove all input files and generated results from test-classification-folder, click Clear.
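
The highlighting rule in step 8 can be illustrated with a short sketch. This is not Transact code and does not read batch.xml. It only models the results table as a hypothetical ResultRow record whose fields mirror the columns listed above, and flags rows whose document or page confidence falls below the threshold for the document type. The function name, the threshold mapping, and all sample values (identifiers, type names, confidence numbers, and the format of the page classification string) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ResultRow:
    """Hypothetical record; fields mirror the result table columns."""
    classification_type: str
    document_type: str
    document_identifier: str
    document_confidence: float
    page_name: str
    page_identifier: str
    page_classification: str
    page_confidence: float
    classification_sample: str = "NA"  # "NA" unless Search Classification


def rows_needing_review(rows, thresholds):
    """Return rows whose document or page confidence is below the
    threshold configured for the row's document type.

    `thresholds` maps a document type name to its confidence threshold
    (an assumed structure, not a Transact API).
    """
    flagged = []
    for row in rows:
        threshold = thresholds[row.document_type]
        if row.document_confidence < threshold or row.page_confidence < threshold:
            flagged.append(row)
    return flagged


# Made-up sample data, for illustration only.
sample_thresholds = {"Invoice": 50.0}
sample_rows = [
    ResultRow("Search Classification", "Invoice", "DOC1", 82.5,
              "page_0.tif", "PG0", "Invoice FIRST_PAGE", 82.5, "sample_a"),
    ResultRow("Search Classification", "Invoice", "DOC1", 82.5,
              "page_1.tif", "PG1", "Invoice LAST_PAGE", 31.0, "sample_b"),
]

flagged = rows_needing_review(sample_rows, sample_thresholds)
flagged_page_ids = [row.page_identifier for row in flagged]
```

In this sample, only the second page falls below the assumed threshold of 50, so it is the only row a reviewer would need to check by hand.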