Machine Learning classification and extraction roles

In Transact, machine learning classification and extraction are handled separately. The Classification Roles and Extraction Roles columns on the Document Types screen let the administrator assign roles for machine learning classification and machine learning extraction independently of each other. This feature is available in both Windows and Linux environments.

Roles selected for a Batch Class on the Batch Class Management screen are inherited by the Document Types added under this Batch Class. You can then edit the default roles populated for Classification and Extraction as required on the Document Types screen.
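The inheritance described above can be pictured with a small model. The following Python sketch is illustrative only: the BatchClass and DocumentType classes and their field names are hypothetical and are not part of any Transact API. It only shows the idea that a new Document Type starts with a copy of the Batch Class roles, which the administrator can then change per operation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DocumentType:
    # Hypothetical model of a Document Type row on the Document Types screen.
    name: str
    classification_roles: set[str] = field(default_factory=set)
    extraction_roles: set[str] = field(default_factory=set)


@dataclass
class BatchClass:
    # Hypothetical model of a Batch Class and the roles selected for it.
    name: str
    roles: set[str] = field(default_factory=set)
    document_types: list[DocumentType] = field(default_factory=list)

    def add_document_type(self, name: str) -> DocumentType:
        # A new Document Type starts with copies of the Batch Class roles.
        doc_type = DocumentType(name, set(self.roles), set(self.roles))
        self.document_types.append(doc_type)
        return doc_type


batch_class = BatchClass("Invoices", roles={"Role1", "Role2"})
invoice = batch_class.add_document_type("Invoice")
defaults = (set(invoice.classification_roles), set(invoice.extraction_roles))

# The administrator then edits the inherited defaults as required.
invoice.classification_roles = {"Role1"}
invoice.extraction_roles = {"Role2"}
```

Because the roles are copied rather than shared, editing the roles of one Document Type in this model leaves the Batch Class roles and other Document Types unchanged.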

When the Batch Class is copied, all the roles defined at the Batch Class level and the Document Type level are also copied.

When the Batch Class is exported, all the information about defined roles is also exported.

When the Batch Class is imported, you can choose whether to import it with or without the defined roles by using the Roles check box in the Import Batch Class popup window. If you select this check box, the Batch Class is imported along with its roles, and the Batch Class roles are inherited by the Document Types as well. If you leave the check box cleared, the Batch Class is imported without roles.
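As a rough illustration of the Roles check box, the sketch below models an exported Batch Class as a plain dictionary. The function name, the dictionary keys, and the data layout are all hypothetical and do not reflect the actual export format. The sketch also assumes that clearing the check box removes the Document Type roles together with the Batch Class roles, and that selecting it keeps the exported roles unchanged.

```python
import copy


def import_batch_class(exported: dict, import_roles: bool) -> dict:
    """Return a copy of the exported Batch Class, with or without its roles."""
    imported = copy.deepcopy(exported)  # never modify the exported data
    if not import_roles:
        # Assumption: a cleared check box drops roles at both levels.
        imported["roles"] = []
        for doc_type in imported["document_types"]:
            doc_type["classification_roles"] = []
            doc_type["extraction_roles"] = []
    return imported


exported = {
    "name": "Invoices",
    "roles": ["Role1", "Role2"],
    "document_types": [
        {
            "name": "Invoice",
            "classification_roles": ["Role1"],
            "extraction_roles": ["Role2"],
        }
    ],
}

with_roles = import_batch_class(exported, import_roles=True)
without_roles = import_batch_class(exported, import_roles=False)
```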

The roles assigned to a Document Type for Classification and Extraction are likewise copied, imported, and exported.

Assign roles for machine learning classification and extraction

To assign roles for machine learning classification and extraction, the administrator completes the following steps:

  1. Create a Batch Class and select required Roles from the Roles drop-down list on the Batch Class Management screen.
  2. Navigate to the Extraction module, add the MACHINE_LEARNING_BASED_EXTRACTON plugin and turn on the Machine Learning Based Extraction Switch.
  3. Navigate to the Document Types screen and create a new Document Type.

    The Classification and Extraction roles fields are automatically populated with the Roles assigned at the Batch Class level on the Batch Class Management screen.

  4. Change Classification and Extraction roles as required.

Next, consider the following scenarios:

  • Different Roles are assigned for machine learning classification and extraction.

  • The same Roles are assigned for machine learning classification and extraction.

  • Roles are assigned at the Document Type level, but no roles are assigned at the Batch Class level.

Different Roles assigned for machine learning classification and extraction

Suppose there are three Roles configured in the application (Role1, Role2, and Role3) and the administrator assigns only Role1 and Role2 at the Batch Class level.

In this case, the user with Role3 does not see the Batch Class or the batch instances of that Batch Class, unless a custom script grants the user permission to work with a specific batch or Batch Class.

Suppose the administrator has assigned Role1 for classification and Role2 for extraction at the Document Type level.

In this case, the user with Role2 is not able to perform machine learning classification and the user with Role1 is not able to perform machine learning extraction. However, both users can view the batch instance in the Review and Validate states.

All users with a role assigned at the Batch Class level can upload a batch, but only users whose role is assigned to classification or extraction at the Document Type level can perform that machine learning operation.
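The rules of this scenario can be summarized as a simple check. The Python sketch below is a hypothetical model, not Transact code: the function names and data structures are invented for illustration, and the custom script exception is not modeled. Only the role names and the message text are taken from this scenario.

```python
DENIED_MESSAGE = "Machine Learning of a document is not allowed for Current User"

# Roles from the scenario: Role1 and Role2 at the Batch Class level,
# Role1 for classification and Role2 for extraction at the Document Type level.
BATCH_CLASS_ROLES = {"Role1", "Role2"}
DOCUMENT_TYPE_ROLES = {
    "classification": {"Role1"},
    "extraction": {"Role2"},
}


def can_view_batch(user_roles: set) -> bool:
    # A user sees the batch if any of the user's roles is assigned to the Batch Class.
    return bool(user_roles & BATCH_CLASS_ROLES)


def learn(user_roles: set, operation: str) -> str:
    # operation is "classification" or "extraction".
    if not can_view_batch(user_roles):
        return "batch not visible"
    if user_roles & DOCUMENT_TYPE_ROLES[operation]:
        return "allowed"
    return DENIED_MESSAGE


role1_classification = learn({"Role1"}, "classification")
role1_extraction = learn({"Role1"}, "extraction")
role2_extraction = learn({"Role2"}, "extraction")
role3_classification = learn({"Role3"}, "classification")
```

In this model, the user with Role1 is allowed to classify but receives the message for extraction, the user with Role2 is allowed to extract, and the user with Role3 does not see the batch at all.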

Based on the above configurations, let us now log in as the operators with Role1 and Role2 in turn and verify the results.

  1. Log in as the user with Role1 and go to the Upload Batch screen.
  2. Select the Batch Class and upload the files by clicking the Select Files hyperlink.
  3. Click the Start Batch button to initiate the batch processing workflow.
  4. Go to the Batch List.

    The uploaded batch instance appears in the Review state if the documents are not classified correctly.

    If the documents are classified correctly, the batch instance goes directly to the Validation state. The batch instance stops at the Validation screen if you have applied Force Review or if any of the index fields are not extracted correctly.

  5. If the documents are not classified correctly, the user with Role1 must use machine learning to re-learn the documents:
    1. Select the Document Type for which you want to process the document.
    2. Click Learn Files from the More drop-down list.
    3. Assign First, Middle or Last Pages (if required) and click the Learn Files button to confirm the machine learning.

    The document is re-learned. If the user processes the same document again, it is classified into the defined Document Type.

    Suppose this batch also requires machine learning extraction because some of the index fields are not extracted correctly. In this case, the status of the batch changes to Ready for Validation, and the user can see it in the Validation section of the Batch List screen.

    If the user with Role1 tries to proceed with extraction, the following message appears: "Machine Learning of a document is not allowed for Current User". This happens because this user has permission to perform only machine learning classification of the document.

  6. Now, log in as the user with Role2 and navigate to the Batch List screen.

    The batch instance created by the user with Role1 is now ready for validation.

  7. Proceed to the Validation screen and perform machine learning extraction.

    If required, machine learning can be used for table extraction.

  8. Click the Validate button and confirm that validation has been completed.

The index fields are re-learned and the batch is processed. If the user processes the same document again, the values are successfully extracted.

If the user with Role2 tries to perform machine learning classification for any batch instance of this Batch Class, this user can proceed to the Review screen. However, the rules set up at the Document Type level prevent the user from classifying the document. When the user clicks the Review button, the following message appears: "Machine Learning of a document is not allowed for Current User".

This way, the administrator can assign roles separately for machine learning classification and extraction at the Document Type level.

Same Roles assigned for machine learning classification and extraction

Assume that the administrator assigns the same roles (Role1 and Role2) for both operations: machine learning classification and machine learning extraction.

In this case, both users can perform machine learning classification and extraction.

Roles assigned at the Document Type level

In this scenario, Roles are assigned at the Document Type level, but no roles are assigned at the Batch Class level.

Suppose the administrator does not assign any role for a Batch Class on the Batch Class Management screen.

However, at the Document Type level the administrator assigns Role1 and Role2 for both classification and extraction.

In this case, when users with Role1 and Role2 try to upload a batch on the Upload Batch screen, they do not see that particular Batch Class in the drop-down list in the top panel. They see only the Batch Classes that have been assigned to them on the Batch Class Management screen.

Also, users with Role1 and Role2 cannot work with batch instances created under this Batch Class and cannot see them on the Batch List screen, unless a custom script is deployed giving them permission to perform machine learning for a particular batch.
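To make this scenario concrete, the following Python sketch filters a set of Batch Classes the way the drop-down list is described to behave. It is a hypothetical model only: the function name, the Batch Class names, and the data layout are invented for illustration, and the custom script exception is deliberately left out.

```python
def visible_batch_classes(user_roles: set, batch_classes: dict) -> list:
    """Return the names of Batch Classes that share at least one role with the user."""
    return [
        name
        for name, batch_class_roles in batch_classes.items()
        if user_roles & batch_class_roles
    ]


# Hypothetical data: "Claims" has no Batch Class roles, even though its
# Document Types list Role1 and Role2 for classification and extraction.
batch_classes = {
    "Invoices": {"Role1", "Role2"},
    "Claims": set(),
}

visible_to_role1 = visible_batch_classes({"Role1"}, batch_classes)
```

In this model, Document Type roles never make a Batch Class visible on their own, which is why the roles assigned at the Batch Class level need to be in place first.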