Difference between Document Assembler and Advanced Document Assembler

When the Advanced DA Switch is enabled in the DOCUMENT_ASSEMBLER plugin, the plugin runs using the ADVANCED_DOCUMENT_ASSEMBLER algorithm.

The main difference between the two algorithms is their separation method:

Document Assembler Algorithm
Looks at the highest confidence value for each page. When it finds a "first page", it starts a new document.
Advanced Document Assembler Algorithm
Applies a proprietary algorithm that uses forward and reverse page-level look-aheads and look-behinds across all alternate values. Decisions are based on every permutation of pages and on the alternate value information in the .xml file.
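
The simpler Document Assembler separation method can be illustrated with a short sketch. This is not the plugin's actual code: the function name assemble_documents, the per-page dictionary of confidence values, and the page-type labels "first", "middle", and "last" are all assumptions made for illustration. The Advanced Document Assembler algorithm is proprietary and is not sketched here.

```python
from typing import Dict, List


def assemble_documents(pages: List[Dict[str, float]]) -> List[List[int]]:
    """Group page indexes into documents.

    Each page is a hypothetical dict mapping a page-type label
    ("first", "middle", "last") to a confidence value.
    """
    documents: List[List[int]] = []
    for index, scores in enumerate(pages):
        # Look only at the highest-confidence value for this page.
        best_type = max(scores, key=scores.get)
        # A "first page" starts a new document. The first page of the
        # batch also starts one (an assumption, so no page is left ungrouped).
        if best_type == "first" or not documents:
            documents.append([])
        documents[-1].append(index)
    return documents


# Hypothetical batch of four pages with illustrative confidence values.
batch = [
    {"first": 90.0, "middle": 10.0, "last": 5.0},
    {"first": 20.0, "middle": 70.0, "last": 10.0},
    {"first": 10.0, "middle": 20.0, "last": 80.0},
    {"first": 85.0, "middle": 5.0, "last": 30.0},
]
print(assemble_documents(batch))
```

Because only the single best value per page is consulted, one misclassified page can split or merge documents incorrectly, which is the limitation the advanced algorithm addresses by considering alternate values.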

Both algorithms use the same weighting factors and classification method to generate document classification confidence scores. The weighting factors are as follows:

  • DA Rule first-middle-last page: 100

  • DA Rule first-page: 50

  • DA Rule middle-page: 25

  • DA Rule last-page: 50

  • DA Rule first-last page: 75

  • DA Rule first-middle page: 50

  • DA Rule middle-last page: 50
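
The weights above can be written as a lookup table, as in the sketch below. It assumes that each rule is selected by the set of page types found in a document; that reading, and the names DA_RULE_WEIGHTS and rule_weight, are illustrative assumptions. How the plugin combines a weight with page confidence values is not described here, so the sketch only looks the weight up.

```python
from typing import Dict, FrozenSet, Iterable

# Weights copied from the list above, keyed by the (assumed) set of
# page types present in a document.
DA_RULE_WEIGHTS: Dict[FrozenSet[str], int] = {
    frozenset({"first", "middle", "last"}): 100,
    frozenset({"first"}): 50,
    frozenset({"middle"}): 25,
    frozenset({"last"}): 50,
    frozenset({"first", "last"}): 75,
    frozenset({"first", "middle"}): 50,
    frozenset({"middle", "last"}): 50,
}


def rule_weight(page_types: Iterable[str]) -> int:
    """Return the weight for the rule matching the page types present."""
    # Duplicates collapse, so several middle pages still count once.
    return DA_RULE_WEIGHTS[frozenset(page_types)]


print(rule_weight(["first", "middle", "middle", "last"]))
```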

For more information, see Document Assembler plugin.