Standard Document Separation

In standard document separation, an allocated extraction process analyzes the page layout and splits the document into parts. The splitting algorithm fires three different events that a script can use to modify the separation result.

Document_BeforeSeparatePages( _
    ByVal pXDoc As CscXDocument, _
    ByRef bSkip As Boolean _
    )

You can disable document separation altogether by setting the bSkip parameter to TRUE.
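
As a minimal sketch of this event, the handler below skips separation for input that has only one page. The single-page rule is a hypothetical example, and the use of pXDoc.CDoc.Pages.Count to read the page count is an assumption that is not stated in this section.

```vb
Private Sub Document_BeforeSeparatePages( _
    ByVal pXDoc As CASCADELib.CscXDocument, _
    ByRef bSkip As Boolean _
    )
    ' Hypothetical rule: a single page cannot be split any further,
    ' so skip document separation for it.
    ' Assumption: the Pages collection exposes a Count property.
    If pXDoc.CDoc.Pages.Count = 1 Then
        bSkip = True
    End If
End Sub
```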

If the separation process occurs before the first page in a document is classified, an additional event is possible:

Document_XDocPageRotated( _
    ByVal RotationBy As CASCADELib.CscAutoRotation, _
    ByVal pXDoc As CASCADELib.CscXDocument, _
    ByVal PageNr As Long, _
    ByVal Rotation As CASCADELib.CscXDocRotationTypeEnum, _
    ByRef bCancel As Boolean _
    )

If a document cannot be classified successfully, the page is rotated by 90° clockwise and classification is executed again. The page is rotated in 90° steps until the document is classified or the page returns to its original orientation. If classification succeeds for a rotation step, the rotation event is fired. If the script cancels this event by setting bCancel to TRUE, the remaining rotation directions are applied and classification is executed for the page again. This continues through all rotation directions for which the page is not classified or the rotation is canceled by script. For these events, the RotationBy parameter is set to CscAutoRotationByDocumentClassifier.

If the Document_XDocPageRotated event is executed in Project Builder, you cannot access the CscXFolder object. To ensure that an implementation does not terminate the application abnormally, check the script execution mode (Project.ScriptExecutionMode=CscScriptModeServerDesign) before you access the folder.
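
The execution mode check described above can be sketched as follows. The handler returns early when it runs in Project Builder. The folder-dependent logic is left as a placeholder comment, because what a script does with the folder is project-specific and not covered here.

```vb
Private Sub Document_XDocPageRotated( _
    ByVal RotationBy As CASCADELib.CscAutoRotation, _
    ByVal pXDoc As CASCADELib.CscXDocument, _
    ByVal PageNr As Long, _
    ByVal Rotation As CASCADELib.CscXDocRotationTypeEnum, _
    ByRef bCancel As Boolean _
    )
    ' In Project Builder the CscXFolder object is not available,
    ' so leave the handler before any folder access.
    If Project.ScriptExecutionMode = CscScriptModeServerDesign Then
        Exit Sub
    End If

    ' Placeholder: project-specific logic that needs the folder goes here.
End Sub
```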

If a page cannot be classified, the system checks whether content classification and recognition are required for that page. This may raise another XDocPageRotated event with the RotationBy parameter set to CscAutoRotationByOCR. This value reflects a rotation that is suggested by recognition. If this event is canceled, recognition is executed again without rotation.

Document_SeparateCurrentPage( _
    ByVal pXDoc As CASCADELib.CscXDocument, _
    ByVal PageNr As Long, _
    ByRef bSplitPage As Boolean, _
    ByRef RemainingPages As Long _
    )

This event is fired for each page in a document, and the page is identified by its PageNr. If a page is recognized as the first page of a document, the bSplitPage parameter is set to TRUE. Set the bSplitPage parameter to TRUE yourself to force a split at this position. To skip a defined number of pages for a document type that has a fixed number of pages, set the RemainingPages parameter to the number of pages that belong to the current first page. The event is then skipped for those subsequent pages.
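
The following sketch shows how RemainingPages might be used for a document type with a fixed length. The three-page length and the constant name FIXED_PAGE_COUNT are hypothetical. The sketch assumes that RemainingPages counts only the pages that follow the first page; adjust the value if your project counts the first page as well. The bSplitPage parameter is declared ByRef here because the text above describes setting it from script.

```vb
Private Sub Document_SeparateCurrentPage( _
    ByVal pXDoc As CASCADELib.CscXDocument, _
    ByVal PageNr As Long, _
    ByRef bSplitPage As Boolean, _
    ByRef RemainingPages As Long _
    )
    ' Hypothetical: every document of this type has exactly three pages.
    Const FIXED_PAGE_COUNT As Long = 3

    ' bSplitPage arrives as TRUE when the page was recognized as a first page.
    If bSplitPage Then
        ' Assumption: RemainingPages excludes the first page itself.
        RemainingPages = FIXED_PAGE_COUNT - 1
    End If
End Sub
```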

Document_AfterSeparatePages( _
    ByVal pXDoc As CscXDocument _
    )

This is the final event of the standard document separation for the given document pXDoc. At this point the split positions for the complete document are defined, but the document has not yet been split. You can still modify the pXDoc.CDoc.Pages(...).SplitPage property to change which pages are flagged before the document is split.
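
As a sketch of such a last-minute correction, the handler below clears the split flag on the second page so that the first two pages stay together. The two-page rule is hypothetical. The sketch also assumes that the Pages collection is zero-based and exposes a Count property, neither of which is stated in this section.

```vb
Private Sub Document_AfterSeparatePages( _
    ByVal pXDoc As CASCADELib.CscXDocument _
    )
    ' Hypothetical rule: the first two pages always belong together.
    ' Assumption: Pages is zero-based, so index 1 is the second page.
    If pXDoc.CDoc.Pages.Count > 1 Then
        pXDoc.CDoc.Pages(1).SplitPage = False
    End If
End Sub
```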

The sequence of the standard document separation events is shown in the following figure:

[Figure: the standard document separation events.]