Detect Empty Pages activity

This activity determines whether an image comes from a blank page. It examines all TIFF and JPEG media of the root document and of all sub-documents (recursively). The result is recorded in the Empty flag of the page and the medium. Many XBOUND activities can skip these blank pages when configured to do so.

For the experts: How the algorithm works

  1. The inner area of the image (which is defined by a configurable border zone) is divided into a matrix of rectangular zones.

  2. For each zone, the blackness (number of black pixels divided by the number of white pixels) is calculated.

  3. Two values are calculated from the zone blackness values (they have nothing to do with the image geometry):

    • X (blackness) – Average of all blackness values.

    • Y (variance) – Relative average absolute deviation of the blackness values from X. This is the sum of the absolute differences between each zone's blackness and X, divided by the number of zones and by X.

  4. The calculated pair (X, Y) is compared with a threshold line drawn between the points (0, Y0) and (X0, 0) in a Cartesian coordinate system, where X0 is the X threshold and Y0 is the Y threshold (see below). If the pair (X, Y) lies below this threshold line, the image is considered to be empty.

  5. Thus, the algorithm weighs the overall blackness against the distribution of the black pixels. For example, a uniformly blackened image whose blackness lies below X0 will be recognized as empty.
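The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the activity's actual implementation: the function names are hypothetical, the image is assumed to be a list of rows with 1 for black and 0 for white, the zone matrix is given as separate row and column counts, a zone without white pixels is guarded by dividing by at least 1, and "below the line" is read as strictly below.

```python
def zone_blackness(image, rows, cols, border=0):
    """Blackness (black pixels / white pixels) of each zone in the inner area."""
    height, width = len(image), len(image[0])
    inner_h, inner_w = height - 2 * border, width - 2 * border
    values = []
    for r in range(rows):
        y0 = border + r * inner_h // rows
        y1 = border + (r + 1) * inner_h // rows
        for c in range(cols):
            x0 = border + c * inner_w // cols
            x1 = border + (c + 1) * inner_w // cols
            black = sum(image[y][x] for y in range(y0, y1) for x in range(x0, x1))
            white = (y1 - y0) * (x1 - x0) - black
            # Assumption: avoid division by zero for a completely black zone.
            values.append(black / max(white, 1))
    return values


def blackness_stats(values):
    """Return (X, Y): mean blackness and relative average absolute deviation."""
    n = len(values)
    x = sum(values) / n
    if x == 0:
        # Assumption: a completely white image has no deviation.
        return 0.0, 0.0
    y = sum(abs(v - x) for v in values) / (n * x)
    return x, y


def is_empty(x, y, x0=0.5, y0=2.5):
    """True if (x, y) lies below the line through (0, y0) and (x0, 0)."""
    return x / x0 + y / y0 < 1.0


# Usage: a 4 x 4 image with a single black speck, split into 2 x 2 zones.
page = [[0] * 4 for _ in range(4)]
page[0][0] = 1
x, y = blackness_stats(zone_blackness(page, rows=2, cols=2))
print(is_empty(x, y))
```

The defaults for x0 and y0 are the recommended threshold values given under Available settings. Note how a uniformly gray page yields Y = 0, so only X is weighed against X0, as described in step 5.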

Tip: Use the Preview link to test the results of your settings.

Note: This activity must be assigned to an Activities Service in order to run.

If for some reason you need to add this activity to the Process Designer, the file to add is xboundActClassifyEmptyPages.dll.

Available settings

The following settings are available when configuring a process step of this activity type.

Properties tab

Number of zones

The number of zones into which the recognition algorithm divides the image. Specifying more zones can improve the result, but processing takes more time.

Maximum file size (kB)

If a medium is larger than the specified size, it is not analyzed.

Process TIFF images only

Checks only TIFF media.

Process back pages

Checks only back pages.

Set delete flag

When a page is classified as an empty page, it is canceled (it is excluded from further processing in XBOUND).

Binarize for processing

Specifies whether the image is to be binarized before processing.

You must select this setting when detecting empty pages in color or grayscale images such as JPEG. The binarization applies only in this step and does not affect any other process steps.

Document type for empty pages

Select a document type to assign to the medium if it is determined to be an empty page.

X threshold

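Threshold setting for detecting empty pages: the value X0 at which the threshold line meets the horizontal (X, blackness) axis. It does not refer to the horizontal direction of the image. The recommended value is 0.5.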

Y threshold

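Threshold setting for detecting empty pages: the value Y0 at which the threshold line meets the vertical (Y, variance) axis. It does not refer to the vertical direction of the image. The recommended value is 2.5.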
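As a worked illustration of how the two thresholds interact, the line through (0, Y0) and (X0, 0) can be written in intercept form. The sample pairs (X, Y) below are made up for the example, and "below the line" is assumed to mean strictly below.

```latex
\begin{align*}
\text{empty} \iff \frac{X}{X_0} + \frac{Y}{Y_0} &< 1,
  \qquad X_0 = 0.5,\quad Y_0 = 2.5 \\[4pt]
(X, Y) = (0.2,\ 1.0):\quad \frac{0.2}{0.5} + \frac{1.0}{2.5} &= 0.4 + 0.4 = 0.8 < 1
  \quad\Rightarrow\ \text{empty} \\
(X, Y) = (0.3,\ 1.5):\quad \frac{0.3}{0.5} + \frac{1.5}{2.5} &= 0.6 + 0.6 = 1.2 > 1
  \quad\Rightarrow\ \text{not empty}
\end{align*}
```

Raising either threshold moves the line away from the origin, so more images are classified as empty.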

Analyze transparent front pages

This function compares non-empty back pages with front pages at the pixel level so that any pixels that are showing through to the opposite side can be excluded. It is recommended only for standardized types of documents, when the paper is thin and somewhat transparent.

Note: This function does not perform a cohesive analysis of the image content. Therefore, it must be carefully calibrated using actual images to be processed, and then tested extensively, before use in production.

Note: The Preview function does not show the results of this option.

Distance (pixels)

This parameter specifies by how many pixels the front page image is to be "fattened up" before the algorithm is applied. To fatten up means that all of the black figures are made bigger by n pixels in all directions.

Scanners capture the front and back pages simultaneously. The front page inevitably shows through to the back, so the back page may no longer be classified as empty even when it is in fact empty. The algorithm therefore tries to anticipate the show-through, and this parameter controls how strongly the show-through is compensated.
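The idea of fattening up the front page and then discounting those pixels on the back page can be sketched as follows. This is a rough illustration, not the activity's actual code: the function names are hypothetical, images are assumed to be lists of rows with 1 for black and 0 for white, "n pixels in all directions" is interpreted as a square neighborhood, and the front page is assumed to appear horizontally mirrored when seen from the back.

```python
def fatten(image, n):
    """Enlarge every black pixel by n pixels in all directions (square dilation)."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if image[y][x]:
                for yy in range(max(0, y - n), min(height, y + n + 1)):
                    for xx in range(max(0, x - n), min(width, x + n + 1)):
                        out[yy][xx] = 1
    return out


def remove_show_through(back, front, n):
    """Clear back-page pixels that coincide with the fattened, mirrored front page."""
    # Assumption: the front page is mirrored left to right as seen from the back.
    mask = [row[::-1] for row in fatten(front, n)]
    return [
        [0 if m else b for b, m in zip(back_row, mask_row)]
        for back_row, mask_row in zip(back, mask)
    ]


# Usage: one black pixel on the front, two on the back (one of them show-through).
front = [[0] * 5 for _ in range(3)]
front[1][1] = 1
back = [[0] * 5 for _ in range(3)]
back[1][3] = 1  # lies where the mirrored front pixel shows through
back[1][0] = 1  # genuine content on the back page
cleaned = remove_show_through(back, front, n=1)
print(cleaned[1])
```

A larger distance removes more of the back page, which compensates stronger show-through but also risks discarding genuine content, which is why calibration on real images is advised above.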

Advanced tab

Border width

Excludes a margin of this width from recognition.

Border width units

The unit of the margin size.

These settings are also available:

Import button

Imports settings from an XML file that was previously created using Export.

Export button

Exports the settings to an XML file. Specify the file name and location. You can then import the XML file to get the same settings. See Exporting and importing process step settings.

Check regular expression link

Opens a test form, where you can (when applicable) check a regular expression.

Process Images activity

XBOUND activities: Overview

Process Designer plug-in