What's new

This topic contains information about the new features and enhancements included with Kofax OmniPage Capture SDK 22.0.0.

Sharper text with MRC compression Level 5

The CSDK Engine supports various MRC (Mixed Raster Content) compression levels. In CSDK version 22.0, the former MRC compression Level 5 is available as Level 4. Level 5 now uses a more advanced MRC compression algorithm that builds on a high-resolution (900 dpi) selector layer, resulting in much sharper text and characters with smoother contours.

You can use MRC compression with fast processing, without running OCR, and still get high-quality output. However, running OCR helps to achieve even better text/graphics separation.

Improved handwritten text recognition accuracy

This feature introduces an external layer that modifies the input and output of the RER engine, which is responsible for hand-printed zone recognition. In addition, you can now filter both the set of characters and the character shapes for each hand-printed zone, which improves recognition accuracy considerably.

The following new elements are available for filtering:

  • FILTER_PLUS_*: New filter bits are available to specify different sets of characters for each hand-printed zone.

  • kRecSetFilterPlusEx: New function for extended filter settings.

  • Kernel.OcrMgr.Handprint.FullCharacterSet: New setting for filtering the shape of handwritten characters. The engine can recognize many different forms of the same character, including rare outlines. However, the setting defaults to FALSE, which favors common handwriting styles for improved recognition accuracy.
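
To illustrate why restricting the character set per zone helps, here is a minimal sketch in plain Python. It does not call the CSDK: `ZONE_FILTERS`, `best_candidate`, and the candidate scores are hypothetical names and values made up for this example, and the real FILTER_PLUS_* bits and kRecSetFilterPlusEx signature are documented in the RecAPI help.

```python
import string
from typing import Iterable, Optional, Set, Tuple

# Hypothetical per-zone character sets (not the real FILTER_PLUS_* bits).
ZONE_FILTERS = {
    "digits": set(string.digits),
    "uppercase": set(string.ascii_uppercase),
}


def best_candidate(
    candidates: Iterable[Tuple[str, float]], allowed: Set[str]
) -> Optional[str]:
    """Return the highest-scoring candidate character that the filter permits.

    `candidates` holds (character, score) pairs, as a recognizer might
    produce for one hand-printed shape. Returns None if nothing is permitted.
    """
    permitted = [(ch, score) for ch, score in candidates if ch in allowed]
    if not permitted:
        return None
    return max(permitted, key=lambda pair: pair[1])[0]


# A hand-printed circle is ambiguous between the letter O and the digit 0.
shape_candidates = [("O", 0.60), ("0", 0.55)]
in_digit_zone = best_candidate(shape_candidates, ZONE_FILTERS["digits"])
in_letter_zone = best_candidate(shape_candidates, ZONE_FILTERS["uppercase"])
```

In a zone limited to digits, the ambiguous shape resolves to the digit even though the letter scored slightly higher, which is the kind of error that a per-zone filter can prevent.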

Identifying handwritten zones on forms

Form Template Editor now leverages the handprint detection module based on a Convolutional Neural Network (CNN). The module classifies each pixel of the image based on its adjacent pixels and decides whether the pixel belongs to machine-printed or hand-printed text.

A user running the Form Template Editor (FTE) can set automatic detection for zones. For this purpose, the CSDK API introduces the FM_AUTO_HAND field attribute in the set of filling methods. When the FTE user selects automatic handprint detection for a field, this attribute applies to the zone, so the kRecRecognize function runs the handprint detection module on the zone's area. If kRecRecognize classifies most of the black pixels as part of handwritten text, it treats the zone as hand-printed. Depending on this result, the zone's filling method changes to either FM_OMNIFONT or FM_HANDPRINT, and processing starts with the appropriate engine. For details, refer to the Choosing the filling method for user zones topic in the help.
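
The decision described above can be sketched as a simple majority vote over the zone's black pixels. This is an illustration only, not the CSDK implementation: `resolve_auto_hand` is a hypothetical helper, the string constants are placeholders for the real filling-method values, and reading "most" as a strict majority is an assumption.

```python
from typing import Sequence

# Names come from the text; the string values are placeholders,
# not the real CSDK constants.
FM_OMNIFONT = "FM_OMNIFONT"
FM_HANDPRINT = "FM_HANDPRINT"


def resolve_auto_hand(ink_is_handwritten: Sequence[bool]) -> str:
    """Pick a filling method from per-pixel labels of a zone's black pixels.

    Each entry is True when the (hypothetical) classifier labeled that
    black pixel as handwriting, and False when it labeled it machine print.
    """
    if not ink_is_handwritten:
        # Empty zone: falling back to machine print is an assumption.
        return FM_OMNIFONT
    handwritten = sum(1 for label in ink_is_handwritten if label)
    # "Most of the black pixels" is read here as a strict majority.
    if handwritten * 2 > len(ink_is_handwritten):
        return FM_HANDPRINT
    return FM_OMNIFONT


mostly_handwritten = resolve_auto_hand([True, True, True, False])
mostly_printed = resolve_auto_hand([False, False, True])
```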

For better recognition accuracy, consider filtering, which is also available for forms. See the Improved handwritten text recognition accuracy topic for details.

Added the Python API

CSDK now offers a Python interface layer on top of the native C API. The CSDK Python API resembles the native C API to minimize the learning curve, but it differs in the following ways:

  • Hides memory handling from the Python programmer.
  • Uses array-like parameters to provide convenient mapping for native Python containers.
  • Uses a NumPy-based image interface enabling image input and output from and to NumPy.

The Python API is similar to the Java API, but the following specifics apply when using it:

  • Output parameters: If a native CSDK API function modifies a parameter, the Python function returns the parameter values as a tuple.
  • Checking the validity of the high-level API layer: Python interprets the source code with no static semantic checking, so semantic errors appear at runtime only.
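
Both points can be illustrated with a small, self-contained sketch. Nothing here is part of the CSDK Python API: `get_page_size` is a hypothetical stand-in for a C-style function with pointer output parameters, and whether the real API also includes a return code in the tuple is an assumption.

```python
from typing import Dict, Tuple


def get_page_size(page: Dict[str, int]) -> Tuple[int, int, int]:
    """Hypothetical stand-in for a C function of the form
    `int GetPageSize(Page page, int *width, int *height)`.

    Instead of writing through pointers, the Python-style wrapper returns
    the output values as a tuple: (error_code, width, height).
    """
    if "width" not in page or "height" not in page:
        return (1, 0, 0)  # non-zero error code, placeholder outputs
    return (0, page["width"], page["height"])


# Tuple unpacking takes the place of passing output pointers.
rc, width, height = get_page_size({"width": 2480, "height": 3508})


def typo_is_found_only_at_runtime() -> bool:
    """A misspelled attribute is accepted by the interpreter until executed."""
    api = object()
    try:
        api.do_something_typo  # fails only when this line actually runs
    except AttributeError:
        return True
    return False
```

Because such errors surface only when the line executes, running the samples end to end on a test image is a practical way to catch them early.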

For details on how to use the CSDK Python API, refer to the RecAPI Python support topic in the RecAPI help.

Profile-based processing

In CSDK, the factors affecting the result of a workflow can be very complex: intent, source format, processing options, output format, and others are involved. Despite the high-quality samples provided with CSDK, novice users may have difficulty finding the right set of API calls and settings to get optimal results. Profile-based processing helps to overcome this issue with built-in Profiles that contain optimized settings tuned for common scenarios. Instead of applying individual settings for each step of the workflow, developers can select a profile as the first step, then fine-tune the settings and run the workflow.

Besides usual settings, Profiles can incorporate the following intents:

  • Indexing (with text)

  • Archiving (without text)

  • Unstructured text (sure text)

  • Tabulated data (all tables on separate sheet)

  • Data extraction (templated)

  • Format retention (true copy)

  • Content reuse (essay mode)

  • Editable copy (flowing page with headers and footers)

  • Barcode reading

  • Form creation (LFR)

Profiles can contain information on the input type and structure to allow more efficient optimization:

  • Source type

    • Scanned document

    • Camera document

    • Screen capture

    • PDF/XPS document

  • Relationship between pages

    • Mixed document with unrelated pages

    • Related pages (like books or contracts)

  • Content

    • General text

    • Templated form

    • Free form

    • Tables (spreadsheet mode)

  • Scripts

    • Printed text only

    • Possible others

      • Barcode

      • Handprint

Using Profiles not only accelerates development but can also optimize speed or quality. For example, if the intent is to make an image searchable, processing can be faster because there is no need to keep track of the logical layout, fonts, or pictures on the page. In addition, a faster OCR engine combination can be selected.
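
The "select a profile first, then fine-tune" pattern can be sketched as follows. This is a conceptual illustration in plain Python, not the CSDK profile mechanism: the profile names, the setting keys, and `build_settings` are all hypothetical.

```python
from typing import Any, Dict, Optional

# Hypothetical profile table; names and keys are illustrative only.
PROFILES: Dict[str, Dict[str, Any]] = {
    # Searchable image: layout and fonts are not needed, so favor speed.
    "indexing": {
        "keep_layout": False,
        "keep_fonts": False,
        "ocr_tradeoff": "fast",
    },
    # True copy: layout and fonts matter, so favor accuracy.
    "format_retention": {
        "keep_layout": True,
        "keep_fonts": True,
        "ocr_tradeoff": "accurate",
    },
}


def build_settings(
    profile: str, overrides: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Start from a profile's defaults, then apply individual overrides."""
    settings = dict(PROFILES[profile])  # copy so the profile stays unchanged
    settings.update(overrides or {})
    return settings


# Step 1: select the profile. Step 2: fine-tune a single setting.
searchable = build_settings("indexing", {"ocr_tradeoff": "balanced"})
```

Only the overridden key changes; every other setting keeps the value that the profile tuned for its scenario.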

See Built in scenarios for the list of built-in profiles and samples.

Automatic machine-printed, hand-printed, and barcode zone detection

Earlier versions of CSDK already supported the recognition of handwritten characters and barcodes in addition to machine-printed text. However, the API did not have a unified tool for zoning and detection. CSDK v22 introduces kRecLocateZonesEx, a new API function that automatically detects zones and sets the filling method according to the zone content. An extra bit field argument specifies the filling methods that may occur on the page, and the function automatically detects the areas (zones) written with those methods. The supported filling methods and their corresponding kRecLocateZonesEx bits are the following:

  • FM_OMNIFONT – LZX_OMNIFONT: machine print zones

  • FM_HANDPRINT – LZX_HANDPRINT: handprint zones

  • FM_BARCODE – LZX_BARCODE: 1D or 2D barcode zones

By setting the FM field for each detected zone, kRecLocateZonesEx designates the appropriate engine for recognition. Using kRecLocateZonesEx simplifies the development process and improves OCR accuracy by separating handwritten and machine-printed text automatically. For details, refer to the Automatic handwriting and zone detection topic in the help.
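
The bit field can be pictured as a set of flags combined with bitwise OR. The sketch below uses Python's standard `enum.IntFlag` to show the idea. The numeric values are placeholders rather than the real LZX_* values from the CSDK headers, and `allowed_filling_methods` is a hypothetical helper; the sketch does not call kRecLocateZonesEx.

```python
from enum import IntFlag
from typing import List


class LocateZonesFlags(IntFlag):
    # Placeholder bit values; the real ones are defined by the CSDK.
    LZX_OMNIFONT = 1
    LZX_HANDPRINT = 2
    LZX_BARCODE = 4


# Mapping from each bit to the filling method named in the list above.
FILLING_METHOD = {
    LocateZonesFlags.LZX_OMNIFONT: "FM_OMNIFONT",
    LocateZonesFlags.LZX_HANDPRINT: "FM_HANDPRINT",
    LocateZonesFlags.LZX_BARCODE: "FM_BARCODE",
}


def allowed_filling_methods(flags: LocateZonesFlags) -> List[str]:
    """List the filling methods that a given bit field permits."""
    return [name for bit, name in FILLING_METHOD.items() if flags & bit]


# A page expected to contain machine print and barcodes, but no handwriting.
page_flags = LocateZonesFlags.LZX_OMNIFONT | LocateZonesFlags.LZX_BARCODE
methods = allowed_filling_methods(page_flags)
```

Leaving a bit out tells the detector that the corresponding kind of content is not expected on the page, so no zone receives that filling method.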

Improved mixed Arabic-English OCR accuracy

CSDK now applies English recognition on Arabic zones to look for English text embedded in the Arabic. With this method, CSDK can effectively locate English text within the Arabic text in great detail: English areas may cover a section, a sentence, or even just a word. Using the appropriate recognition engine per language provides more accurate recognition results.

User folders for samples

Earlier versions of CSDK placed the samples under the Program Files or Program Files (x86) folder, so editing them required an Administrator account or elevated privileges. The CSDK v22 installer copies all samples to the ProgramData folder and installs symbolic links to the shared resources and binaries, so all users can edit these samples. It also creates a Start menu shortcut for this folder.

On the Start menu, click the shortcut according to your CSDK edition to open the folder:

  • 32-bit edition: Start > OmniPage Capture SDK 22 x86 > Open MySamples Folder x86

  • 64-bit edition: Start > OmniPage Capture SDK 22 x64 > Open MySamples Folder x64