What's new

OmniPage Licensing Agent runs on Linux

Kofax OmniPage Capture SDK for Linux now supports running OmniPage Licensing Agent (OPLA) in Linux environments. OPLA introduces Page Pack licenses and central license management. For all other license types, use the OPLicMgr tool. For details on OPLA licensing, refer to the "Managing licenses with OPLA" chapter in the Getting Started Guide.

Command line examples in the Getting Started Guide always show the OPLA command in capital letters. On Linux, enter the command in all lowercase letters:

opla
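If command lines are copied from the Getting Started Guide into scripts, a small helper can normalize the command name. The following Python sketch lowercases only the leading OPLA token when the target platform is Linux. The function name and the `<arguments>` placeholder are illustrative assumptions and are not part of the SDK.

```python
import sys


def normalize_guide_command(command_line: str, platform: str = sys.platform) -> str:
    """Lowercase the leading OPLA token when the target platform is Linux."""
    # Split off the command name; everything after the first space is kept as-is.
    head, sep, tail = command_line.partition(" ")
    if platform.startswith("linux") and head.upper() == "OPLA":
        head = head.lower()
    return head + sep + tail


# "<arguments>" is a placeholder, not a real OPLA option.
linux_command = normalize_guide_command("OPLA <arguments>", platform="linux")
```

Only the command name changes; the arguments are passed through untouched.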

PDF/A-4 format support

Capture SDK now supports the following PDF/A-4 formats:

PDF/A-4
PDF/A-4e
PDF/A-4f

Sharper text with MRC compression Level 5

The CSDK Engine supports various MRC (Mixed Raster Content) compression levels. In CSDK version 22.0, the former MRC compression Level 5 became available as Level 4. Level 5 now uses a more advanced MRC compression algorithm, which builds on a high-resolution (900 dpi) selector layer and results in much sharper text and characters with smoother contours.

You can use MRC compression for fast processing without running OCR and still get high-quality output. However, running OCR helps to achieve even better separation of text and graphics.

Identifying handwritten zones on forms

Form Template Editor now uses a handprint detection module based on a Convolutional Neural Network (CNN). The module classifies each pixel of the image based on its adjacent pixels and decides whether the pixel belongs to machine-printed or hand-printed text.

Though Form Template Editor (FTE) runs on Windows only, automatic form processing based on FTE form templates is available in applications developed with Kofax OmniPage Capture SDK for Linux.

FTE users can enable automatic handprint detection for zones. For this purpose, the CSDK API introduces the FM_AUTO_HAND field attribute in the set of filling methods. When the FTE user selects automatic handprint detection for a field, this attribute is applied to the zone, so the kRecRecognize function runs the handprint detection module on the zone's area. If kRecRecognize classifies most of the black pixels as part of handwritten text, it treats the zone as handwritten. Depending on this result, the zone's filling method is changed to either FM_OMNIFONT or FM_HANDPRINT, which starts processing with the appropriate engine. For details, refer to the Choosing the filling method for user zones topic in the help.
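The decision described above can be pictured with a short Python sketch. It is not the SDK implementation: the filling methods are represented by plain strings, the function name is hypothetical, and "most" is assumed to mean a strict majority of the zone's black pixels.

```python
# Stand-in names for the CSDK filling methods; plain strings, not the SDK's constants.
FM_AUTO_HAND = "FM_AUTO_HAND"
FM_OMNIFONT = "FM_OMNIFONT"
FM_HANDPRINT = "FM_HANDPRINT"


def resolve_filling_method(filling_method, pixel_is_handprint):
    """Resolve FM_AUTO_HAND from per-pixel labels of the zone's black pixels.

    pixel_is_handprint: iterable of booleans, True where the detector
    labelled a black pixel as hand-printed.
    """
    if filling_method != FM_AUTO_HAND:
        return filling_method  # explicit filling methods are left unchanged
    labels = list(pixel_is_handprint)
    handprint_count = sum(labels)
    # Assumption: "most" means a strict majority; the real threshold is not documented here.
    return FM_HANDPRINT if handprint_count * 2 > len(labels) else FM_OMNIFONT


resolved = resolve_filling_method(FM_AUTO_HAND, [True, True, False])
```

With two of three black pixels labelled as hand-printed, the sketch resolves the zone to FM_HANDPRINT.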

Objects API for .NET

Kofax OmniPage Capture SDK for Linux now offers the Objects API for .NET, including samples. Refer to the RecAPI .NET support topic in the RecAPI help for details on the supported interfaces and limitations.

Java API

The Java Native Interface (JNI) implementation of the RecAPI is added to Kofax OmniPage Capture SDK for Linux, including samples. Refer to the RecAPI Java support topic in the RecAPI help for details on initialization, types, resource handling, and limitations.

Python API

Kofax OmniPage Capture SDK for Linux now offers a Python interface layer on top of the native C API. The CSDK Python API closely resembles the native C API to minimize the learning curve, with the following differences:

Hides memory handling from the Python programmer.
Uses array-like parameters to provide convenient mapping for native Python containers.
Uses a NumPy-based image interface enabling image input and output from and to NumPy.

The Python API is similar to the Java API; however, the following considerations are specific to Python:

Output parameters: When a native CSDK API function modifies a parameter, the Python API returns the resulting values as a tuple.
Validity checking of the high-level API layer: Python interprets source code without static semantic checking, so semantic errors appear only at runtime.
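Both considerations can be illustrated without the SDK. In the sketch below, `get_image_size_stub` and `ApiStub` are hypothetical stand-ins and not CSDK names. The first part shows C-style output parameters returned as a tuple; the second shows that a misspelled call fails only when the line actually runs.

```python
def get_image_size_stub(image):
    """Stand-in for a wrapped call whose C form would be f(image, &width, &height).

    The two C output parameters become a returned tuple.
    """
    height = len(image)
    width = len(image[0]) if image else 0
    return width, height


class ApiStub:
    """Stand-in for an API module; it defines only get_image_size."""
    get_image_size = staticmethod(get_image_size_stub)


image = [[0, 1, 0], [1, 1, 1]]                  # 2 rows x 3 columns
width, height = ApiStub.get_image_size(image)   # unpack the returned tuple

# The misspelled name below is not caught before execution; it fails at runtime.
try:
    ApiStub.get_image_sizes(image)
    runtime_error = None
except AttributeError as exc:
    runtime_error = exc
```

Because such mistakes surface only at runtime, exercising every API call path in tests matters more than it does with the statically checked Java or C interfaces.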

For details on how to use the CSDK Python API, refer to the RecAPI Python support topic in the RecAPI help.

Profile-based processing

In CSDK, the factors that affect the result of a workflow can be very complex: intent, source format, processing options, output format, and others are involved. Despite the high-quality samples provided with CSDK, novice users may have difficulty finding the right set of API calls and settings to get optimal results. Profile-based processing addresses this issue with built-in Profiles, each containing optimized settings tuned for a common scenario. Instead of applying individual settings for each step of the workflow, developers can select a profile as the first step, then fine-tune the settings and run the workflow.

Besides the usual settings, Profiles can incorporate the following intents:

Indexing (with text)
Archiving (without text)
Unstructured text (sure text)
Tabulated data (all tables on separate sheet)
Data extraction (templated)
Format retention (true copy)
Content reuse (essay mode)
Editable copy (flowing page with headers and footers)
Barcode reading
Form creation (LFR)

Profiles can contain information on the input type and structure to allow more efficient optimization:

Source type
- Scanned document
- Camera document
- Screen capture
- PDF/XPS document
Relationship between pages
- Mixed document with unrelated pages
- Related pages (like books or contracts)
Content
- General text
- Templated form
- Free form
- Tables (spreadsheet mode)
Scripts
- Printed text only
- Possible others
  - Barcode
  - Handprint

Using Profiles not only accelerates development but can also optimize speed or quality. For example, if the intent is to make an image searchable, processing can be faster because there is no need to keep track of the logical layout, fonts, or pictures on the page. In addition, a faster OCR engine combination can be selected.

Refer to the Built in scenarios topic in the help for the list of built-in profiles and samples.
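The "select a profile first, then fine-tune" pattern can be sketched in plain Python. The profile names, setting names, and the `build_settings` helper below are hypothetical placeholders; the actual built-in profiles and their settings are listed in the help topic.

```python
# Hypothetical profile contents, for illustration only.
BUILT_IN_PROFILES = {
    "searchable_image": {"keep_layout": False, "keep_fonts": False, "engine": "fast"},
    "editable_copy": {"keep_layout": True, "keep_fonts": True, "engine": "accurate"},
}


def build_settings(profile_name, **overrides):
    """Start from a profile's tuned settings, then apply individual overrides."""
    settings = dict(BUILT_IN_PROFILES[profile_name])  # copy so the profile stays unchanged
    unknown = set(overrides) - set(settings)
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    settings.update(overrides)
    return settings


# Select the profile first, then fine-tune a single setting.
settings = build_settings("searchable_image", engine="accurate")
```

Starting from a copy of the profile keeps the tuned defaults intact, so the same profile can be reused for other workflows with different overrides.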

Automatic machine-printed, hand-printed, and barcode zone detection

Earlier versions of CSDK already supported the recognition of handwritten characters and barcodes in addition to machine-printed text. However, the API did not have a unified tool for zoning and detection. CSDK v22 introduces kRecLocateZonesEx, a new API function that automatically detects zones and sets the filling method according to the zone content. An extra bit field argument specifies the filling methods that may occur on the page, and the areas (zones) written in those ways are detected automatically. The supported filling methods and their corresponding kRecLocateZonesEx bits are the following:

FM_OMNIFONT – LZX_OMNIFONT: machine print zones
FM_HANDPRINT – LZX_HANDPRINT: handprint zones
FM_BARCODE – LZX_BARCODE: 1D or 2D barcode zones

By setting the FM field for each detected zone, kRecLocateZonesEx designates the appropriate engine for recognition. Using kRecLocateZonesEx simplifies development and improves OCR accuracy by separating handwritten and machine-printed text automatically. For details, refer to the Automatic handwriting and zone detection topic in the help.
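The bit field argument works like any set of combinable flags. The following Python sketch models it with the standard `enum.IntFlag` type. The numeric values are placeholders chosen for illustration, the `LocateZones` class and the helper function are hypothetical, and the real kRecLocateZonesEx signature is not shown here.

```python
from enum import IntFlag


class LocateZones(IntFlag):
    """Stand-ins for the kRecLocateZonesEx bits; the numeric values are placeholders."""
    LZX_OMNIFONT = 1
    LZX_HANDPRINT = 2
    LZX_BARCODE = 4


# Bit -> filling method assigned to a matching zone, following the list above.
FILLING_METHOD_FOR_BIT = {
    LocateZones.LZX_OMNIFONT: "FM_OMNIFONT",
    LocateZones.LZX_HANDPRINT: "FM_HANDPRINT",
    LocateZones.LZX_BARCODE: "FM_BARCODE",
}


def allowed_filling_methods(flags):
    """List the filling methods enabled by a combined bit field."""
    return [fm for bit, fm in FILLING_METHOD_FOR_BIT.items() if bit & flags]


# A page expected to contain machine print and barcodes, but no handprint.
flags = LocateZones.LZX_OMNIFONT | LocateZones.LZX_BARCODE
enabled = allowed_filling_methods(flags)
```

Leaving a bit out of the field, as with LZX_HANDPRINT here, means that zones of that kind are not searched for on the page.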

Improved mixed Arabic-English OCR accuracy

CSDK now applies English recognition to Arabic zones to look for English text embedded in the Arabic text. With this method, CSDK can locate English text within Arabic text at a fine granularity: an English area may cover a section, a sentence, or even a single word. Using the appropriate recognition engine for each language provides more accurate recognition results.

Multi-threading support for Asian OCR

The Asian OCR engine now supports multi-threading for better performance. On capable hardware, the improvement affects the following languages:

Simplified Chinese
Traditional Chinese
Japanese
Korean
Arabic

Customizable output page size

Earlier versions of CSDK calculated the output page size based on the input image dimensions. CSDK now allows customers to specify the page size for the output file (DOCX).