Tungsten Automation

Tungsten OmniPage Capture SDK 22.2.0.4

Readme Date: March 28, 2025 (Build Date: March 18, 2025)

© 2025 Tungsten Automation. All rights reserved.
Use is subject to license terms.

Introduction

Fix packs issued for OmniPage Capture SDK (CSDK) releases are cumulative updates.

Issues resolved in this fix pack

22.2.0.4 - FIXPACK4

ID  Support Case ID  Issue
2185304 N/A PDF/UA-2 and PDF/A-4f compliance issues reported by the veraPDF software tool.
2184903 27929012 LibJpeg, Zlib, and Libxml2 had security vulnerabilities.
2183627 27923045 API_GPFAULT_ERR occurred while recognizing with Arabic OCR-language setting (file-specific).
2183330 (27135864) Accuracy: issues with FM_MICR default filling method - page recognition.
2183239 (27910659) SPL_MEMORY_ERR error occurred during recognition when using User Dictionary.
2179539 (27837355) While processing certain extremely low quality images, kRecPreprocessImg() returned with IMG_SIZE_ERR error (file-specific).
2179281 (27904510) PDF to Excel conversion, all values in columns were merged.
2177205 27910308 Accuracy: uppercase O ('O') was recognized instead of zero ('0') on an image document with few text lines.
2177100 (27135224) Recognition of very poor-quality B/W images takes significantly more time than expected.
2176915 27906847 IMF_FILEFORMAT_ERR occurred while loading PDF file (file-specific).
2175930 (27263393) GPF occurred while recognizing the barcodes on the pages of a multi-page TIFF image file (file-specific).
2175506 27017286 A new setting was introduced to control hyphens/dashes in the DOCX while converting PDF to DOCX.
2175282 27888563 Missing comma in number.
2171502 N/A Regression: IMF_COMP_ERR error occurred while loading PDF (file-specific).
2171333 27450718 Redaction resulted in blurred image in output MRC PDF, which only partly hides the redacted text area.
2171029 27137451 Accuracy: S5 and o0 misrecognitions occurred.
2170740 27433219 When applying CSDK's text redact feature with the TO_BALANCED tradeoff setting, the output MRC PDF contained an extremely tall black rectangle.
2170355 (27260893) Barcode located outside a user-zone was recognized.
2170072 N/A Regression: IMG_RECT_ERR error occurred while converting the recognition result to DTXT_IOTPDF_MRC output.
2169512 (27347197) Failed to process Swiss QR (file-specific).
2165255 27149149 Footer text was converted as body text instead of being placed in the footer area.
2162828 26946453 Textboxes were created when converting PDF to DOCX.
2161309 27175624 IMF_FILEFORMAT_ERR was returned when loading PDF (file-specific).
2159824 27132853 After loading the attached PDF, a large black rectangle appeared in the middle of the image.
2159195 (26988007) Incorrect recognition of Arabic numbers.
2159110 (26870910) Enh.Req.: The Hebrew Shekel currency sign was not recognized.
2142388 (26751717) Regression: Greek invoice number was misrecognized: Uppercase O chars ('O') were recognized instead of Zeros ('0').
2141318 (26992947) Conversion to DOCX resulted in wrong borders.
2132130 (26970426) Some words in text-PDF were missing from the OCR result (file-specific).
2121685 26936458 DOCX output was incorrectly formatted.
2103379 26862605 Text on the right was not seen in the header of the Word output.
2101543 26886169 The IPRO calling sequence documented in IPRO.chm for instant workflows did not generate an output file.
1946872 (26684351) IBAN number was misrecognized (Uppercase character D is recognized as digit 0).
1820228 (26540723) When a specific PDF was converted into PPTX, some images were missing and other conversion issues occurred.
1731803 26426118 CA Pleading line numbers conversion.

NOTE for fix pack 22.2.0.4 - FIXPACK4

  1. For 2175506: The following setting has been introduced: "Kernel.OcrMgr.PostProc.HyphenMode". Possible values:
    • 0: Auto mode based on language.
    • 1: Prefer soft hyphen detection: The EOL hyphen is soft if the long word is a dictionary word.
    • 2: Prefer hard hyphen detection: The EOL hyphen is hard if both parts are dictionary words.
    • 3: Always return hard hyphen: The EOL hyphen is always converted to hard hyphen.

    By default, CSDK decides whether soft hyphens are preferred based on the current language. Hard hyphens are preferred for the following languages: English, Spanish, Portuguese, Italian, Greek, Turkish.
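
    The following Python sketch models the four HyphenMode values as described above. It is not CSDK code and does not call any CSDK API. The names HyphenMode, resolve_mode and is_hard_hyphen, the plain set used as a dictionary, and the language names as strings are all hypothetical. The fallback behavior (what happens in mode 1 when the joined word is not a dictionary word, and in mode 2 when one part is not a dictionary word) is an assumption, since the note does not state it.

```python
from enum import IntEnum


class HyphenMode(IntEnum):
    """Values of Kernel.OcrMgr.PostProc.HyphenMode as listed in the note."""
    AUTO = 0         # decide from the current language
    PREFER_SOFT = 1  # soft if the joined word is a dictionary word
    PREFER_HARD = 2  # hard if both parts are dictionary words
    ALWAYS_HARD = 3  # always hard


# Languages for which the note says hard hyphens are preferred by default.
HARD_HYPHEN_LANGUAGES = {
    "English", "Spanish", "Portuguese", "Italian", "Greek", "Turkish",
}


def resolve_mode(mode, language):
    """Map AUTO to a concrete preference for the given language."""
    mode = HyphenMode(mode)
    if mode == HyphenMode.AUTO:
        if language in HARD_HYPHEN_LANGUAGES:
            return HyphenMode.PREFER_HARD
        return HyphenMode.PREFER_SOFT
    return mode


def is_hard_hyphen(first, second, mode, language, dictionary):
    """Return True if the end-of-line hyphen in 'first-' / 'second' is hard."""
    mode = resolve_mode(mode, language)
    if mode == HyphenMode.ALWAYS_HARD:
        return True
    if mode == HyphenMode.PREFER_HARD:
        # Assumption: anything that is not two dictionary words is soft.
        return first in dictionary and second in dictionary
    # PREFER_SOFT. Assumption: hard when the joined word is unknown.
    return (first + second) not in dictionary
```

    For example, is_hard_hyphen("well", "known", 0, "English", {"well", "known"}) resolves the auto mode to "prefer hard" and treats the hyphen as hard, while the same call with mode 1 and a dictionary containing only "recognition" treats "recog-" / "nition" as a soft hyphen.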

Issues resolved in previous fix packs

22.2.0.3 - FIXPACK3

ID  Support Case ID  Issue
2158196 27132750 Black rectangles were appearing in the output image where there were hyperlinks on the input PDF page.
2157497 27136168 API_GPFAULT_ERR occurred during recognition (file-specific).
2153108 (27012340) False positive detection of PatchCode.
2150756 (27005392) Invoice number and date were extracted as a single word instead of two words.
2149219 N/A Regression: Single-digit value ('1') of a table cell on the input TIFF image was missing from the OCR output.
2148573 26989160 Regression: Arabic Numeral Degradation in 22.1 FP5.
2148528 (26943954) Quantity column values were extracted as different words.
2147508 (26940304) Last digit of the recognized number was missing from the OCR output when LZ_FREEFORM page description was specified.
2147494 (26767072) Reference number ID was missing from the OCR result when LZ_FREEFORM page description was specified.
2147236 (26982951) Values were combined with dates instead of separate words.
2147214 N/A Regression: Empty OCR result or no error.
2145713 26998838 Text color in output PDF was incorrect.
2145711 26861187 [LCP-28979] Problem with the footer in Word output.
2145675 26374495 [IBM Corp.] The Engine incorrectly marked inaccurate text as High Confidence.
2144897 (26684424) Accuracy: IDs with italic font on the input PDF were misrecognized ('1' vs 'J' and 'V' vs 'Y').
2144482 (27009050) Unreadable characters were appearing when PDF was loaded (file-specific).
2143797 (27001186) IMG_ACCESS_ERR occurred while recognizing Korean image (file-specific).
2143278 26983305 The color of characters was changed in searchable PDF MRC output when an Asian OCR language was set.
2142879 N/A Timeout occurred when C128 barcode extraction was enabled.
2142857 (26990700) Regression: PDF to PPTX conversion issue.
2142346 (27001186) API_GPFAULT_ERR occurred while recognizing Korean document (file-specific).
2141293 26970313 English text was recognized as Arabic.
2140678 26936444 [LCP-30406] Missed paragraph marks at the end of paragraphs with PROCESSING_NORMAL processing mode option.
2140537 N/A API_GPFAULT_ERR occurred during recognition (file-specific).
2138129 (26988007) The alignment of the Arabic texts was incorrect (occurs since switched to CSDK 22.2).
2137993 (26989815) IMG_SIZE_ERR occurred when loading the last page of a PDF document (file-specific).
2137965 (26992679) API_ERROR_ERR occurred when opening PDF file with the RecPDF library (file-specific).
2135862 (26986326) Extra characters were added in front of Currency.
2135550 26979027 Corrupt TIFF file was created when FF_TIFJPGNEW format was saved while using the Stream API.
2130058 (26938264) Some Chinese text was not recognized correctly.
2130405 26936246 [LCP-30406] All text was bold with PROCESSING_MODE.PROCESSING_NORMAL option.
2129468 (26957905) Accuracy: Important number on document surrounded by a rectangle was misrecognized.
2107061 (26897041) Incorrect numerical values were recognized from a standard PDF page (file-specific).
1859787 (26596995) The kRecRecognize function returned RER_INTERNAL_ERR when processing specific images.
1837282 N/A Invisible, truncated characters located in the PDF's table cells were appearing in the OCR output.
1814379 N/A PDF's invisible text appeared in the recognition result (clipping).

NOTE for fix pack 22.2.0.3 - FIXPACK3

  1. For 2141293: Processing pages with this kind of complex mix of Arabic and English text is beyond the current technology limits of CSDK. We suggest switching on Single Language Detection, which produces better English output on the attached image. We recognize that this is not fully satisfactory, because no Arabic text is recognized with this setting, but it is our best suggestion.

22.2.0.2 - FIXPACK2

ID  Support Case ID  Issue
2136736 N/A File lock was not released by CSDK.
2136637 N/A Some pages of an unhealthy PDF took an extremely long time to process (file-specific).
2136527 N/A Regression: Recognition server session ended with application crash when using CSDK for a long time.
2136055 (26969374) Accuracy: Space was missing in the OCR result from multiline table-cell.
2134259 (26698778) API_GPFAULT_ERR occurred with Arabic+Korean.
2134253 (26936471) [LCP-30406] Soft line breaks instead of paragraph marks.
2132820 N/A Accuracy: OmniPage returned the invoice number twice.
2132819 (26926900) [LCP-30316] Incorrect conversion of table and background to DOCX format.
2131932 (26932151) GPF occurred in the MOR.dll during recognition (file-specific).
2131930 (26870935) Accuracy: 0-O misrecognition occurred with image-only input PDFs.
2131928 (26921469) Accuracy: Reference number on bank document was misrecognized ('1' vs 'l').
2131922 (26870935) Accuracy: Hebrew misrecognitions occurred with textual input PDFs.
2131921 (26806594) Regression: API_HARDTIMEOUT_ERR occurred during recognition with LZ_FREEFORM (file-specific).
2131919 (26950837) Regression: the RecConvert2Doc function returned L_ERROR_CONVERTER if the CSDK setting Converters.Text.DocX.SplitMaxPages was set to a value other than its default (0).
2131901 (26882191) [LCP-29048] Conversion of the footer to DOCX format resulted in unstructured layout.
2131897 (26944840) [LCP-31271] Corrupted output after conversion of PDF to Word format.
2131676 (26612250) [CRL-4762] An API_GPFAULT_ERR occurred while recognizing with combined OCR language setting LANG_ENG + LANG_ARA + LANG_KRN.
2131674 (26612250) [CRL-4762] API_GPFAULT_ERR occurred during the kRecLocateZones() call with Arabic+Korean language setting.
2131669 (26867720) Accuracy: 0-O misrecognitions occurred.
2131659 (26958665) Spaces were missing between words in the recognized output.
2130078 (26938264) Email type URLs were recognized with mixed font type.
2130077 N/A Some texts did not fit into their boxes.
2130075 N/A Some redundant symbols showed up in cells.
2130074 N/A Some cell values were placed into the wrong column.
2130072 N/A Superscript for Chinese text was not recognized.
2130069 N/A Content of some tables was not completely recognized.
2130063 N/A Chart recognition issue: some pictograms were not recognized as images.
2130053 N/A Some bigger sized numbers in the titles were not recognized.
2130045 N/A Some images around the header/footer area were not always preserved.
2129718 (26869135) Accuracy: a character was missing from the recognized FTE field.
2128831 N/A Accuracy: OmniPage misrecognition problems.
2128825 (26966206) Intelligent Mail (USPS) and Postnet barcodes were misrecognized.
2128513 N/A Improvement in quality of conversion with Traditional Chinese PDF documents.
2126877 (26888911) Workflow crashed during recognition when using training data.
2126867 (26822385) Accuracy: Slash '/' characters were misrecognized as character 'I' or '1'.
2126863 N/A Invalid LETTER left position of 65535 for QR-code appeared in the OCR result (file-specific).
2126242 (26890479) Accuracy: miscellaneous misrecognitions ('I'-'1' and 'Q'-'O') occurred.
2125379 N/A GPF occurred during recognition when applying combined Asian + other OCR-language.
2123734 26963771 OCR Zone border settings for machine printed zones.
2065648 26803319 Regression: File named with Chinese characters could not be opened with CSDK 22.
1814351 (26539185) Thai language OmniPage OCR results differed drastically from the results of FineReader.

Applies to

You can apply this fix pack to update the following Tungsten OmniPage CSDK for Windows version:

Install this fix pack

Use the following procedure to install this fix pack.

  1. Verify that the following applications and services are not running:
  2. Back up the contents of the Bin folder of your Capture SDK 22.2 installation (to the backup-copy folder).
  3. Depending on your existing Capture SDK installation, select either the file TungstenOmniPageCaptureSDK-22.2.0.4_forWindows_32-bit.zip or TungstenOmniPageCaptureSDK-22.2.0.4_forWindows_64-bit.zip.
    Unzip the fix pack file to a temporary location.
  4. Copy the files and subfolders located in the fix pack's folder (CSDK_BIN32 or CSDK_BIN64) into the Bin folder of your current OmniPage Capture SDK installation.
    IMPORTANT:
    • Remember to copy all files together with their subfolders (recursive copy).
  5. If you use the standalone version of Document Classifier and there is an Engine subfolder under the folder DocumentClassifier, refresh the files in the Engine subfolder from the CSDK_BIN32 or CSDK_BIN64 folder of this fix pack.
  6. Similarly, if you use the standalone version of Form Template Editor and there is an Engine subfolder under the folder FormTemplateEditor20, refresh the files in the Engine subfolder from the folder CSDK_BIN32 or CSDK_BIN64 of this fix pack.
  7. If you have runtime deployment, remember to update that as well:
    In the Distribution File Set (generated earlier), replace the original files with the new ones from the fix pack.
    Remember to add new CSDK files introduced since the version 22.2 release, if there are any in any of the fix packs.
    Note: As a best practice, after applying the updates to the developer installation, generate the Distribution File Set again, using the Distribution Wizard.
  8. If you are using OmniPage Licensing Agent (OPLA), use its updated version from the OPLA subfolder of this fix pack.
    The OPLA subfolder contains all the files necessary for OPLA.
  9. Restart any applications and services you stopped before installing the fix pack.
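
The backup in step 2 and the recursive copy in step 4 can be scripted. The following Python sketch (Python 3.8 or later) shows one way to do it with the standard library only. The function name apply_fix_pack and all folder arguments are hypothetical; the script does not stop or restart applications and services, and it does not cover the Document Classifier, Form Template Editor, runtime deployment, or OPLA steps.

```python
import shutil
from pathlib import Path


def apply_fix_pack(bin_dir, fix_pack_bin_dir, backup_dir):
    """Back up the Bin folder, then copy the fix pack files over it.

    bin_dir:          Bin folder of the existing Capture SDK installation
    fix_pack_bin_dir: unzipped CSDK_BIN32 or CSDK_BIN64 folder of the fix pack
    backup_dir:       backup-copy folder; must not exist yet
    """
    bin_dir = Path(bin_dir)
    fix_pack_bin_dir = Path(fix_pack_bin_dir)
    backup_dir = Path(backup_dir)

    # Step 2: back up the current Bin folder. copytree raises
    # FileExistsError if backup_dir already exists, so an earlier
    # backup is never overwritten silently.
    shutil.copytree(bin_dir, backup_dir)

    # Step 4: recursive copy of all files and subfolders from the fix
    # pack into Bin, overwriting files that already exist there.
    shutil.copytree(fix_pack_bin_dir, bin_dir, dirs_exist_ok=True)
```

Run such a script only after confirming that the applications and services using CSDK are stopped, and with sufficient permissions to write to the installation folder.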

Remove this fix pack

Use the following procedure to remove this fix pack.

  1. Verify that the following applications and services are not running:
  2. Copy the files from the backup-copy folder to the Bin folder of your OmniPage Capture SDK 22.2 installation.
  3. Restart any applications and services that were stopped prior to removing the fix pack.
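
Step 2 of the removal procedure can be sketched in Python (3.8 or later) as follows. The function name remove_fix_pack and its arguments are hypothetical. The sketch follows the procedure literally: it copies the backed-up files over the Bin folder and does not delete files that exist only in the fix pack.

```python
import shutil
from pathlib import Path


def remove_fix_pack(bin_dir, backup_dir):
    """Restore the Bin folder from the backup-copy folder."""
    bin_dir = Path(bin_dir)
    backup_dir = Path(backup_dir)

    # Refuse to continue if there is nothing to restore from.
    if not backup_dir.is_dir():
        raise FileNotFoundError(f"Backup folder not found: {backup_dir}")

    # Recursive copy of the backed-up files over the current Bin folder.
    # Files present only in Bin (for example, new files added by the
    # fix pack) are left in place.
    shutil.copytree(backup_dir, bin_dir, dirs_exist_ok=True)
```

As with installation, stop the applications and services that use CSDK before running it.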

Files included

This fix pack includes a vast number of files. This document does not detail file names and versions.