Tungsten OmniPage Capture SDK 22.2.0.4

Readme Date: March 28, 2025 (Build Date: March 18, 2025)

Introduction

Fix packs issued for OmniPage Capture SDK (CSDK) releases are cumulative updates.

Issues resolved in this fix pack

22.2.0.4 - FIXPACK4

ID	Support Case ID	Issue
2185304	N/A	PDF/UA-2 and PDF/A-4f compliance issues reported by the veraPDF software tool.
2184903	27929012	LibJpeg, Zlib, and Libxml2 had security vulnerabilities.
2183627	27923045	API_GPFAULT_ERR occurred while recognizing with Arabic OCR-language setting (file-specific).
2183330	(27135864)	Accuracy: issues with FM_MICR default filling method - page recognition.
2183239	(27910659)	SPL_MEMORY_ERR error occurred during recognition when using User Dictionary.
2179539	(27837355)	While processing certain extremely low quality images, kRecPreprocessImg() returned with IMG_SIZE_ERR error (file-specific).
2179281	(27904510)	PDF to Excel conversion, all values in columns were merged.
2177205	27910308	Accuracy: uppercase O ('O') was recognized instead of zero ('0') on an image document with few text lines.
2177100	(27135224)	Recognition of very poor-quality B/W images takes significantly more time than expected.
2176915	27906847	IMF_FILEFORMAT_ERR occurred while loading PDF file (file-specific).
2175930	(27263393)	GPF occurred while recognizing the barcodes on the pages of a multi-page TIFF image file (file-specific).
2175506	27017286	A new setting was introduced to control hyphens/dashes in the DOCX while converting PDF to DOCX.
2175282	27888563	Missing comma in number.
2171502	N/A	Regression: IMF_COMP_ERR error occurred while loading PDF (file-specific).
2171333	27450718	Redaction resulted in a blurred image in the output MRC PDF, which only partly hid the redacted text area.
2171029	27137451	Accuracy: S5 and o0 misrecognitions occurred.
2170740	27433219	When applying CSDK's text redact feature with the TO_BALANCED tradeoff setting, the output MRC PDF contained an enormously tall black rectangle.
2170355	(27260893)	Barcode located outside a user-zone was recognized.
2170072	N/A	Regression: IMG_RECT_ERR error occurred while converting the recognition result to DTXT_IOTPDF_MRC output.
2169512	(27347197)	Failed to process Swiss QR (file-specific).
2165255	27149149	Footer text was converted as body text instead of being placed in the footer area.
2162828	26946453	Textboxes were created when converting PDF to DOCX.
2161309	27175624	IMF_FILEFORMAT_ERR was returned when loading PDF (file-specific).
2159824	27132853	After loading the attached PDF, a large black rectangle appeared in the middle of the image.
2159195	(26988007)	Incorrect recognition of Arabic numbers.
2159110	(26870910)	Enh.Req.: The Hebrew Shekel currency sign was not recognized.
2142388	(26751717)	Regression: Greek invoice number was misrecognized: Uppercase O chars ('O') were recognized instead of Zeros ('0').
2141318	(26992947)	Conversion to DOCX resulted in wrong borders.
2132130	(26970426)	Some words in text-PDF were missing from the OCR result (file-specific).
2121685	26936458	DOCX output was incorrectly formatted.
2103379	26862605	Text on the right was not seen in the header of the Word output.
2101543	26886169	The IPRO calling sequence documented in IPRO.chm for instant workflows did not generate an output file.
1946872	(26684351)	IBAN number was misrecognized (Uppercase character D is recognized as digit 0).
1820228	(26540723)	When a specific PDF was converted into PPTX, some images were missing and other conversion issues occurred.
1731803	26426118	CA Pleading line numbers conversion.

NOTE for fix pack 22.2.0.4 - FIXPACK4

For 2175506: The following setting has been introduced: "Kernel.OcrMgr.PostProc.HyphenMode". Possible values:
-- 0: Auto mode based on language.
-- 1: Prefer soft hyphen detection: The EOL hyphen is soft if the long word is a dictionary word.
-- 2: Prefer hard hyphen detection: The EOL hyphen is hard if both parts are dictionary words.
-- 3: Always return hard hyphen: The EOL hyphen is always converted to hard hyphen.

By default, CSDK decides whether soft hyphens are preferred based on the current language. Hard hyphens are preferred for the following languages: English, Spanish, Portuguese, Italian, Greek, Turkish.
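The four HyphenMode values can be read as a small decision rule. The sketch below is an illustrative Python model of that rule only; it is not CSDK code and calls no CSDK API. The names HyphenMode, HARD_PREFERRED_LANGUAGES, and is_hard_hyphen are hypothetical. Two points are assumptions not stated in this readme: that Auto mode resolves to mode 2 for the listed languages and to mode 1 otherwise, and that a hyphen takes the opposite type when the stated dictionary condition is not met.

```python
from enum import IntEnum


class HyphenMode(IntEnum):
    """Values of Kernel.OcrMgr.PostProc.HyphenMode as listed in this readme."""
    AUTO = 0
    PREFER_SOFT = 1
    PREFER_HARD = 2
    ALWAYS_HARD = 3


# Languages for which this readme says hard hyphens are preferred by default.
HARD_PREFERRED_LANGUAGES = {
    "English", "Spanish", "Portuguese", "Italian", "Greek", "Turkish",
}


def is_hard_hyphen(mode, left, right, language, dictionary):
    """Return True if the end-of-line hyphen between left and right is hard.

    Illustrative model only; the real engine logic is not documented here.
    """
    if mode == HyphenMode.AUTO:
        # Assumption: Auto picks mode 2 for the listed languages, else mode 1.
        if language in HARD_PREFERRED_LANGUAGES:
            mode = HyphenMode.PREFER_HARD
        else:
            mode = HyphenMode.PREFER_SOFT

    if mode == HyphenMode.ALWAYS_HARD:
        return True
    if mode == HyphenMode.PREFER_SOFT:
        # Soft if the joined (long) word is a dictionary word.
        # Assumption: otherwise the hyphen is treated as hard.
        return (left + right) not in dictionary
    # PREFER_HARD: hard if both parts are dictionary words.
    # Assumption: otherwise the hyphen is treated as soft.
    return left in dictionary and right in dictionary
```

For example, with a toy dictionary containing "well", "known", and "recognition", the model treats "well-" + "known" as a hard hyphen in mode 2 and "recog-" + "nition" as a soft hyphen in mode 1.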

Issues resolved in previous fix packs

22.2.0.3 - FIXPACK3

ID	Support Case ID	Issue
2158196	27132750	Black rectangles were appearing in the output image where there were hyperlinks on the input PDF page.
2157497	27136168	API_GPFAULT_ERR occurred during recognition (file-specific).
2153108	(27012340)	False positive detection of PatchCode.
2150756	(27005392)	Invoice number and date were extracted as a single word instead of two words.
2149219	N/A	Regression: Single-digit value ('1') of a table cell on the input TIFF image was missing from the OCR output.
2148573	26989160	Regression: Arabic Numeral Degradation in 22.1 FP5.
2148528	(26943954)	Quantity column values were extracted as different words.
2147508	(26940304)	Last digit of the recognized number was missing from the OCR output when LZ_FREEFORM page description was specified.
2147494	(26767072)	Reference number ID was missing from the OCR result when LZ_FREEFORM page description was specified.
2147236	(26982951)	Values were combined with dates instead of separate words.
2147214	N/A	Regression: Empty OCR result or no error.
2145713	26998838	Text color in output PDF was incorrect.
2145711	26861187	[LCP-28979] Problem with the footer in Word output.
2145675	26374495	[IBM Corp.] The Engine incorrectly marked inaccurate text as High Confidence.
2144897	(26684424)	Accuracy: IDs with italic font on the input PDF were misrecognized ('1' vs 'J' and 'V' vs 'Y').
2144482	(27009050)	Unreadable characters were appearing when PDF was loaded (file-specific).
2143797	(27001186)	IMG_ACCESS_ERR occurred while recognizing Korean image (file-specific).
2143278	26983305	The color of characters was changed in searchable PDF MRC output when an Asian OCR language was set.
2142879	N/A	Timeout occurred when C128 barcode extraction was enabled.
2142857	(26990700)	Regression: PDF to PPTX conversion issue.
2142346	(27001186)	API_GPFAULT_ERR occurred while recognizing Korean document (file-specific).
2141293	26970313	English text was recognized as Arabic.
2140678	26936444	[LCP-30406] Missed paragraph marks at the end of paragraphs with PROCESSING_NORMAL processing mode option.
2140537	N/A	API_GPFAULT_ERR occurred during recognition (file-specific).
2138129	(26988007)	The alignment of Arabic text was incorrect (observed since switching to CSDK 22.2).
2137993	(26989815)	IMG_SIZE_ERR occurred when loading the last page of a PDF document (file-specific).
2137965	(26992679)	API_ERROR_ERR occurred when opening PDF file with the RecPDF library (file-specific).
2135862	(26986326)	Extra characters were added in front of Currency.
2135550	26979027	Corrupt TIFF file was created when FF_TIFJPGNEW format was saved while using the Stream API.
2130058	(26938264)	Some Chinese text was not recognized correctly.
2130405	26936246	[LCP-30406] All text was bold with PROCESSING_MODE.PROCESSING_NORMAL option.
2129468	(26957905)	Accuracy: Important number on document surrounded by a rectangle was misrecognized.
2107061	(26897041)	Incorrect numerical values were recognized from a standard PDF page (file-specific).
1859787	(26596995)	The kRecRecognize function returned RER_INTERNAL_ERR when processing specific images.
1837282	N/A	Invisible, truncated characters located in the PDF's table cells appeared in the OCR output.
1814379	N/A	PDF's invisible text appeared in the recognition result (clipping).

NOTE for fix pack 22.2.0.3 - FIXPACK3

For 2141293: Processing mixed Arabic and English pages of this complexity is beyond the technology limit of CSDK. We suggest switching on Single Language Detection, which produces better English output on the attached image. We know that this is not satisfactory for the user because no Arabic text is recognized, but this is our best suggestion.

22.2.0.2 - FIXPACK2

ID	Support Case ID	Issue
2136736	N/A	File lock was not released by CSDK.
2136637	N/A	Some pages of an unhealthy PDF took an extremely long time to process (file-specific).
2136527	N/A	Regression: Recognition server session ended with application crash when using CSDK for a long time.
2136055	(26969374)	Accuracy: Space was missing in the OCR result from multiline table-cell.
2134259	(26698778)	API_GPFAULT_ERR occurred with Arabic+Korean.
2134253	(26936471)	[LCP-30406] Soft line breaks instead of paragraph marks.
2132820	N/A	Accuracy: OmniPage returned the invoice number twice.
2132819	(26926900)	[LCP-30316] Incorrect conversion of table and background to DOCX format.
2131932	(26932151)	GPF occurred in the MOR.dll during recognition (file-specific).
2131930	(26870935)	Accuracy: 0-O misrecognition occurred with image-only input PDFs.
2131928	(26921469)	Accuracy: Reference number on bank document was misrecognized ('1' vs 'l').
2131922	(26870935)	Accuracy: Hebrew misrecognitions occurred with textual input PDFs.
2131921	(26806594)	Regression: API_HARDTIMEOUT_ERR occurred during recognition with LZ_FREEFORM (file-specific).
2131919	(26950837)	Regression: the RecConvert2Doc function returned L_ERROR_CONVERTER if the CSDK setting Converters.Text.DocX.SplitMaxPages was set to a value other than its default (0).
2131901	(26882191)	[LCP-29048] Conversion of the footer to DOCX format resulted in unstructured layout.
2131897	(26944840)	[LCP-31271] Corrupted output after conversion of PDF to Word format.
2131676	(26612250)	[CRL-4762] An API_GPFAULT_ERR occurred while recognizing with combined OCR language setting LANG_ENG + LANG_ARA + LANG_KRN.
2131674	(26612250)	[CRL-4762] API_GPFAULT_ERR occurred during the kRecLocateZones() call with Arabic+Korean language setting.
2131669	(26867720)	Accuracy: 0-O misrecognitions occurred.
2131659	(26958665)	Spaces were missing between words in the recognized output.
2130078	(26938264)	Email type URLs were recognized with mixed font type.
2130077	N/A	Some texts did not fit into their boxes.
2130075	N/A	Some redundant symbols showed up in cells.
2130074	N/A	Some cell values were placed into the wrong column.
2130072	N/A	Superscript for Chinese text was not recognized.
2130069	N/A	Content of some tables was not completely recognized.
2130063	N/A	Chart recognition issue: some pictograms were not recognized as images.
2130053	N/A	Some bigger sized numbers in the titles were not recognized.
2130045	N/A	Some images around the header/footer area were not always preserved.
2129718	(26869135)	Accuracy: a character was missing from the recognized FTE field.
2128831	N/A	Accuracy: OmniPage misrecognition problems.
2128825	(26966206)	Intelligent Mail (USPS) and Postnet barcodes were misrecognized.
2128513	N/A	Improvement in quality of conversion with Traditional Chinese PDF documents.
2126877	(26888911)	Workflow crashed during recognition when using training data.
2126867	(26822385)	Accuracy: Slash '/' characters were misrecognized as character 'I' or '1'.
2126863	N/A	Invalid LETTER left position of 65535 for QR-code appeared in the OCR result (file-specific).
2126242	(26890479)	Accuracy: miscellaneous misrecognitions ('I'-'1' and 'Q'-'O') occurred.
2125379	N/A	GPF occurred during recognition when applying combined Asian + other OCR-language.
2123734	26963771	OCR Zone border settings for machine printed zones.
2065648	26803319	Regression: File named with Chinese characters could not be opened with CSDK 22.
1814351	(26539185)	Thai language OmniPage OCR results differed drastically from the results of FineReader.

Applies to

You can apply this fix pack to update the following Tungsten OmniPage CSDK for Windows versions:

OmniPage Capture SDK for Windows 22.2
OmniPage Capture SDK for Windows 22.2.0.2
OmniPage Capture SDK for Windows 22.2.0.3

Install this fix pack

Use the following procedure to install this fix pack.

Verify that the following applications and services are not running:

Executables belonging to OmniPage Capture SDK 22.2
Applications integrating the engine of OmniPage Capture SDK 22.2
Antivirus software

Back up the contents of the Bin folder of your Capture SDK 22.2 installation (to the backup-copy folder).
Depending on your existing Capture SDK installation, select either the file TungstenOmniPageCaptureSDK-22.2.0.4_forWindows_32-bit.zip or TungstenOmniPageCaptureSDK-22.2.0.4_forWindows_64-bit.zip.
Unzip the fix pack file to a temporary location.
Copy the files and subfolders located in the fix pack's folder (CSDK_BIN32 or CSDK_BIN64) into the Bin folder of your current OmniPage Capture SDK installation.
IMPORTANT: Copy all files together with their subfolders (recursive copy).
If you use the standalone version of Document Classifier and there is an Engine subfolder under the folder DocumentClassifier, refresh the files in the Engine subfolder from the CSDK_BIN32 or CSDK_BIN64 folder of this fix pack.
Similarly, if you use the standalone version of Form Template Editor and there is an Engine subfolder under the folder FormTemplateEditor20, refresh the files in the Engine subfolder from the folder CSDK_BIN32 or CSDK_BIN64 of this fix pack.
If you have runtime deployment, remember to update that as well:
In the Distribution File Set (generated earlier), replace the original files with the new ones from the fix pack.
Remember to add new CSDK files introduced since the version 22.2 release, if there are any in any of the fix packs.
Note: As a best practice, after applying the updates to the developer installation, generate the Distribution File Set again, using the Distribution Wizard.
If you are using OmniPage Licensing Agent (OPLA), use its updated version from the OPLA subfolder of this fix pack.
The OPLA subfolder contains all the files necessary for OPLA.
Restart any applications and services you stopped before installing the fix pack.
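The backup and recursive copy steps of the procedure above can be sketched in Python as follows. This is a minimal sketch, not a Tungsten-provided script: the function name apply_fix_pack and all folder arguments are hypothetical, and it assumes the fix pack has already been unzipped and that all related applications and services are stopped. It covers only the Bin folder, not the Document Classifier, Form Template Editor, runtime deployment, or OPLA steps.

```python
import shutil
from pathlib import Path


def apply_fix_pack(bin_dir, fix_pack_bin_dir, backup_dir):
    """Back up the Bin folder, then copy the fix pack files over it.

    bin_dir:          Bin folder of the existing installation (hypothetical path)
    fix_pack_bin_dir: unzipped CSDK_BIN32 or CSDK_BIN64 folder of the fix pack
    backup_dir:       backup-copy folder; must not exist yet
    """
    bin_dir = Path(bin_dir)
    fix_pack_bin_dir = Path(fix_pack_bin_dir)
    backup_dir = Path(backup_dir)

    if not bin_dir.is_dir() or not fix_pack_bin_dir.is_dir():
        raise FileNotFoundError("Bin folder or fix pack folder not found")
    if backup_dir.exists():
        # Refuse to overwrite an earlier backup.
        raise FileExistsError(f"Backup folder already exists: {backup_dir}")

    # Back up the whole Bin folder, including subfolders.
    shutil.copytree(bin_dir, backup_dir)

    # Recursive copy: files and subfolders from the fix pack overwrite
    # same-named files in Bin; other files in Bin are left in place.
    # dirs_exist_ok requires Python 3.8 or later.
    shutil.copytree(fix_pack_bin_dir, bin_dir, dirs_exist_ok=True)
    return backup_dir
```

A call would pass the three folders of your own environment, for example apply_fix_pack(your_bin_folder, your_unzipped_CSDK_BIN64_folder, your_backup_folder). Refusing to reuse an existing backup folder is a design choice of this sketch, so that a second run cannot replace the original files in the backup with already-patched ones.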

Remove this fix pack

Use the following procedure to remove this fix pack.

Verify that the following applications and services are not running:

Executables belonging to OmniPage Capture SDK 22.2
Applications integrating the engine of OmniPage Capture SDK 22.2
Antivirus software

Copy the files from the backup-copy folder to the Bin folder of your OmniPage Capture SDK 22.2 installation.
Restart any applications and services that were stopped prior to removing the fix pack.
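The restore step above can be sketched in Python in the same spirit. This is a minimal sketch under stated assumptions: the function name restore_bin_from_backup and the folder arguments are hypothetical, and the backup-copy folder is assumed to hold the Bin contents saved before installation. Because the documented step is a copy, files that exist only in the fix pack stay in the Bin folder; the sketch reports them rather than deleting them.

```python
import shutil
from pathlib import Path


def restore_bin_from_backup(backup_dir, bin_dir):
    """Copy the backed-up files over the Bin folder.

    Returns the relative paths of files that exist in Bin but not in the
    backup; the copy leaves those files untouched.
    """
    backup_dir = Path(backup_dir)
    bin_dir = Path(bin_dir)
    if not backup_dir.is_dir():
        raise FileNotFoundError(f"Backup folder not found: {backup_dir}")

    # Recursive copy; same-named files in Bin are overwritten.
    # dirs_exist_ok requires Python 3.8 or later.
    shutil.copytree(backup_dir, bin_dir, dirs_exist_ok=True)

    # List files the backup does not contain, for manual review.
    leftovers = [
        path.relative_to(bin_dir).as_posix()
        for path in bin_dir.rglob("*")
        if path.is_file() and not (backup_dir / path.relative_to(bin_dir)).exists()
    ]
    return sorted(leftovers)
```

Whether leftover files need any action is not covered by this readme, so the sketch only lists them.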

Files included

This fix pack includes a vast number of files. This document does not detail file names and versions.