RecAPI
Layout Retention Output. RecAPIPlus level of CSDK is supported on: Windows, Linux, Embedded Linux, MacOS.
Classes | |
struct | _OUTPUTCONVERTERINFO |
Output document converter information (ANSI) | |
struct | OUTPUTCONVERTERINFOW |
Output document converter information (Unicode) | |
Macros | |
#define | OUTPUTCONVERTERINFO WORA(OUTPUTCONVERTERINFO) |
Output document converter information. | |
Typedefs | |
typedef struct _OUTPUTCONVERTERINFO | OUTPUTCONVERTERINFOA |
Output document converter information (ANSI) | |
Functions | |
RECERR RECAPIPLS | RecSetOutputFormat (int sid, LPCTSTR pFormatname) |
Set the output format. | |
RECERR RECAPIPLS | RecGetOutputFormat (int sid, LPTSTR pFormatname, int len) |
Get the output format. | |
RECERR RECAPIPLS | RecGetFirstOutputFormat (LPTSTR pFormatname, int len) |
Start the enumeration of the output formats. | |
RECERR RECAPIPLS | RecGetNextOutputFormat (LPTSTR pFormatname, int len) |
Continue the enumeration of the output formats. | |
RECERR RECAPIPLS | RecGetOutputFormatInfo (LPCTSTR pFormatName, OUTPUTCONVERTERINFO *pInfo) |
Get information about the specified output document format converter. | |
RECERR RECAPIPLS | RecGetOutputSettingsHandle (int sid, HSETTING *hSetting) |
Gets the settings handle for the currently set output format. | |
RECERR RECAPIPLS | RecSetOutputLevel (int sid, OUTPUTLEVEL outLevel) |
Set the level of format retention for the final output document. | |
RECERR RECAPIPLS | RecGetOutputLevel (int sid, OUTPUTLEVEL *poutLevel) |
Get the current level of format retention for the final output document. | |
RecAPIPlus provides accurate, layout-retaining output in several file formats, such as RTF, DOC, WordML, XLS, PDF, WP and WAV. The RecConvert2Doc and RecProcessPagesEx functions export the given document into these output file formats. See the details about support of different output formats on different platforms.
In many cases the goal is to retain the original layout in the output document as far as possible. The converters differ in their ability to retain the layout. There are 5 output levels (OUTPUTLEVEL), each corresponding to a different degree of layout retention, and not every converter supports every output level. For example, a Word document or a PDF document supports the Flowing Page and True Page modes, which stay very close to the original layout, whereas simple text converters can retain only the bare text in Plain Text mode (formerly No Format mode) or the text with its attributes in Formatted Text mode (formerly Retain Font and Paragraphs mode).
Besides the output modes, converters have many settings that can influence the layout (see the list of converter settings). You can access these settings through the Settings Manager Module.
When pages are scanned from a document with uniform margins, the page images typically do not place the body text in precisely the same position on each page, because of scanning variations. Previously, users had to restore uniform margins manually after the recognition result was exported. This toolkit examines incoming pages and, if it determines that they have a similar text area and layout, performs page consolidation automatically. The program calculates ideal margins, then identifies a vector for each page describing the difference between the actual and ideal margins. These vectors are then applied during the output process to the following file types: RTF, WordML, PDF, DOCX and XPS. The consolidation itself is fully automatic and cannot be influenced. However, the user can decide whether a converter applies these vectors by using the ConsolidatePages setting of the given converter.
#define OUTPUTCONVERTERINFO WORA(OUTPUTCONVERTERINFO) |
Output document converter information.
On Windows this type can be used as OUTPUTCONVERTERINFOA or OUTPUTCONVERTERINFOW, depending on the _UNICODE macro. On Linux and MacOS this is equivalent to OUTPUTCONVERTERINFOA.
typedef struct _OUTPUTCONVERTERINFO OUTPUTCONVERTERINFOA |
Output document converter information (ANSI)
This structure describes the converter module and its target format. This is used by the RecGetOutputFormatInfo function.
enum DocFormatter_Mode |
Document formatter methods.
These are the possible values of the setting Formatter.df.mode.
enum OUTPUTLEVEL |
Output level of the exported document.
Pre-defined levels of format retention for the final output document. The property values belonging to each level are documented in the RecSetOutputLevel function. See also the table of the output levels supported by each converter.
enum R2_HEADERS_RETENTION |
HeadersFooters.
You can set how headers and footers should be handled. You can set it for every converter, but the default value differs between converters. For more information, see the setting HeadersFooters in the summary table of converter settings.
enum R2_PAGEBREAKS |
PageBreaks.
For several converters you can set how you want page breaks to be handled. For more information, see the setting PageBreaks in the summary table of converter settings.
enum R2_PICTURES_BPP |
Picture color.
For several converters, you can set the color of the image. For more information, see the setting PictureColor in the summary table of converter settings.
enum R2_PICTURES_DPI |
Pictures.
For every converter you can set how you would like to handle images. The default values are different for the different converters. For more information, see the setting Pictures in the summary table of converter settings.
enum R2_TABLES_RETENTION |
Tables.
For every converter, except the Excel and Html converters, you can set how you would like to handle tables. For more information, see the setting Tables in the summary table of converter settings.
enum TColorQValues |
Color quality.
For the PDF converters you can set the color quality. The default is R2ID_PDFCOLORQUALITY_MIN for every PDF converter. For more information, see the setting ColorQuality in the summary table of converter settings.
enum TMRCTypeValues |
MRC use.
For PDF converters you can set the MRC type. The default is R2ID_PDFMRC_NO for every PDF converter. For more information, see the setting UseMRC in the summary table of converter settings. The newer ones can be used with 5 different predefined levels (1-5) by calling kRecSetCompressionLevel. For more details, see the MRC image compression level comparison subsection of the Imaging Module in the Tungsten Omnipage Capture SDK User's Guide, and About MRC Level in Saving MRC PDF files in KernelAPI.
Compatibility.
For the PDF converters you can set this compatibility value. For more information, see the setting Compatibility in the summary table of converter settings.
Display mode.
For any PDF converter you can set this mode value, which specifies how the PDF file should be displayed when opened.
Page layer.
For any PDF converter you can set this value, which specifies the PDF page layout when opened.
enum TPDFSecurityValues |
PDFSecurity type.
For PDF converters you can set the security type. For more information, see the setting PDFSecurity.Type in the summary table of converter settings.
enum TSignatureTypevalues |
Signature type.
For the PDF converters you can set the signature type. The default is: R2ID_SIGTYPENONE for every PDF converter. For more information, see the setting Signature.SignatureType in the summary table of converter settings.
enum TWriteIndex |
Index Page.
You can switch on the Index Page generation in simple or 'InFrame' mode using HTML output converters. If it is switched on, an index page is generated with links to the recognized and converted pages. In this case, you can change the text of the navigation links by changing NavNextText, NavPrevText or NavTOCText. For more information, see the setting IndexPage in the summary table of converter settings.
RECERR RECAPIPLS RecGetFirstOutputFormat | ( | LPTSTR | pFormatname, |
int | len ) |
Start the enumeration of the output formats.
This starts the enumeration of the document output formats in the current thread.
[out] | pFormatname | Buffer containing the converter name. |
[in] | len | Length of the buffer. |
RECERR |
RECERR RECAPIPLS RecGetNextOutputFormat | ( | LPTSTR | pFormatname, |
int | len ) |
Continue the enumeration of the output formats.
This continues the enumeration of the document output formats in the current thread.
[out] | pFormatname | Buffer containing the converter name. |
[in] | len | Length of the buffer. |
RECERR |
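The enumeration pattern described by RecGetFirstOutputFormat and RecGetNextOutputFormat can be sketched as below. This is only an illustration: the stand-in section replaces the SDK so that the sketch compiles on its own, the type definitions assume an ANSI build, MOCK_END_OF_LIST and "Mock.Format.B" are hypothetical, and the loop assumes that any return value other than REC_OK ends the enumeration. With the real CSDK, the stand-in section would be removed in favor of the SDK header.

```c
#include <assert.h>
#include <stdio.h>

/* --- Stand-ins so the sketch compiles without the SDK (assumptions). --- */
typedef int RECERR;             /* stand-in for the SDK error type */
typedef char *LPTSTR;           /* ANSI build assumed */
#define REC_OK 0
#define MOCK_END_OF_LIST 1      /* hypothetical "no more items" code */

static const char *mock_formats[] = { "Converters.Text.DocX", "Mock.Format.B" };
#define MOCK_COUNT ((int)(sizeof mock_formats / sizeof mock_formats[0]))
static int mock_pos;

static RECERR mock_copy(LPTSTR buf, int len) {
    assert(buf != NULL && len > 0);
    if (mock_pos >= MOCK_COUNT) return MOCK_END_OF_LIST;
    snprintf(buf, (size_t)len, "%s", mock_formats[mock_pos++]);
    return REC_OK;
}
static RECERR RecGetFirstOutputFormat(LPTSTR pFormatname, int len) {
    mock_pos = 0;               /* restart the enumeration */
    return mock_copy(pFormatname, len);
}
static RECERR RecGetNextOutputFormat(LPTSTR pFormatname, int len) {
    return mock_copy(pFormatname, len);
}
/* --- End of stand-ins. --- */

/* Prints every converter name and returns how many were listed. */
int list_output_formats(void) {
    char name[256];
    int count = 0;
    RECERR rc = RecGetFirstOutputFormat(name, (int)sizeof name);
    while (rc == REC_OK) {
        printf("%s\n", name);
        count++;
        rc = RecGetNextOutputFormat(name, (int)sizeof name);
    }
    return count;
}
```

Because the enumeration state belongs to the current thread, the First/Next calls in this sketch are kept together in one function rather than spread across threads.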
RECERR RECAPIPLS RecGetOutputFormat | ( | int | sid, |
LPTSTR | pFormatname, |
int | len ) |
Get the output format.
This queries the output document format used by the RecConvert2Doc and RecProcessPagesEx functions.
[in] | sid | Settings Collection ID. |
[out] | pFormatname | Buffer containing the converter name. |
[in] | len | Length of the buffer. |
RECERR |
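A minimal sketch of selecting a converter with RecSetOutputFormat and reading the selection back with RecGetOutputFormat follows. The stand-in section is an assumption that replaces the SDK so the sketch compiles on its own (ANSI build assumed), the helper select_output_format is hypothetical, and the sketch assumes that the name is read back exactly as it was set.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* --- Stand-ins so the sketch compiles without the SDK (assumptions). --- */
typedef int RECERR;             /* stand-in for the SDK error type */
typedef char *LPTSTR;           /* ANSI build assumed */
typedef const char *LPCTSTR;
#define REC_OK 0

static char mock_format[64] = "";

static RECERR RecSetOutputFormat(int sid, LPCTSTR pFormatname) {
    (void)sid;
    snprintf(mock_format, sizeof mock_format, "%s", pFormatname);
    return REC_OK;
}
static RECERR RecGetOutputFormat(int sid, LPTSTR pFormatname, int len) {
    (void)sid;
    assert(pFormatname != NULL && len > 0);
    snprintf(pFormatname, (size_t)len, "%s", mock_format);
    return REC_OK;
}
/* --- End of stand-ins. --- */

/* Hypothetical helper: sets the converter for the settings collection,
   then reads it back. Returns 1 if the read-back name matches, else 0. */
int select_output_format(int sid, const char *wanted) {
    char current[256];
    if (RecSetOutputFormat(sid, wanted) != REC_OK) return 0;
    if (RecGetOutputFormat(sid, current, (int)sizeof current) != REC_OK) return 0;
    return strcmp(current, wanted) == 0;
}
```

A call such as select_output_format(0, "Converters.Text.DocX") uses 0 only as a placeholder Settings Collection ID.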
RECERR RECAPIPLS RecGetOutputFormatInfo | ( | LPCTSTR | pFormatName, |
OUTPUTCONVERTERINFO * | pInfo ) |
Get information about the specified output document format converter.
[in] | pFormatName | The name of the output conversion format. |
[out] | pInfo | Pointer to an OUTPUTCONVERTERINFO variable. |
RECERR |
RECERR RECAPIPLS RecGetOutputLevel | ( | int | sid, |
OUTPUTLEVEL * | poutLevel ) |
Get the current level of format retention for the final output document.
[in] | sid | Settings Collection ID. |
[out] | poutLevel | Pointer to output level variable. |
RECERR |
RECERR RECAPIPLS RecGetOutputSettingsHandle | ( | int | sid, |
HSETTING * | hSetting ) |
Gets the settings handle for the currently set output format.
[in] | sid | Settings Collection ID. |
[out] | hSetting | Pointer to the setting handle |
RECERR |
RECERR RECAPIPLS RecSetOutputFormat | ( | int | sid, |
LPCTSTR | pFormatname ) |
Set the output format.
It sets the output document format for the RecConvert2Doc and RecProcessPagesEx functions.
[in] | sid | Settings Collection ID. |
[in] | pFormatname | Converter name. |
RECERR |
Converters.Text.DocX). For more information, see the list of converter settings.
RecSetOutputFormat). Thus, before this action, the mentioned settings cannot be accessed.
RECERR RECAPIPLS RecSetOutputLevel | ( | int | sid, |
OUTPUTLEVEL | outLevel ) |
Set the level of format retention for the final output document.
This function simplifies specifying the formatting details of the output document.
[in] | sid | Settings Collection ID. |
[in] | outLevel | The output level. |
RECERR |
OutputLevel is OL_AUTO.
*.rtf and *.docx and *.pptx file extensions. These formats try to retain the original page size and layout. If you try to save this, you will get an error message and the file will not be saved.
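The RecGetOutputLevel and RecSetOutputLevel pair can be combined as in the sketch below, which changes the level only when it differs from the requested one. The stand-in section is an assumption that replaces the SDK so the sketch compiles on its own. Only OL_AUTO is named in the text above; OL_MOCK_OTHER, the numeric enum values and the helper ensure_output_level are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* --- Stand-ins so the sketch compiles without the SDK (assumptions). --- */
typedef int RECERR;             /* stand-in for the SDK error type */
#define REC_OK 0
/* OL_MOCK_OTHER is a placeholder for any non-automatic level. */
typedef enum { OL_AUTO, OL_MOCK_OTHER } OUTPUTLEVEL;

static OUTPUTLEVEL mock_level = OL_AUTO;

static RECERR RecSetOutputLevel(int sid, OUTPUTLEVEL outLevel) {
    (void)sid;
    mock_level = outLevel;
    return REC_OK;
}
static RECERR RecGetOutputLevel(int sid, OUTPUTLEVEL *poutLevel) {
    (void)sid;
    assert(poutLevel != NULL);
    *poutLevel = mock_level;
    return REC_OK;
}
/* --- End of stand-ins. --- */

/* Hypothetical helper: queries the current level and sets the wanted
   level only if it differs. Returns the first non-OK code, or REC_OK. */
RECERR ensure_output_level(int sid, OUTPUTLEVEL wanted) {
    OUTPUTLEVEL current;
    RECERR rc = RecGetOutputLevel(sid, &current);
    if (rc != REC_OK) return rc;
    if (current == wanted) return REC_OK;
    return RecSetOutputLevel(sid, wanted);
}
```

Checking the return code of each call, as done here, keeps a failed query from being followed by a blind set.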