Conversion format properties

Each conversion format has its own set of properties, and some properties are shared across formats. These properties appear in the properties bar when a conversion format is selected. The following tables describe all properties, grouped by conversion format and ordered as they appear in the client.

Table 1. Common conversion properties
Property Description
Splitting

OPS splits the result document according to the property value selected.

  • EmptyPage

  • FileSize

  • None (default)

  • PageCount

  • PerInputDocument

  • PerPage

Note: Some formats do not offer all of the above options for the Splitting property.

Splitting value

Defines the splitting limit, in pages when Splitting is set to PageCount or in megabytes when Splitting is set to FileSize.

Ignore headers and footers

If this option is selected, OPS does not include headers and footers in the output document.
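
The interaction between Splitting and Splitting value can be illustrated with a minimal Python sketch. It is not OPS code: the function names are hypothetical, and the FileSize variant assumes that the size of an output file is roughly the sum of per-page sizes, which may not match how OPS measures it.

```python
from typing import List, Sequence


def split_by_page_count(pages: Sequence[str], limit: int) -> List[List[str]]:
    """Group pages into consecutive chunks of at most `limit` pages.

    Mirrors Splitting = PageCount with Splitting value = limit.
    A limit of 1 behaves like Splitting = PerPage.
    """
    if limit < 1:
        raise ValueError("limit must be a positive page count")
    return [list(pages[i:i + limit]) for i in range(0, len(pages), limit)]


def split_by_file_size(page_sizes_mb: Sequence[float], limit_mb: float) -> List[List[int]]:
    """Greedily pack page indexes so each chunk stays within `limit_mb`.

    Mirrors Splitting = FileSize with Splitting value = limit_mb.
    Assumption: a single page larger than the limit gets its own chunk.
    """
    if limit_mb <= 0:
        raise ValueError("limit_mb must be positive")
    chunks: List[List[int]] = []
    current: List[int] = []
    current_size = 0.0
    for index, size in enumerate(page_sizes_mb):
        # Start a new chunk when adding this page would exceed the limit.
        if current and current_size + size > limit_mb:
            chunks.append(current)
            current, current_size = [], 0.0
        current.append(index)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


print(split_by_page_count(["p1", "p2", "p3", "p4", "p5"], 2))
print(split_by_file_size([1.0, 1.5, 0.5, 3.0], 2.5))
```

With a page-count limit of 2, five pages fall into three result documents, and the last one holds the single remaining page.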

Table 2. Conversion properties – Pdf format
Property Description
Format

The PDF subformat for the output files.

  • Image: Results in image only PDF files.

  • Merge: This conversion accepts a single input file only, which must be a PDF. The recognition result is merged into the input file, turning it into a searchable PDF.

  • Normal (default): Results in normal PDF files.

  • Searchable: Results in searchable PDF files.

Image quality

This setting determines the quality of images in the result. Better quality also means larger files.

  • Max: The best quality.

  • Medium (default): Balanced; good image quality with an acceptable file size.

  • Min: Optimized for file size.

Compatibility

The compatibility level of the PDF output file.

  • Pdf13

  • Pdf14

  • Pdf15

  • Pdf16 (default)

  • Pdf17

  • PdfA1a

  • PdfA1b

  • PdfA2b

  • PdfA2u

  • PdfA2a

  • PdfA3b

  • PdfA3u

  • PdfA3a

Image color

Specifies the color model for the images in the result document.

  • BlackAndWhite

  • Color

  • Grayscale

  • Original (default)

MRC level

Determines the MRC (Mixed Raster Content) compression level of the result PDF document. Select No to disable MRC compression.

  • Good

  • Min

  • No (default)

  • Superb

Apply original image

If this option is selected, OPS uses the original image in the output PDF document where possible.

Linearized

If this option is selected, OPS optimizes the PDF file for efficient web display by reordering the file contents. As a result, the first page of the PDF loads quickly into a web page, and then the remaining pages follow.
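
The input restriction described for the Merge subformat in the Format property above can be expressed as a small pre-flight check. The following Python sketch is illustrative only: the function name is hypothetical, it is not part of any OPS API, and it assumes that a PDF input can be recognized by its .pdf file extension.

```python
from pathlib import Path
from typing import Sequence

# Subformat names as listed for the Format property of the Pdf format.
PDF_SUBFORMATS = {"Image", "Merge", "Normal", "Searchable"}


def check_pdf_format_inputs(pdf_format: str, input_files: Sequence[str]) -> None:
    """Raise ValueError if the inputs do not fit the chosen PDF subformat."""
    if pdf_format not in PDF_SUBFORMATS:
        raise ValueError(f"unknown PDF subformat: {pdf_format!r}")
    if pdf_format == "Merge":
        # Merge accepts exactly one input file ...
        if len(input_files) != 1:
            raise ValueError("Merge requires exactly one input file")
        # ... and that file must be a PDF (assumption: detected by extension).
        if Path(input_files[0]).suffix.lower() != ".pdf":
            raise ValueError("Merge requires a PDF input file")


check_pdf_format_inputs("Merge", ["scan.pdf"])                   # accepted
check_pdf_format_inputs("Normal", ["page1.tif", "page2.tif"])    # accepted
```

Running such a check before submitting a job surfaces the Merge restriction early instead of at conversion time.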

Table 3. Conversion properties – Office format
Property Description
Format

The format of the Office output files.

  • DocX (default)

  • Pptx

  • Rtf

  • Wordml

  • Xls

  • Xlsx

Output mode

This property determines the formatting level and the layout of the output.

  • Essay: Size and positioning are not the main focus in Essay mode. Separately formatted text blocks share common properties, such as font and font size. This lets the end user format the text easily by changing the font and paragraph attributes of large blocks of text that have the same style. Style updates affect all text that belongs to the same style, even if the text is in separate blocks. Moreover, removing headers and footers allows users to reformat the whole text more easily and present it as an HTML page.

  • FlowingPage: This setting keeps the original layout of the pages, including columns. This is done wherever possible with column and indent settings, not with text boxes or frames. Text will then flow from one column to the other, which does not happen when text boxes are used.

  • FormattedText: This setting exports decolumnized text with font and paragraph styling, along with graphics and tables.

    When saving to Xls or Xlsx, each detected table or spreadsheet in the output document is saved to a separate worksheet. The last worksheet is reserved for the remaining content and functions as an index page. The tables are represented by hyperlinks to their own worksheet.

  • PlainText: This setting exports plain decolumnized left-aligned text with uniform font face and size.

  • Spreadsheet: This setting arranges recognition results in tabular form, as recommended for use in spreadsheet applications. Pages are placed in separate worksheets.

  • TruePage (default): This setting follows the original layout of the pages in the output document, including columns and text boxes.

Image color

Specifies the color model for the images in the result document.

  • BlackAndWhite

  • Color

  • Grayscale

  • Original (default)

Table 4. Conversion properties – Text format
Property Description
Format

The format of the Text output files.

  • Text (default)

  • Xml

Code page

You can select from a wide range of code pages, including Unicode and Utf8. Auto is selected by default.

Output mode

This property determines the layout of the output.

Page separator

This string separates the pages. The default value is '\f' (form feed).

Zone separator

This string separates the zones. The default value is an empty string.
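
The effect of the Page separator and Zone separator properties can be sketched in Python as follows. This is an illustration under assumptions, not OPS behavior: the function name is hypothetical, and the sketch assumes the separators are placed between items rather than after each one.

```python
from typing import List


def join_text_output(
    pages: List[List[str]],
    page_separator: str = "\f",   # default from the table: form feed
    zone_separator: str = "",     # default from the table: empty string
) -> str:
    """Join recognized zone texts into one string.

    `pages` is a list of pages, and each page is a list of zone texts.
    """
    return page_separator.join(zone_separator.join(zones) for zones in pages)


recognized = [["Title", "Body text"], ["Second page"]]

# Defaults: zones run together, pages are divided by a form feed.
print(repr(join_text_output(recognized)))

# Custom separators make the page and zone boundaries visible.
print(join_text_output(recognized, page_separator="\n---\n", zone_separator="\n"))
```

With the default values, zone boundaries are not visible in the output, so a non-empty Zone separator is useful when later processing needs to tell zones apart.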

Table 5. Conversion properties – Image format
Property Description
Format

The image file format of the output files.

  • Bmp

  • Gif

  • Jpg

  • Png

  • Tiff (default)

Image quality

This setting determines the quality of images in the result. Better quality also means larger files.

  • Max: The best quality.

  • Medium (default): Balanced; good image quality with an acceptable file size.

  • Min: Optimized for file size.

Table 6. Conversion properties – HTML format
Property Description
Image quality

This setting determines the quality of images in the result. Better quality also means larger files.

  • Max: The best quality.

  • Medium (default): Balanced; good image quality with an acceptable file size.

  • Min: Optimized for file size.

Output mode

This property determines the layout of the output.

Index page

If this is selected, OPS generates an index page with links to the recognized and converted pages.

Table 7. Conversion properties – eBook format
Property Description
Format

The format of the eBook output file.

  • Epub (default)

  • Kindle

Image quality

This setting determines the quality of images in the result. Better quality also means larger files.

  • Max: The best quality.

  • Medium (default): Balanced; good image quality with an acceptable file size.

  • Min: Optimized for file size.

Output mode

This property determines the layout of the output.

  • FormattedText

  • PlainText

  • Poem: This layout is specialized for poems so that they display correctly on eBook readers.

  • Simple (default): Regular eBook layout.

Image color

Specifies the color model for the images in the result document.

  • BlackAndWhite

  • Color

  • Grayscale (default)

  • Original

Table 8. Conversion properties – Classification format
Property Description
Format

The format of the result file.

  • Json (default)

  • Xml