The PDF step helps you extract content from a PDF document and sign documents using the SignDoc functionality.
The PDF extract feature is not supported on CentOS/Red Hat Enterprise Linux 7.x operating systems.
The Recorder View shows a single page of the PDF document tree and the extracted text. The robot can navigate through the document using the Next Page, Previous Page and Goto Page actions available on the Application Action menu. The menu is available when you right-click the application tab in the Recorder View.
Text extraction results depend on the internal data and structure of the PDF document. The text is split according to the formatting in the PDF document and the underlying accessibility data, and it might include text that lies outside the page boundaries or is hidden by overlapping elements. If the required accessibility data is missing, which is most common in older PDF documents, you might need to use the Extract Text From Image step to extract the text using OCR.
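The fallback described above can be pictured as a simple decision: use the embedded text when there is enough of it, otherwise switch to OCR. The following is a minimal sketch of that logic only, not the product's implementation. The function name `extract_page_text`, its two callables, and the `min_chars` threshold are all hypothetical stand-ins for the regular text extraction and the Extract Text From Image step.

```python
from typing import Callable


def extract_page_text(
    extract_text: Callable[[], str],
    extract_text_via_ocr: Callable[[], str],
    min_chars: int = 1,
) -> str:
    """Return the embedded text when it is usable, otherwise fall back to OCR.

    Both callables are hypothetical placeholders: the first stands for the
    regular text extraction, the second for an OCR-based extraction.
    """
    text = extract_text().strip()
    if len(text) >= min_chars:
        return text
    # Too little embedded text: assume the accessibility data is missing
    # and use OCR instead.
    return extract_text_via_ocr().strip()
```

The OCR callable is invoked only when the embedded text is too short, which keeps the slower path off pages that already carry usable text.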
The Extract text application action and the Extract text component action can be used to extract structured text from a specific area of the page.
Properties
- Action: Select Open to load a PDF file.
- Document Source:
  - Local File: Specify the path to the file in the local file system in the File path field.
  - Robot File System: Specify the path to the file in the robot file system in the File path field.
  - Binary: Specify a variable or expression containing a PDF document in binary form.
- Password: Select this option to specify a password for accessing the PDF if necessary.
- Page number: Optionally specify the physical page to show after opening the document. If this property is not specified, the first page is shown.
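To make the relationship between these properties concrete, the sketch below models them as a small data structure. It is purely illustrative: the class and field names are hypothetical and do not correspond to any product API, and the assumption that pages are numbered from 1 is not stated in this document.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DocumentSource(Enum):
    LOCAL_FILE = "Local File"
    ROBOT_FILE_SYSTEM = "Robot File System"
    BINARY = "Binary"


@dataclass
class OpenPdfProperties:
    """Hypothetical model of the Open action's properties."""

    source: DocumentSource
    file_path: Optional[str] = None    # used by Local File and Robot File System
    binary: Optional[bytes] = None     # used by Binary
    password: Optional[str] = None     # only needed for protected documents
    page_number: Optional[int] = None  # assumption: pages are numbered from 1

    def validate(self) -> None:
        """Check that the chosen source has the value it needs."""
        if self.source is DocumentSource.BINARY:
            if self.binary is None:
                raise ValueError("Binary source requires PDF content")
        elif not self.file_path:
            raise ValueError(f"{self.source.value} source requires a file path")
        if self.page_number is not None and self.page_number < 1:
            raise ValueError("Page number must be 1 or greater")

    def initial_page(self) -> int:
        """Return the page shown after opening; the first page by default."""
        return self.page_number if self.page_number is not None else 1
```

For example, `OpenPdfProperties(DocumentSource.LOCAL_FILE, file_path="invoice.pdf").initial_page()` yields the first page because no page number was given.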
Application actions
Action | Description |
---|---|
Goto Page | Navigates to a page you specify. |
Next Page | Navigates to the next page. |
Previous Page | Navigates to the previous page. |
Extract text | Extracts text from a page area into the selected variable. Specify the extraction options when extracting the text. All units are in Device Tree coordinates. |
Insert Image | Inserts a JPEG or PNG image from a local folder into the selected page of the document. RFS folders are not supported. The image is positioned by the X and Y coordinates of its upper left corner relative to the upper left corner of the page, expressed in one of the supported units. Specify the image options when inserting the image. |
Insert Image (variable) | Inserts an image from a variable into the selected page of the document. Specify the same options as in the Insert Image action, but instead of the image path, specify the name of the binary variable that contains the image. |
Save As | Inserts a step to save a copy of the document. Specify the full path where the PDF file is saved. |
Save to Variable | Saves a copy of the document in a binary variable. |
Close | Closes the PDF document. |
SignDoc Actions | |
Sign with SignDoc | Creates a SignDoc session and submits the PDF document immediately. |
Sign with SignDoc (Template) | Creates a SignDoc session based on a SignDoc template and submits the PDF document immediately. |
Insert SignDoc Signature Field | Inserts a Signature Field on the current page of the PDF document. The fields are not visible on the page, but they appear in the application tree as SignDoc fields. |
Get SignDoc Property | Queries properties of the SignDoc session. |
Complete SignDoc Request | Closes the SignDoc session and determines how the SignDoc package is processed. |
See Sign Documents for information on using SignDoc to sign a document.
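The Insert Image action positions an image by its upper left corner relative to the upper left corner of the page. The PDF format itself, in its default user space, measures from the lower left corner of the page with the y axis pointing up, so it can help to see how the two conventions relate. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the unit names `pt`, `in`, and `mm` are examples rather than the action's documented units, and the page is assumed to be unrotated with its origin at (0, 0).

```python
POINTS_PER_INCH = 72.0  # PDF default user space: 1 unit = 1/72 inch
MM_PER_INCH = 25.4


def to_points(value: float, unit: str) -> float:
    """Convert a length to PDF points. The unit names are assumptions."""
    if unit == "pt":
        return float(value)
    if unit == "in":
        return value * POINTS_PER_INCH
    if unit == "mm":
        return value * POINTS_PER_INCH / MM_PER_INCH
    raise ValueError(f"Unknown unit: {unit}")


def top_left_to_pdf_origin(
    x: float, y: float, image_height: float, page_height: float
) -> tuple:
    """Map an image's upper left corner, measured from the page's upper left
    corner, to the lower left corner of the image measured from the page's
    lower left corner. All values are in points."""
    return (x, page_height - y - image_height)
```

For a page that is 792 points high, an image 100 points high placed at X = 72, Y = 72 from the upper left corner has its lower left corner at (72, 620) in bottom-left coordinates.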
Component actions
Action | Description |
---|---|
Extract text | Extracts text from the selected component of the PDF document into a variable. Specify the extraction options when extracting the text. All units are in Device Tree coordinates. |
SignDoc Actions | |
Insert SignDoc Signature Field | Inserts a Signature Field based on a Component Finder. The fields are not visible on the page, but they appear in the application tree as SignDoc fields. |
Update Field | Updates attributes of a form field in the PDF document that SignDoc identifies as supported. Supported fields are listed in the application tree under the SignDoc node. |
Assign SignDoc Signer | Assigns a signer to a Signature Field that is already present in the PDF file. These fields are not visible on the page, but they appear in the application tree as SignDoc fields. |