Locator methods
You can add any of the following locator methods in the Project Builder:
Icon | Locator Method | Description | Extraction Type |
---|---|---|---|
|
A2iA Zone Locator |
This locator method reads the content of one or more predefined zones and recognizes difficult content, such as words in cursive handwriting. The A2iA Zone Locator is licensed separately. The A2iA Zone Locator is now deprecated and no longer supported. The A2iA Zone Locator and the Check and Cursive Recognition engine remain for reference purposes only and are not meant for use in production. However, if you used an A2iA Zone Locator in a previous release, you can continue to use it at your own risk. Ideally, migrate these locators to a supported locator method. Before migrating an A2iA Zone Locator, you must install the Check and Cursive Recognition Add-On. When you migrate an A2iA Zone Locator, the following behavior occurs:
See the Tungsten Transformation Installation Guide for more information on installing the Check and Cursive Recognition Add-On. |
Group |
|
|
The Address Locator finds addresses on structured, unstructured, and semi-structured machine-printed US documents. The individual elements of the address blocks are usually located close together and can span across a single horizontal line, or span two to seven lines forming a vertical block. This locator finds these blocks on all pages of a document and returns them as a list of alternatives. The alternatives include subfields for each individual piece of extracted information, such as the name, street, or city. |
Group |
|
|
Use the Advanced Evaluator to return the best extraction results from multiple input locators for a set of subfields commonly found on invoice and purchase order documents. For each subfield, define evaluation steps and confidence thresholds for the input locators to find the correct result in a voting process. This evaluator provides comparisons for up to three input locators for typical invoice header data, such as invoice number, order date, different amounts, or other custom subfields. |
Group |
|
|
Use the Advanced Zone Locator to extract data that is located in a consistent position across all documents in a class. For example, use it to extract data from structured documents such as forms. |
Group |
|
|
The Amount Group Locator works in partnership with the Invoice Group Locator and the Order Group Locator. The Amount Group Locator contains all of the fields that are related to, or make up, the total amount of an invoice. Not all of these fields need to be present on the document. Instead the system internally determines what fields are present, and all validations are based on the fields that are actually used. |
Group |
|
|
The Bar Code Locator extracts bar code data from documents. Some documents contain bar codes that are attached to a form during the scanning process as a stamp or sticker. Other documents contain bar codes as part of the document itself. The Bar Code Locator can locate the most common bar code types automatically, even if they are rotated or inverted. |
Single |
|
|
This locator method returns several fields that are commonly found on a check. A single locator can return data for multiple fields. Use this locator to extract the amount, date, payee, account numbers, check numbers, and other relevant check information. |
Group |
|
|
The Classification Locator uses the classification scheme defined in a secondary external Tungsten Transformation project. This project provides additional classification results for a document in the form of locator alternatives. Only the classification scheme is used from the external project. Use the Classification Locator to add information to a document that is obtained by using additional classification steps. Since the additional classification steps are normally independent of the main classification in the main project, an externally defined and trained project is used. The Classification Locator gives access to multi-view classification that sees the document from different aspects and multi-topic classification that returns more than one classification result for a document or a line of text. |
Single |
|
|
The Database Evaluator identifies a database record that matches the input data for a document. Unlike the Database Locator, which attempts to match a database record with recognition data from the full page or a smaller region, the Database Evaluator uses data from the results of other locators. |
Group |
|
|
The Database Locator matches the document against the records in a fuzzy database and returns one or more matching records as alternatives. |
Group |
|
|
The Format Locator works with format definitions such as pattern matching (regular expressions and simple expressions) and advanced algorithms (Levenshtein and trigrams). The format definitions, in partnership with dictionaries and keywords, are used to extract data from documents without the need to define zones. The locator runs on a full or partial page read of the document and extracts the data using searches that are specific to the data, not the document layout. The locator evaluates the found alternatives and the data output. |
Single |
|
|
The Invoice Group Locator returns fields related to an invoice header. |
Group |
|
|
The Invoice Header Locator extracts the most commonly used data from an invoice. Because this locator does not depend on layout, you do not need to classify invoices by supplier and design different extraction schemes for each supplier. This locator method takes results from four other locators. This locator provides numbers, amounts, and dates, and extracts invoice header data, such as the invoice number, order date, total, and tax values. |
Group |
|
|
The Line Item Matching Locator uses processed documents along with information from your back-end enterprise resource planning (ERP) system to extract and match line items on an invoice or invoice-related document. This locator method uses technology that integrates a back-extraction feature that automatically extracts and matches invoice line items with purchase order data. |
Table |
|
|
The Named Entity Locator assigns extracted entities to fields by using the Natural Language Processing engine. This engine takes several seconds per page longer than regular recognition. This locator method extracts named entities such as people, places, organizations, roles, times, and amounts. These named entities are found in unstructured, natural language text, such as the sentences in emails, documents, or reports. |
Table, Single |
|
|
The OCR Voting Evaluator compares the results of an Advanced Zone Locator and selects the best result to save to the field. |
Group |
|
|
The Order Group Locator returns the following fields related to the supplier or vendor: OrderNumber and OrderDate. |
Group |
|
|
The Relation Evaluator locates the best alternatives from one locator based on their position on the document in relation to the best alternative of another locator. |
Single |
|
|
The Script Locator uses a custom script event to locate data and raises a script event that enables users to define their own extraction results. The locator field in the XDoc is prepared based on the Script Locator settings, and can be a simple field or a group field. The locator field can be filled with alternatives in the event handler. How this is done is up to you. You can take results from other locators that are defined to precede this Script Locator or initialize the alternative with custom data. All alternatives are sorted by confidence after the script event. |
Group, Single |
|
|
The Sentiment Locator uses the built-in functionality of the Natural Language Processing engine to extract text sentiments from a document. This means that the Sentiment Locator can determine the overall mood, or sentiment, of a document based on the words and phrases found in it. The Natural Language Processing engine takes several seconds per page longer than regular recognition. |
Single |
|
|
The Standard Evaluator compares the results from several different locators, and selects a set of results based on preset criteria. |
Single, Table |
|
|
The Summary Locator uses built-in functionality of the Natural Language Processing engine to extract a summary of a document. This engine takes several seconds per page longer than regular recognition. |
Single |
|
|
The Table Locator finds data that is displayed on a document in the form of a table. One Table Locator can find tables matching one table model. |
Table |
|
|
This locator method finds data in unstructured documents that have no consistent layout. This enables you to extract data from contracts, correspondence, or even essays and manuscripts. This locator works best for semi-structured documents, and is designed for documents that contain unstructured text made up of sentences. You can extract data from unstructured documents with moderate success, and the results should improve as you add more training documents. |
Group |
|
|
The Themes Locator uses built-in functionality of the Natural Language Processing engine to extract the theme or topic of a document. This information is then available for custom analysis via script. The Natural Language Processing engine takes several seconds per page longer than regular recognition. |
Table |
|
|
If you want good generic extraction results quickly with a relatively small training set, the highly optimizable Trainable Evaluator is recommended. This evaluator compares alternatives from other locators to determine which of those alternatives match a specific set of criteria. This evaluator relies on alternatives from the input locator. It also learns from false alternatives to improve training. |
Group |
|
|
The Trainable Group Locator is an all-purpose version of a trainable locator. The three other group locators specialize in invoice documents and related fields. The Trainable Group Locator can set up between 1 and 30 additional subfields not covered by the other group locators. |
Group |
|
|
The Vendor Locator detects and evaluates the vendor information from an invoice using results from a Database Locator combined with the results of other locators. The results provide information such as vendor ID, name, address, VAT ID, banking details and purchase order numbers, enabling the Vendor Locator to match them with database records and identify the vendor with a high level of confidence. |
Group |
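
The Format Locator description above mentions pattern matching, Levenshtein distance, and trigrams, and several locators return alternatives ranked by confidence. The following Python sketch illustrates those general techniques only. It is not the product's implementation or API: the function names, the `max_distance` parameter, the sample text, and the confidence formula are all assumptions made for illustration.

```python
import re


def levenshtein(a, b):
    """Edit distance: minimum number of insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def trigrams(text):
    """Set of three-character windows over the padded, lowercased text."""
    padded = "  " + text.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a, b):
    """Jaccard overlap of the two trigram sets, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def find_alternatives(words, keyword, pattern, max_distance=1):
    """Hypothetical helper: a word is a candidate when it matches the regular
    expression and the word before it is within max_distance edits of the keyword.
    Returns (value, confidence) pairs sorted by descending confidence."""
    alternatives = []
    for word, following in zip(words, words[1:]):
        distance = levenshtein(word.lower(), keyword.lower())
        if distance <= max_distance and re.fullmatch(pattern, following):
            # Assumed scoring: an exact keyword match scores 1.0.
            confidence = 1.0 - distance / max(len(word), len(keyword))
            alternatives.append((following, confidence))
    return sorted(alternatives, key=lambda alt: alt[1], reverse=True)


# "lnvoice" simulates an OCR misread of "Invoice".
words = "Invoice INV-1042 lnvoice INV-2077".split()
for value, confidence in find_alternatives(words, "invoice", r"INV-\d{4}"):
    print(value, round(confidence, 2))
```

In this sketch the value that follows the exactly matched keyword ranks first, and the value that follows the misread keyword is still found, with a lower confidence. No zones are defined: the search depends on the data format and a nearby keyword, not on the page layout.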