Text Content Locator

This locator method finds data in unstructured documents that have no consistent layout. This enables you to extract data from contracts, correspondence, or even essays and manuscripts. This locator works best for semi-structured documents, and is designed for documents that contain unstructured text made up of sentences. You can extract data from unstructured documents with moderate success, and the results should improve as you add more training documents.

You can use this locator to extract data from machine-printed forms with some success. However, other locator methods such as the Advanced Zone Locator or the Format Locator are more successful in extracting data from forms.

This locator needs training documents to learn how to find the necessary data. Unlike other locators, this locator is configured by adding documents to your Extraction Set, lassoing the needed content, and training your project.

In general, the more training documents you have, the better the results. However, as you increase the number of training documents, the time to train your project also increases. Test your extraction results regularly after training your project to confirm that the documents you add actually improve the results.
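
One way to track whether additional training documents still pay off is to record an accuracy figure after each training round and compare consecutive rounds. The following Python sketch is illustrative only and is not part of the product: the function names, the exact-match scoring, and the 0.01 minimum gain are all assumptions, and the extracted and expected values are assumed to come from your own test documents.

```python
# Hypothetical helpers for tracking extraction quality between training rounds.
# Nothing here calls the product; values are supplied by your own test process.

def field_accuracy(extracted, expected):
    """Return the fraction of expected fields whose extracted value matches exactly."""
    if not expected:
        return 0.0
    hits = sum(1 for name, value in expected.items() if extracted.get(name) == value)
    return hits / len(expected)


def training_still_helps(accuracy_by_round, min_gain=0.01):
    """Return True if the latest round improved accuracy by at least min_gain.

    min_gain is an assumed threshold; choose one that suits your project.
    """
    if len(accuracy_by_round) < 2:
        return True  # Not enough history to judge yet.
    return accuracy_by_round[-1] - accuracy_by_round[-2] >= min_gain


# Example: accuracy measured on the same test documents after each training round.
history = [0.62, 0.71, 0.78, 0.78]
print(training_still_helps(history))
```

If the check returns False over several rounds, the most recently added documents are probably increasing training time without improving extraction.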

Manage the Text Content Locator as follows:

  • Add Text Content Locator subfields

  • Map Text Content Locator subfields

  • Rename Text Content Locator subfields

  • Delete Text Content Locator subfields

  • Configure the Text Content Locator

  • Train the Text Content Locator

Important: This locator method supports content that spans multiple lines. However, the Edit Document window does not support lassoing content that spans multiple lines.

The Properties of Text Content Locator window has the following tabs: