Train the Text Content Locator
Before training the Text Content Locator, add Text Content Locator subfields and class fields to your project. Once you finish adding fields, add training documents and train your project.
You can train the Text Content Locator by following these steps:
- Open the Project Tree window if it is not already open.
- Expand the Project Tree and select the class.
- Optionally, view the class contents if they are not already displayed.
The hidden class contents are displayed.
- Open the Documents window if it is not already open.
- If a different view is in use, switch to the List view.
- Add or open a document set that contains the documents to use to train the Text Content Locator.
A list of documents is displayed.
- Double-click a document.
The Document Viewer is displayed showing the selected document.
- If the document is suitable for training, select the document in the List view of the Documents window and select Train for Extraction from the menu.
The Edit Document window is displayed showing the first of the selected documents.
- In the Edit Document window, select a field and lasso the corresponding content in the document.
The lassoed data is entered into the field.
- Repeat lassoing for each field and, when finished, click Add to Training Folder.
The document is added to your Extraction Set and the next document in your document set is loaded automatically.
- Continue to add training documents until you are ready to test your training set, and then close the Edit Document window.
For the first iteration of testing, add at least 3 to 5 training documents per class before training and testing your project. A small training set can already return reliable extraction results, in which case you do not need to add more documents.
- On the Process Ribbon tab, in the Train group, click Extraction.
The documents in your training set are used for training, and a progress bar shows the progress.
- In your test document set, select one or more documents, right-click the selection, and click Process.
Each selected document is classified and extracted.
- Open the Extraction Results window if it is not already open.
The extraction results are displayed. Invalid fields have a blue question mark and valid fields have a green check mark.
- In the Extraction Results window, view the Text Content Locator results based on your training documents.
If the results are not satisfactory, add training documents to your Extraction Set by repeating steps 7 through 14.