Import Extraction New Samples
Once you view the extraction new samples in the New Samples document set, you can add them to your Extraction Set if these documents can improve your extraction results. If a sample is not suitable for extraction training, remove it from the New Samples Extraction document subset.
You can import one or more extraction new samples by following these steps:
- Open the Documents window if it is not already open.
- Select the New Samples document set.
- Select the Extraction document subset.
  A list of extraction samples is displayed.
- Because all extraction samples are imported at once, the best practice is to review and remove any documents that are not suitable for extraction training.
- On the Documents toolbar, click Import Documents from Extraction Online Learning.
  The Import Extraction Online Learning Data window is displayed and shows the current path to the online learning files for the project.
- Enter a name in the Import into training subset field.
  The best practice is to create a new document subset for each imported set of new sample documents. This lets you differentiate, test, and benchmark your training set to determine whether a specific set of new samples improves or hinders your extraction results.
- Click OK to save your settings and close the Import Extraction Online Learning Data window.
  The Extraction subset in the New Samples database is emptied, and all of the extraction samples are moved to the Extraction Set under the document subset specified in the previous step.
- Train your project for extraction.
- Optionally, perform an extraction benchmark and compare it to a previous benchmark that does not include the newly added sample documents.