New Samples Document Set
Use this document set to manage documents that are returned during processing so that they can be used to train and improve the project.
Documents reported with problems are returned to Project Builder along with information for the project administrator about how those documents can improve classification or extraction.
Documents marked for Extraction Online Learning can be imported into the Extraction Set to improve the extraction quality of trained fields.
The New Samples document set contains the following three document subsets:
- Problems
  This subset lists new samples that were flagged as having a problem. The project administrator imports these documents into Project Builder and, if they are relevant, edits them and uses them to train the project for classification and extraction. These samples can also be used to improve table extraction.
  You can perform the following actions for these new samples:
  - Refresh to update the list of New Samples.
  - Add a document to the Classification Set for the selected class in the Project Tree.
  - Add a document to the Extraction Set for the selected class in the Project Tree.
  - Create a class and table locator for the selected document.
  - Mark a problem new sample complete once it is trained.
  - Filter Problems samples.
  - Sort Problems samples based on the available column entries.
  - Delete a Problems sample document.
- Extraction
  This subset lists new samples that are used to train the project using Extraction Online Learning.
  You can perform the following actions for these new samples:
  - Refresh to update the list of New Samples.
  - Import the Extraction New Samples into the Extraction Set.
  - Filter extraction samples.
  - Sort extraction samples based on the available column entries.
  - Delete an extraction sample document.
- Classification
  This subset lists new samples that are used to train the project using Classification Online Learning.
  You can perform the following actions for these new samples:
  - Refresh to update the list of New Samples.
  - Import the Classification New Samples into the Classification Set.
  - Filter classification samples.
  - Sort classification samples based on the available column entries.
  - Delete a classification sample document.
When the New Samples set is selected in the Documents window, several additional toolbar settings are displayed that are available for this document set only.
However, for the Problems, Classification, and Extraction subsets, only the List View is available.
Neither of the other document views can be used with New Samples.
The Convert New Samples Database shortcut menu setting is available only when there is an old New Samples database from a previous version of Tungsten Transformation Toolkit. This setting enables you to manually upgrade that database.