Enable Extraction Online Learning
To improve your extraction results while your project is in production, use Extraction Online Learning. Documents are collected during production, so you should see improved results in subsequent batches without any changes to the project itself.
Since the documents are collected during production, a large number of accumulated documents can adversely affect project performance. For the best results, ensure that you follow the project life cycle recommendations to store your training documents in the most efficient manner.
You can enable Extraction Online Learning by following these steps:
- Enable online learning.
- On the Project tab, in the Configuration group, select Project Settings.
- On the General tab, in the Online Learning group, click Advanced.
The Advanced Online Learning Options window is displayed.
- In the General Settings group, select Use Extraction Online Learning.
The Extraction Online Learning Settings group is enabled.
- Optionally, if you do not want to use a dynamic specific knowledge base, clear Use dynamic knowledge base during extraction.
If you clear this option, the project administrator must manually import and review the documents marked for Extraction Online Learning, and then train the project, before extraction improves. For the best results, do not clear this option.
- Optionally, if you do not want Validation users to mark documents and you do not want to write a complicated script to determine which documents to mark for online learning, select Automatic training after Validation.
Selecting this option automatically marks documents that qualify for online learning.
For documents to be collected automatically for online learning, you must configure the relevant fields as monitored in the Field Details window. You can monitor only fields that are assigned to a trainable locator field.
When both of these options are selected, a document is marked for Extraction Online Learning when the Maximum documents stored for import value has not yet been reached, the Layout ID for the document still requires training documents, and one of the following occurs:
  - The confidence of an extracted field is below a certain confidence level.
  - Field coordinates were changed manually during Validation.
- Optionally, adjust the Maximum documents stored for import value to restrict the number of training documents collected.
Enter a value between 100 and 20,000 documents.
- When finished, click OK.
The Advanced Online Learning Options window is closed and your changes are saved.
- Optionally, click OK to close the Project Settings window.
- Save the changes to your project.
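The marking rule described in the Automatic training after Validation step can be sketched in code. This is only an illustrative model of the documented conditions, not product code or a product API: every class, function, and parameter name below is hypothetical. It also assumes that "both of these options" refers to Use dynamic knowledge base during extraction and Automatic training after Validation, and the confidence threshold is a placeholder because the procedure does not state a value.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class OnlineLearningSettings:
    """Hypothetical stand-in for the Advanced Online Learning Options."""
    max_documents_stored: int                  # "Maximum documents stored for import"
    use_dynamic_knowledge_base: bool           # "Use dynamic knowledge base during extraction"
    automatic_training_after_validation: bool  # "Automatic training after Validation"

    def __post_init__(self) -> None:
        # The procedure allows values between 100 and 20,000 documents.
        if not 100 <= self.max_documents_stored <= 20_000:
            raise ValueError("max_documents_stored must be between 100 and 20,000")


@dataclass
class MonitoredField:
    """Hypothetical view of a monitored field after Validation."""
    confidence: float
    coordinates_changed_in_validation: bool = False


def should_mark_for_online_learning(
    settings: OnlineLearningSettings,
    fields: Iterable[MonitoredField],
    stored_count: int,
    layout_needs_training: bool,
    confidence_threshold: float,
) -> bool:
    """Return True when a document meets the documented marking conditions."""
    # Assumption: both options must be selected for automatic marking.
    if not (settings.use_dynamic_knowledge_base
            and settings.automatic_training_after_validation):
        return False
    # The maximum number of stored documents must not have been reached yet.
    if stored_count >= settings.max_documents_stored:
        return False
    # The Layout ID for the document must still require training documents.
    if not layout_needs_training:
        return False
    # At least one monitored field has low confidence or was manually corrected.
    return any(
        f.confidence < confidence_threshold or f.coordinates_changed_in_validation
        for f in fields
    )


# Usage with made-up values; 0.80 is an arbitrary placeholder threshold.
settings = OnlineLearningSettings(500, True, True)
fields = [MonitoredField(0.95), MonitoredField(0.40)]
print(should_mark_for_online_learning(settings, fields, 120, True, 0.80))  # True
print(should_mark_for_online_learning(settings, fields, 500, True, 0.80))  # False
```

The checks are ordered so that the two limits (the storage maximum and the Layout ID requirement) are evaluated before any field is inspected, which mirrors the order in which the conditions are listed in the procedure.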