Classification New Samples and the Project Life Cycle
When a project is first put into production, its classification results are not ideal. A newly published project accumulates many training documents, and fewer are collected the longer the project is in production, because fewer and fewer documents go unclassified. As a result, certain tasks need to be performed more often for new projects.
If you have the "Use dynamic classifiers during classification" option enabled on the Advanced Online Learning Options window, training documents are used during production to improve classification. This minimizes the number of reclassified documents and therefore lowers the number of training documents accumulated.
If you do not have the "Use dynamic classifiers during classification" option enabled, the collected documents have no effect until the documents are imported and the project is retrained and then published.
When dynamic classification is enabled, accumulated classification training documents are held in the Dynamic Classifiers, and a copy is saved in the New Samples Classification subset.
When dynamic classification is not enabled, accumulated documents sit only in New Samples and do not benefit your project.
To ensure that your training documents are moved into your Classification Set on a regular basis, import your Classification New Samples and train your project after each of the following time intervals:
- After one week
- After two weeks
- After three weeks
- After four weeks
- After two months
- After three months
- After six months
- After one year
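The intervals above can be turned into calendar reminders. The following Python sketch is not part of the product. It assumes that each interval is measured from the date the project was published, which the list does not state explicitly. The names SCHEDULE, add_months, and retraining_dates are hypothetical helpers introduced only for this illustration.

```python
import calendar
from datetime import date, timedelta

# Intervals from the list above. ASSUMPTION: each interval is counted
# from the publish date, not from the previous import.
SCHEDULE = [
    ("weeks", 1), ("weeks", 2), ("weeks", 3), ("weeks", 4),
    ("months", 2), ("months", 3), ("months", 6), ("months", 12),
]


def add_months(start: date, months: int) -> date:
    """Return the same day of the month `months` later, clamped to month end."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def retraining_dates(published: date) -> list[date]:
    """Dates on which to import Classification New Samples and retrain."""
    dates = []
    for unit, amount in SCHEDULE:
        if unit == "weeks":
            dates.append(published + timedelta(weeks=amount))
        else:
            dates.append(add_months(published, amount))
    return dates


if __name__ == "__main__":
    # Example publish date chosen only for illustration.
    for due in retraining_dates(date(2024, 1, 31)):
        print(due.isoformat())
```

Clamping to the last day of the month keeps a project published on the 31st from producing an invalid date in a shorter month. If your team prefers to count each interval from the previous import instead, only the SCHEDULE handling needs to change.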
After one year, your project should be successfully processing documents without many problems. By then, the only time training documents are collected is when a new vendor or form is encountered. Continue to monitor your project, and import the documents and retrain your project every six months or so.