Troubleshooting
Primary rule (voc, re, or cvoc) did not match or returned false positives
A trace option exists for FormProcApp.exe (-d <field name> <compound index> <rule index>) that populates OPFD.log with match/nonmatch records for examined words:
guide> FormProcApp.exe Receipt.fp Example3 Example3 Example3\vendor3.yml -l en-US -d Vendor 0 0
OPFD.log lists each text fragment the rule examined. For a match (including a fuzzy one), it also shows the pattern that matched and the word it matched:
[2024-04-01 11:08:55] Info DumpRuleDebugs - 296:
Debugging the requested rule in field:
…
-: 5111ICE 1977
^^^^...$ <-- cvoc non-match
-: Answers For Every Bodr
^^^^^^^^^^^^^^^^^^^^^^ <-- cvoc non-match
+: Vitamin Shoppe #177
^.................$ cvoc: USA_CompanyDictionary => Vitamin Shoppe
Vitamin Shoppe #177
+: Vitamin Shoppe values your feedback. *
^....................................$ cvoc: USA_CompanyDictionary => Vitamin Shoppe
…
+: Vitamin Shoppe Gift Card.
^.......................$ cvoc: USA_CompanyDictionary => Vitamin Shoppe
Vitamin Shoppe Gift Card.
More information is available when the error: 1 option is removed from vendor3.yml.
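For long traces it can help to summarize the log instead of reading it line by line. The following is a minimal Python sketch, not part of FormProc. It assumes the line shapes seen in the sample above: "+:" prefixes a fragment that matched, "-:" prefixes one that did not, and a matched fragment is followed by a detail line of the form "<rule type>: <dictionary> => <entry>". The function name summarize_trace is hypothetical.

```python
import re
from collections import Counter

# Assumed line shapes, inferred only from the sample trace:
#   "+: <fragment>"  fragment that matched
#   "-: <fragment>"  fragment that did not match
#   "^...$ cvoc: <dictionary> => <entry>"  detail line under a match
FRAGMENT = re.compile(r"^\s*([+-]):\s?(.*)$")
DETAIL = re.compile(r"\b(?P<rule>\w+):\s*(?P<source>\S+)\s*=>\s*(?P<entry>.+?)\s*$")


def summarize_trace(lines):
    """Split trace lines into matched/unmatched fragments and count matched entries."""
    matched, unmatched, entries = [], [], Counter()
    for raw in lines:
        line = raw.rstrip("\r\n")
        fragment = FRAGMENT.match(line)
        if fragment:
            target = matched if fragment.group(1) == "+" else unmatched
            target.append(fragment.group(2))
            continue
        detail = DETAIL.search(line)
        if detail:
            entries[detail.group("entry")] += 1
    return matched, unmatched, entries
```

Passing the lines of OPFD.log (for example, an open file object) returns the matched fragments, the unmatched fragments, and a count per matched dictionary entry, which makes false positives easier to spot.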
Regex rule did not capture what I wanted
Usually, the issue lies with the regex pattern. To verify the pattern, filter the raw word list, which is always included in the output.
The output .json contains the OCR results as-is under .fragments; this is the text that is fed to the regex and voc rules. To verify a user-written regex pattern manually, feed the .json output to an external tool.
Example command line in PowerShell:
guide> jq -r '.fragments[].words[].text' Receipt_total1_0.json |Select-String -Pattern '(SUB)?TOTAL:*'
Subtotal
Total
The same can be done with grep under Unix-like systems or findstr in a DOS prompt.
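The same check can also be scripted. The sketch below is an illustration in Python and assumes only the JSON layout implied by the jq filter above (.fragments[].words[].text). The function names are hypothetical. Python's re module is not SRELL, so pattern syntax may differ in places; treat the result as a quick sanity check rather than an exact reproduction of FormProc's matching.

```python
import json
import re


def matching_words(document, pattern):
    """Return the text of every OCR word that contains a match for `pattern`."""
    # IGNORECASE mirrors the default case-insensitive behavior of Select-String.
    regex = re.compile(pattern, re.IGNORECASE)
    return [
        word["text"]
        for fragment in document.get("fragments", [])
        for word in fragment.get("words", [])
        if regex.search(word.get("text", ""))
    ]


def matching_words_in_file(path, pattern):
    """Load a FormProc output .json file and filter its word list."""
    with open(path, encoding="utf-8") as handle:
        return matching_words(json.load(handle), pattern)
```

For example, matching_words_in_file("Receipt_total1_0.json", r"(SUB)?TOTAL:*") applies the pattern from the PowerShell example to the same file.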
Take care with special characters, especially backslashes, and with accented characters. We use the SRELL regex engine, which supports UTF-8 and handles upper/lower case conversion of accented characters.
If the rule is a compound rule, isolate the failing part by temporarily commenting out the other parts of the rule.
Error messages in the output JSON file
When FormProc encounters an environmental error, the error is recorded in the output .json. For example, a bad input .yml was used here:
{
  "title": "Unknown",
  "version": "1.0",
  "lang": "en-US",
  "input": "Receipt.fp",
  "page": -1,
  "errors": [
    {
      "severity": "Fatal",
      "id": "Fatal",
      "text": "Invalid rule in 'amount-anchor'. The referenced vocabulary 'TOTAL' does not exist."
    }
  ]
}
In this case OPFP_Process returns a nonzero code. The error is also stored in structured form in the output .json, in addition to OPFP.log, which is unstructured.
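Because the errors are structured, a calling program can inspect them directly. The following Python sketch assumes only the fields shown in the example above (an "errors" array whose items carry "severity", "id", and "text"). Whether the "errors" key is absent or empty on success is not stated here, so the sketch tolerates both. The function names are hypothetical.

```python
import json


def collect_errors(document):
    """Return (severity, id, text) tuples from a parsed FormProc output document."""
    # Assumption: "errors" may be missing or empty when processing succeeded.
    return [
        (item.get("severity", ""), item.get("id", ""), item.get("text", ""))
        for item in document.get("errors", [])
    ]


def has_fatal(document):
    """True if at least one reported error has severity "Fatal"."""
    return any(severity == "Fatal" for severity, _, _ in collect_errors(document))


def load_output(path):
    """Parse an output .json file written by FormProc."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A caller could, for instance, run has_fatal(load_output(path)) after a nonzero return code and show the collected "text" values to the user.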
OPFP.log
Both OPFP_Init and OPFP_Process create at least one OPFP.log file when they run. OPFP_Init places a log file at the path specified by the home_path parameter. Additionally, each OPFP_Process call can specify a different output path in the output_dir parameter.
- If an OPFP_Process call uses the same output_dir parameter as the previous call (for example, the "." relative path), or does not specify it, the log output is appended to the previous log file.
- If the output_dir parameter changes between calls, OPFP creates a log file at the new output path. This way, calls can place log files in different folders. Existing log files are overwritten in the order of the OPFP_Process calls.
If a general failure (for example, a missing input or a .yaml parsing error) prevents any output from being generated, check OPFP.log, which should contain the error in human-readable form.
If you have a real business use case and the troubleshooting steps (see Primary rule (voc, re, or cvoc) did not match or returned false positives) did not help, we recommend sending the log files to support.
YAML parse failure
If OPFP.log shows the following and the error message is unclear, run the file through a web-based .yaml checker. In general, it helps to paste .yaml code that is still under development into a checker before running FormProc.
[2024-03-04 15:19:18] Fatal - 0:
Error code: Fatal - 'Error while parsing YAML input . . .
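When OPFP.log is long, entries like the one above can be pulled out with a short script. This Python sketch assumes that every log entry starts with a header line of the form "[YYYY-MM-DD hh:mm:ss] <Level> ...", as in the samples in this section, and that the lines up to the next header belong to that entry. The function name entries_with_level is hypothetical.

```python
import re

# Assumed header shape, inferred from the log samples in this section:
#   [2024-03-04 15:19:18] Fatal - 0:
HEADER = re.compile(
    r"^\[(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s+(?P<level>\w+)\b"
)


def entries_with_level(lines, level="Fatal"):
    """Collect log entries whose header carries the given level."""
    entries, current = [], None
    for raw in lines:
        line = raw.rstrip("\r\n")
        header = HEADER.match(line)
        if header:
            # A new header always closes the previous entry.
            current = None
            if header.group("level") == level:
                current = {"time": header.group("time"), "header": line, "body": []}
                entries.append(current)
        elif current is not None:
            current["body"].append(line)
    return entries
```

Passing an open OPFP.log file object to entries_with_level returns one dictionary per Fatal entry, with the timestamp, the header line, and the lines that follow it.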