Apply field formatting and validation to extraction fields

Use the procedure in this topic to manually apply formatting and validation to fields on both configurable and predefined document types. For configurable document types, the fields must exist before you can apply formatting and validation.

Both formatting and validation are applied to field data when extraction is performed for training documents and when test documents are tested. The formatted value is returned in the list of field extraction values. To see the original value before formatting, move the pointer over the extraction data circle or the field on the document image and view the field details.

If validation fails, the validation message is also displayed in the field details.

Some predefined document types already have formatting and validation. Most date fields are formatted for the document type locale. Dates on international documents are formatted to YYYY-MM-DD, and dates on USA documents are formatted to MM-DD-YYYY. To see which other fields have formatting, view the field details. If the original value differs from the extraction result on the Fields tab, formatting has been applied to that field.

If you apply formatting to a date field on a predefined document type that is already formatted, your formatter uses the already formatted value as input, not the original extracted value. For example, to format the StatementDate on a USA - BankStatements document so it outputs the date as MM/DD/YYYY instead, the date formatter must include the MM-DD-YYYY format in its "Ambiguous date formats" setting. Otherwise, the formatter does not recognize the input format and returns an error.
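The behavior described above can be illustrated with a minimal Python sketch. It is not the product's formatter or API. The function name reformat_date and its parameters are hypothetical, and the list of strptime patterns only stands in for the "Ambiguous date formats" setting. It shows why the already formatted MM-DD-YYYY value must be among the accepted input formats.

```python
from datetime import datetime


def reformat_date(value, input_formats, output_format):
    """Hypothetical helper: parse value with the first matching input
    format and return it in output_format. Raise ValueError if no
    input format matches, mirroring the formatter error described above."""
    for fmt in input_formats:
        try:
            return datetime.strptime(value, fmt).strftime(output_format)
        except ValueError:
            continue  # try the next accepted input format
    raise ValueError(f"Unrecognized date format: {value!r}")


# The predefined document type has already produced MM-DD-YYYY,
# so that pattern must be accepted as input.
print(reformat_date("03-15-2024", ["%m-%d-%Y"], "%m/%d/%Y"))  # 03/15/2024
```

If the accepted input formats contain only a pattern such as %Y-%m-%d, the same call raises an error, because the input no longer matches any known format.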

Before you begin

Before you apply field formatting and validation to an extraction field, verify that the following prerequisites are met.

  • Configurable document types have one or more fields already configured.

  • If your solution contains predefined document types only, extraction is enabled.

  • Field formatters and field validators are configured, if needed. For more information about adding formatters and validators, refer to the Tungsten TotalAgility Designer Help.

Procedure

  1. On the Fields menu, select a document and then select List of configured fields.

    The Configured fields window is displayed for the selected document. The window is divided as follows:

    • The list of configured Fields appears on the left.

    • Properties of the selected field appear on the right.

  2. Select a field from the list of fields.
  3. Optional. Modify the Name of a field.

    This is the name that is visible for this field during production. Changing this name does not affect extraction, only what users see during production.

  4. Optional. Add more Keywords. Separate multiple keywords with a semicolon (;).
  5. Select the General tab and configure the following settings.

    Is bar code

    This setting is available only if a bar code is lassoed. Suggested fields do not support bar codes. (Default: Cleared)

    If selected, the field is assumed to be a bar code, and the following setting is enabled:

    Bar code type

    When extracting a bar code, select the specific bar code type from the list to ensure the best extraction results. If the bar code type is known when the bar code is lassoed, the type is selected automatically. If the bar code type is unknown, use Select All.

    Type

    Select the type of field that you are creating. (Default: Text)

    There are three values for this setting.

    • Text

    • Date

    • Number

    The selected field type determines additional settings.

    Formatter

    Optionally, apply a formatter to further refine the field value. (Default: No formatter)

    By default, the following formatters are available:

    • No formatter: If selected, no formatting is applied to the text, characters, or digits extracted from the document.

    • Default Amount Formatter: Contains the default currency and typical decimal symbol formatting.

    • Default Date Formatter: Contains basic date formatting, such as the date order and date output format.

    When the field Type is set to Date, the Default Date Formatter is selected automatically.

    Field formatting is applied to field data when extraction is performed for both training documents and test documents. The original value is displayed when you move the pointer over the extraction data circle or the field on the image.

    Is mandatory

    Select this setting if a value for this field is required during production. (Default: Cleared)

  6. The following settings depend on the selected value for Type. Follow the steps for the relevant value below.

    Text

    Configure the following settings as required.

    Minimum character length:

    The minimum number of characters allowed for a field. Select to enter a minimum character length. If an extracted result is shorter than the minimum length, the field is marked as invalid. (Default: Cleared)

    Maximum character length:

    The maximum number of characters allowed for a field. Select to enter a maximum character length. If the extracted result is longer than the maximum length, the field is marked as invalid. (Default: Cleared)

    Define allowed characters:

    Select to enter a list of allowed characters. This list is literal, so no separator character is necessary. (Default: Cleared)

    Define restricted characters:

    Select to enter a list of restricted characters. This list is literal, so no separator character is necessary. (Default: Cleared)

    Date

    Configure the following settings as required.

    Reference date:

    Select a reference date. All dates are compared to this date to ensure that they fit within a specific date range. (Default: Today)

    Period before reference date:

    Select this setting to provide the number of days that a date is valid before the reference date. This setting restricts a date on a document to be within N days before the reference date. If a date falls outside this range, it is invalid. (Default: Cleared)

    Period after reference date:

    Select this setting to provide the number of days that a date is valid after the reference date. This setting restricts a date on a document to be within N days after the reference date. If a date falls outside this range, it is invalid. (Default: Cleared)

    Number

    No additional steps are necessary for fields of this type.

  7. Select the Validation tab and configure the following settings.

    Require manual field confirmation

    If selected, this field is flagged as invalid and requires manual confirmation. (Default: Cleared)

    Always valid

    If selected, this field is always valid, even when no result is extracted. (Default: Cleared)

    Validators

    A list of validation rules applied to this field. If the extraction result does not meet these rules, it is invalid and the configured validation message is displayed.

    For more information about configuring validation rules, refer to the Tungsten TotalAgility Designer Help.

    Field validation is applied to field data when extraction is performed for both test documents and training documents. If an extracted field does not meet the validation requirements, a validation message is displayed when you move the pointer over the field on an image or its extraction data circle.

    The color of the extraction data circle does not indicate whether there are validation errors. It is possible to have a green extraction data circle and an extraction result that fails validation. After extraction or testing, move the pointer over the circle to see if there is a validation message.

  8. Select Save.

    Extraction is performed for your training document and all configured formatting and validators are applied.

    To view details about the extraction, use the tooltip for the extraction data circle or for the extracted value on the image. If formatting is applied to a field, the details include the original extraction value. A validation message is also included, if appropriate.

    If you apply formatting and the result is not as expected, test the field formatter using the extracted value as input. The test should succeed. If it fails, review the formatter settings and update them if necessary.

  9. Optional. Apply formatting and validation for other extraction fields in the current or another document type.
  10. Optional. To delete a field, open the List of configured fields, select a field, and then select Delete. Select Save to close the window.

    The field is removed from the list of fields.

  11. Repeat steps 2 to 10 to apply validation and formatting to more fields on this and other document types.
  12. Optional. Save your solution.
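
The Text and Date settings in step 6 can be pictured with a minimal Python sketch. It is an illustration only, not the product's implementation. The function names validate_text and validate_date and their parameters are hypothetical. Two assumptions are made that this topic does not confirm: a cleared setting means no limit, and the boundaries of a length or date range are inclusive.

```python
from datetime import date, timedelta


def validate_text(value, min_len=None, max_len=None,
                  allowed=None, restricted=None):
    """Hypothetical check for a Text field.
    Returns a list of messages; an empty list means the value is valid.
    A setting left as None is treated as cleared (assumption)."""
    errors = []
    if min_len is not None and len(value) < min_len:
        errors.append("shorter than minimum length")
    if max_len is not None and len(value) > max_len:
        errors.append("longer than maximum length")
    # Character lists are literal: each character in the string counts,
    # so no separator is needed.
    if allowed is not None and any(ch not in allowed for ch in value):
        errors.append("contains a character that is not allowed")
    if restricted is not None and any(ch in restricted for ch in value):
        errors.append("contains a restricted character")
    return errors


def validate_date(value, reference, days_before=None, days_after=None):
    """Hypothetical check for a Date field.
    Returns True if value lies within the configured window around the
    reference date. Boundaries are inclusive (assumption)."""
    if days_before is not None and value < reference - timedelta(days=days_before):
        return False
    if days_after is not None and value > reference + timedelta(days=days_after):
        return False
    return True


print(validate_text("AB-123", min_len=3, max_len=10, allowed="AB-0123456789"))
print(validate_text("AB#1", restricted="#"))
print(validate_date(date(2024, 3, 1), date(2024, 3, 15), days_before=30))
print(validate_date(date(2024, 1, 1), date(2024, 3, 15), days_before=30))
```

The first call returns an empty list and the second reports the restricted character. The third date falls within the 30 days before the reference date and passes, while the fourth falls outside that window and fails.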

Next steps