Guidelines for descriptions

The description is used to find a piece of data in a document. Therefore, you must take care when creating a field to ensure that the description locates the required information.

Best practices for descriptions

Use the following guidelines to ensure the best results:

  • Use simple words and sentences to describe the required data; do not give instructions or ask questions.

  • Be as specific as possible.

  • Avoid ambiguous statements that could match multiple items.

  • If there is a list of expected values, include them within the description.

  • If you want the result in a particular format, state the format in the description. Also, explain what to do when no result is found.

  • Avoid extracting data that is not related to the document.
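Some of these guidelines can be checked mechanically before a field is saved. The following Python sketch is only an illustration: the function name lint_description and both word lists are hypothetical and are not part of Copilot Auto Extract. It flags descriptions that are phrased as questions or instructions.

```python
import re

# Hypothetical word lists; extend them to suit your own descriptions.
QUESTION_WORDS = ("who", "what", "when", "where", "which", "why", "how")
INSTRUCTION_VERBS = ("return", "find", "extract", "get", "give", "list", "show", "tell")


def lint_description(description: str) -> list[str]:
    """Return a list of warnings for a field description (empty if none apply)."""
    text = description.strip()
    words = re.findall(r"[A-Za-z]+", text.lower())
    if not words:
        return ["Description is empty."]

    warnings = []
    first_word = words[0]
    # A trailing question mark or a leading question word suggests a question.
    if text.endswith("?") or first_word in QUESTION_WORDS:
        warnings.append("Phrased as a question; describe the data instead.")
    # A leading imperative verb suggests an instruction.
    if first_word in INSTRUCTION_VERBS:
        warnings.append("Phrased as an instruction; describe the data instead.")
    return warnings
```

A check like this catches only the phrasing problems. Whether a description is specific enough still has to be judged against real documents.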

Description limitations

When working with AI, be aware that you must provide as much context as possible to ensure accurate results.

The following limitations apply to Copilot Auto Extract:

  • It cannot do mathematics. If you ask Copilot Auto Extract to perform any mathematical calculations, you will not get consistent or reliable responses.

  • It cannot count. Copilot Auto Extract cannot find a word if you say "It is in the first 30 words on the page."

  • It does not know the current date. As a result, comparisons against the current date do not work.

  • It does not know any pixel coordinates on a document. You cannot use references such as upper right corner, directly above, below, etc.

  • It cannot reference external resources as input for a field. You cannot point to a website and expect Copilot Auto Extract to know how to read that website and return content from the URL.

  • It cannot make a decision if the field description returns two pieces of information. Only the first piece of information is returned. In this situation, be more specific in your description or create two separate fields.
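One possible way around the mathematics and current-date limitations is to extract only the raw values and do the calculation in your own code afterward. The following Python sketch assumes that the extracted values are available to such code as strings, that dates use the YYYY/MM/DD format, that amounts use a comma as the thousands separator, and that adulthood starts at 18. The function names are hypothetical.

```python
from datetime import date, datetime
from decimal import Decimal


def is_adult(date_of_birth: str, today: date, adult_age: int = 18) -> bool:
    """Derive adulthood from an extracted date of birth in YYYY/MM/DD format."""
    born = datetime.strptime(date_of_birth, "%Y/%m/%d").date()
    # Subtract one year if this year's birthday has not happened yet.
    age = today.year - born.year - ((today.month, today.day) < (born.month, born.day))
    return age >= adult_age


def largest_amount(amounts: list[str]) -> Decimal:
    """Return the largest of several extracted amounts such as '$1,250.00'."""
    # Assumes a leading dollar sign and comma thousands separators at most.
    return max(Decimal(a.strip().lstrip("$").replace(",", "")) for a in amounts)
```

For example, is_adult("2000/05/17", date.today()) compares the extracted date of birth with the date on the machine running the code, which the model itself does not know.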

Examples of descriptions

For a field called FirstName:

Do explain what you want in a simple way, such as "The first name of the person who submitted the form" or "The first name of the sender."

Do not be ambiguous and say something like "The first name on the form." Since descriptions are interpreted literally, this might return the first name found on the document, rather than a specific person's first name. This is especially true if there is more than one person mentioned on a document.

Good field description examples

Name: SSN
Description: Social Security Number of the person submitting the form.

Name: UtilityBillType
Description: Type of utility invoice, such as Gas, Power, or Water.

Name: Shipping Address
Description: Address where the goods are shipped.

Name: Total
Description: Total amount invoiced.

Name: Payment Recipient
Description: Name of the person receiving the payment.

Name: Due Date
Description: Due date of the bill in the format of YYYY/MM/DD.
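When a description lists expected values or a format, as the UtilityBillType and Due Date examples above do, the extracted value can be checked against that expectation afterward. The following Python sketch is an illustration only. The dictionaries and the check_value function are hypothetical, and treating Gas, Power, and Water as the complete list of allowed values is an assumption.

```python
import re

# Hypothetical expectations that mirror the example descriptions.
# Assumption: Gas, Power, and Water are the only allowed bill types.
EXPECTED_VALUES = {"UtilityBillType": {"Gas", "Power", "Water"}}
EXPECTED_PATTERNS = {"Due Date": re.compile(r"\d{4}/\d{2}/\d{2}")}


def check_value(field_name: str, value: str) -> bool:
    """Return True if an extracted value matches what the description asked for."""
    if field_name in EXPECTED_VALUES:
        return value in EXPECTED_VALUES[field_name]
    if field_name in EXPECTED_PATTERNS:
        # fullmatch rejects values with extra text around the date.
        return EXPECTED_PATTERNS[field_name].fullmatch(value) is not None
    return True  # No expectation recorded for this field.
```

A value that fails the check, such as a Due Date of 15/03/2024, is a sign that the description needs rewording.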

Bad field description examples

Name: Approve
Bad description: Yes or No, whether this invoice should be approved.
Why the description is bad: This information cannot be inferred from the text of a document. The LLM cannot know your internal approval processes.

Name: PreviousBalance
Bad description: The previous balance of the bank account.
Why the description is bad: This may not work if there are multiple bank accounts in the statement. Be more specific and add "of the first bank account mentioned" to the description.

Name: LargestAmount
Bad description: The largest dollar amount that occurs on the document.
Why the description is bad: LLMs may not do mathematics, so this is not guaranteed to work.

Name: IsAdult
Bad description: Yes, if the person submitting the form is an adult based on the date of birth; otherwise, No.
Why the description is bad: LLMs may not do mathematics, nor do they know the current date. Both are needed to derive the age.

Name: Name
Bad description: Who submitted the form?
Why the description is bad: This is a question, rather than a description of the required data.

Name: Name
Bad description: What's the name of the person who submitted the form?
Why the description is bad: This is also a question, rather than a description of the required data.

Name: Name
Bad description: Return the name of the person who submitted the form.
Why the description is bad: This is an instruction, rather than a description of the required data.

Getting the best results often takes some experimentation. Modify the wording of your description and check whether the results improve.
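If you have a few sample documents with known correct values, this experimentation can be made repeatable. The following Python sketch is a minimal illustration. The extract argument is a hypothetical callable that you supply yourself. It stands in for however you run an extraction and is not a real Copilot Auto Extract API.

```python
from typing import Callable


def compare_descriptions(
    variants: list[str],
    samples: list[tuple[str, str]],
    extract: Callable[[str, str], str],
) -> dict[str, float]:
    """Score each description variant by the fraction of samples it gets right.

    samples holds (document_text, expected_value) pairs.
    extract is a hypothetical callable: (description, document_text) -> value.
    """
    if not samples:
        raise ValueError("At least one sample is required.")
    scores = {}
    for description in variants:
        # Count the samples where the extracted value equals the expected one.
        hits = sum(extract(description, doc) == expected for doc, expected in samples)
        scores[description] = hits / len(samples)
    return scores
```

Comparing the scores for two wordings of the same field shows whether a change to the description actually helped on your sample documents.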