Regular Expressions
Regular expressions are used to recognize patterns within textual data. They evaluate text and match an expression against the text in the document. In Kofax TotalAgility, regular expressions are used in format locators, validation methods, and formatters to identify and normalize items on a document.
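As a rough illustration of those three roles (locating, validating, and normalizing), the following Python sketch uses the standard `re` module. It is not the Kofax TotalAgility API: the function names, the sample text, and the invoice number format are hypothetical and chosen only for this example.

```python
import re

# Hypothetical sample text standing in for the content of a document.
SAMPLE_TEXT = "Invoice INV-20431 dated 03/15/2024, total due 1,250.00 USD"


def locate_dates(text):
    """Locate: return every substring shaped like dd/dd/dddd."""
    return re.findall(r"\d{2}/\d{2}/\d{4}", text)


def is_valid_invoice_number(value):
    """Validate: the whole value must be 'INV-' followed by five digits."""
    return re.fullmatch(r"INV-\d{5}", value) is not None


def normalize_amount(value):
    """Normalize: drop every character that is not a digit or a period."""
    return re.sub(r"[^\d.]", "", value)


if __name__ == "__main__":
    print(locate_dates(SAMPLE_TEXT))
    print(is_valid_invoice_number("INV-20431"))
    print(normalize_amount("1,250.00 USD"))
```

Locating searches inside a longer text, whereas validating requires the whole value to fit the expression, which is why the sketch uses `re.findall` for the first and `re.fullmatch` for the second.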
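Regular expressions describe data in an abstract way. Some common constructs are listed in the following table, where the Matches and Does Not Match columns assume that the expression has to match the entire text: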
Format | Description | Example | Matches | Does Not Match |
---|---|---|---|---|
`c` | One character | `a` | a | b, A |
`.` (period) | Any character | `b.g` | bug, bag, big, bbg | bg, baag |
`\d` | Any single digit | `a\d` | a5, a8, a0 | aA, ab, a |
`[c1c2c3]` | One character out of a set | `[abc]` | a, b, c | 1, 2, d, D, A, ab, bc |
`[c1-cn]` | One character out of a range | `[a-z]` | b, g, x | 1, 2, D, A |
`?` (question mark) | The previous term is optional | `x\d?` | x, x7, x1 | xx, xq |
`+` (plus sign) | The previous term can be repeated one or more times | `\d+` | 4, 2323, 100 | A112, 2b, X |
`*` (asterisk) | The previous term can be repeated zero or more times | `x\d*` | x6, x, x100 | 100x, xx |
`{n}` | The previous term is repeated exactly n times | `y{3}` | yyy | yy, yyyy |
`{m,n}` | The previous term can be repeated between m and n times | `\d{5,9}` | 12345, 999999999 | 1234, 999999999999 |
`\` | Escape special characters | `\$ \\ \- \? \.` | $ \ - ? . | !% |
`()` | Group characters | `a(\$\$)?b` | a$$b, ab | a$b, a$$ |
`(e1\|e2)` | Choice | `(abc\|ABC)` | abc, ABC | aBC, AbC |
`\n` | Back reference (the nth item matched in round brackets must be matched again) | `(\d)x\1` | 1x1, 2x2, 3x3, 4x4, ... | 1x2, 6x7, ... |
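The examples in the table can be checked with a short script. The sketch below uses Python's standard `re` module as a stand-in; the engine used by Kofax TotalAgility may differ in details, although the constructs shown here are common to most regular expression dialects. The names `ROWS`, `matches_whole`, and `check_rows` are made up for this illustration, and `re.fullmatch` is used because the table treats an expression as matching only when it covers the entire text.

```python
import re

# (pattern, texts that should match, texts that should not match),
# copied from the Example, Matches, and Does Not Match columns.
ROWS = [
    (r"b.g", ["bug", "bag", "big", "bbg"], ["bg", "baag"]),
    (r"a\d", ["a5", "a8", "a0"], ["aA", "ab", "a"]),
    (r"[abc]", ["a", "b", "c"], ["1", "2", "d", "D", "A", "ab", "bc"]),
    (r"[a-z]", ["b", "g", "x"], ["1", "2", "D", "A"]),
    (r"x\d?", ["x", "x7", "x1"], ["xx", "xq"]),
    (r"\d+", ["4", "2323", "100"], ["A112", "2b", "X"]),
    (r"x\d*", ["x6", "x", "x100"], ["100x", "xx"]),
    (r"y{3}", ["yyy"], ["yy", "yyyy"]),
    (r"\d{5,9}", ["12345", "999999999"], ["1234", "999999999999"]),
    (r"a(\$\$)?b", ["a$$b", "ab"], ["a$b", "a$$"]),
    (r"(abc|ABC)", ["abc", "ABC"], ["aBC", "AbC"]),
    (r"(\d)x\1", ["1x1", "2x2", "3x3", "4x4"], ["1x2", "6x7"]),
]


def matches_whole(pattern, text):
    """Return True only if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


def check_rows(rows):
    """Return a list of (pattern, text, expected) entries that disagree with the table."""
    failures = []
    for pattern, should_match, should_not_match in rows:
        for text in should_match:
            if not matches_whole(pattern, text):
                failures.append((pattern, text, True))
        for text in should_not_match:
            if matches_whole(pattern, text):
                failures.append((pattern, text, False))
    return failures


if __name__ == "__main__":
    problems = check_rows(ROWS)
    print("all rows agree" if not problems else problems)
```

Searching for a pattern anywhere in a longer text (for example with `re.search`) gives different results: `\d+` finds `112` inside `A112`, even though `A112` as a whole does not match.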
Many third-party resources about regular expressions are available on the internet. In many cases, however, extensive knowledge of regular expressions is not needed, because Kofax TotalAgility provides a set of commonly used, predefined templates.