Regular expressions
Regular expressions are used to validate the format of text in form fields, document fields, and passwords.
TotalAgility provides several predefined regular expressions. You can also define custom regular expressions to use in forms and validators.
Regular expressions recognize patterns in textual data. An expression is evaluated against the text in a document, and the parts of the text that fit the pattern are matched. In Kofax TotalAgility, regular expressions are used in format locators, validation methods, and formatters to identify and normalize items on a document.
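
The three roles mentioned above (locating, validating, and normalizing) can be sketched with a single pattern. This is only an illustration: it uses Python's `re` module as a stand-in for the TotalAgility regular expression engine, whose syntax details may differ, and the DD/MM/YYYY date format and the function names `locate`, `is_valid`, and `normalize` are hypothetical.

```python
import re

# Hypothetical date format DD/MM/YYYY: two digits, two digits, four digits.
# The round brackets create groups that normalize() reuses.
DATE = re.compile(r"(\d{2})/(\d{2})/(\d{4})")

def locate(text):
    """Locator role: return every substring of text that fits the pattern."""
    return [match.group(0) for match in DATE.finditer(text)]

def is_valid(value):
    """Validation role: the whole value must fit the pattern."""
    return DATE.fullmatch(value) is not None

def normalize(value):
    """Formatter role: rewrite DD/MM/YYYY as YYYY-MM-DD."""
    return DATE.sub(r"\3-\2-\1", value)

if __name__ == "__main__":
    print(locate("Invoice dated 05/03/2024, due 04/04/2024."))
    print(is_valid("05/03/2024"), is_valid("5/3/24"))
    print(normalize("05/03/2024"))
```

The pattern only checks the shape of the text. It does not check that the day and month are real calendar values.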
Regular expressions describe data in an abstract way. The following table lists some common constructs.
| Format | Description | Example | Matches | Does Not Match |
|---|---|---|---|---|
| c | One character | a | a | b, A |
| . (period) | Any character | b.g | bug, bag, big, bbg | bg, baag |
| \d | Any single digit | a\d | a5, a8, a0 | aA, ab, a |
| [c1c2c3] | One character out of a set | [abc] | a, b, c | 1, 2, d, D, A, ab, bc |
| [c1-cn] | One character out of a range | [a-z] | b, g, x | 1, 2, D, A |
| ? (question mark) | The previous term is optional | x\d? | x, x7, x1 | xx, xq |
| + (plus sign) | The previous term can be repeated one or more times | \d+ | 4, 2323, 100 | A112, 2b, X |
| * (asterisk) | The previous term can be repeated zero or more times | x\d* | x6, x, x100 | 100x, xx |
| {n} | The previous term must be repeated exactly n times | y{3} | yyy | yy, yyyy |
| {m,n} | The previous term can be repeated between m and n times | \d{5,9} | 12345, 999999999 | 1234, 999999999999 |
| \ | Escape special characters | \$ \\ \- \? \. | $ \ - ? . | !% |
| () | Group characters | a(\$\$)?b | a$$b, ab | a$b, a$$ |
| (e1\|e2) | Choice | (abc\|ABC) | abc, ABC | aBC, AbC |
| \n | Back reference (the nth item matched in round brackets must be matched again) | (\d)x\1 | 1x1, 2x2, 3x3, 4x4... | 1x2, 6x7... |
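
The rows of the table can be checked mechanically. The following sketch uses Python's `re` module, which is an assumption made for illustration and not the TotalAgility engine, and it reads each example as a whole-string match by using `fullmatch`. The names `TABLE_ROWS` and `check_rows` are hypothetical.

```python
import re

# Each entry: (pattern, strings that should match, strings that should not).
# The values are copied from the table; the choice pattern is written with
# a plain | as it would be typed in an expression.
TABLE_ROWS = [
    (r"b.g", ["bug", "bag", "big", "bbg"], ["bg", "baag"]),
    (r"a\d", ["a5", "a8", "a0"], ["aA", "ab", "a"]),
    (r"[abc]", ["a", "b", "c"], ["1", "2", "d", "D", "A", "ab", "bc"]),
    (r"[a-z]", ["b", "g", "x"], ["1", "2", "D", "A"]),
    (r"x\d?", ["x", "x7", "x1"], ["xx", "xq"]),
    (r"\d+", ["4", "2323", "100"], ["A112", "2b", "X"]),
    (r"x\d*", ["x6", "x", "x100"], ["100x", "xx"]),
    (r"y{3}", ["yyy"], ["yy", "yyyy"]),
    (r"\d{5,9}", ["12345", "999999999"], ["1234", "999999999999"]),
    (r"a(\$\$)?b", ["a$$b", "ab"], ["a$b", "a$$"]),
    (r"(abc|ABC)", ["abc", "ABC"], ["aBC", "AbC"]),
    (r"(\d)x\1", ["1x1", "2x2", "3x3", "4x4"], ["1x2", "6x7"]),
]

def check_rows(rows):
    """Return (pattern, text, expected_to_match) for every disagreement."""
    failures = []
    for pattern, should_match, should_not_match in rows:
        compiled = re.compile(pattern)
        for text in should_match:
            if compiled.fullmatch(text) is None:
                failures.append((pattern, text, True))
        for text in should_not_match:
            if compiled.fullmatch(text) is not None:
                failures.append((pattern, text, False))
    return failures

if __name__ == "__main__":
    # An empty list means every example behaves as the table states.
    print(check_rows(TABLE_ROWS))
```

Whole-string matching matters here: with a substring search, `\d+` would find the digits inside `A112`, which the table lists as a non-match.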
You can find many third-party resources on the internet about regular expressions. In many cases, however, extensive knowledge of regular expressions is not needed because Kofax TotalAgility provides a set of commonly used and predefined templates.
If your documents contain special ASCII characters that you want to locate and extract, use the regular expression code for each character. The following table lists the hexadecimal value and the corresponding code for each character.
| ASCII Hex | Special Character | Regular Expression Code |
|---|---|---|
| 21 | ! | \x21 |
| 22 | " | \x22 |
| 23 | # | \x23 |
| 24 | $ | \x24 |
| 25 | % | \x25 |
| 26 | & | \x26 |
| 27 | ' | \x27 |
| 28 | ( | \x28 |
| 29 | ) | \x29 |
| 2A | * | \x2A |
| 2B | + | \x2B |
| 5E | ^ | \x5E |
| A7 | § | \xA7 |
For example, the single code \x2A matches a single character, in this case an asterisk (*). You can also use these codes to define a range. For example, [\x21-\x29] matches any one of the following characters: !"#$%&'().
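
A minimal sketch of both uses, a single hex code and a range of hex codes, is shown below. It assumes Python's `re` module as a stand-in for the TotalAgility engine, and the names `ASTERISK`, `PUNCTUATION_RANGE`, `SECTION_SIGN`, and `find_special` are hypothetical.

```python
import re

# \x2A is the hex code for an asterisk. Written this way it is a literal
# character, not the "zero or more" quantifier.
ASTERISK = re.compile(r"\x2A")

# A range built from two hex codes: every character from ! (21) to ) (29).
PUNCTUATION_RANGE = re.compile(r"[\x21-\x29]")

# \xA7 is the code for the section sign.
SECTION_SIGN = re.compile(r"\xA7")

def find_special(text):
    """Return each character in text that falls in the hex range 21 to 29."""
    return PUNCTUATION_RANGE.findall(text)

if __name__ == "__main__":
    print(ASTERISK.fullmatch("*") is not None)
    print(find_special("Total: $5 (50% off) #1!"))
    print(SECTION_SIGN.search("see § 4") is not None)
```

Using the hex code avoids having to remember which characters need a backslash escape, because a hex code is always treated as a literal character.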
How to: Manage regular expressions