Import Settings for Database (Text File Source) window
Use this window to configure the import settings for a Fuzzy Database that is based on a delimited text file (CSV).
- Source File
-
This group has the following settings:
- Upload file manually to the server
-
Select this setting to define a source file for the import that the Kofax Search and Matching Server cannot access. Type the file path and file name, or select the file in the Open window. Click Upload to stream the file to the server and display a table preview, where you can select the columns to use and set other import options.
You cannot define an automatic update for an uploaded file. If the source file changes, for example because records are added or columns are added or removed, you must upload it again manually.
- Server accesses file from URL/UNC location
-
Select this setting if the Kofax Search and Matching Server has the rights needed to access and read the configured file. Type the file path and file name, or select the file in the Open window. A table preview is displayed so that you can select the columns to use and set other import options.
- Column Configuration
-
This group provides a table that lists the database columns, with two check boxes for each column: Search and Filter.
Select Search if you want to include that column in database searches. Select Filter if you want to use a column for filtering. Filtering is available for remote fuzzy databases only.
To rename a column, select it and then click it a second time to make the column name editable. Databases often contain additional columns, such as internal customer IDs or contact names, that are not displayed on the document, so take this into account when you create the fuzzy database. By default, all available database columns are selected.
- Import Options
-
This group has the following settings:
- Ignore Case
-
Select this option to convert all search and lookup strings to lower case, effectively ignoring case. This is the default setting.
- First line contains caption
-
Select this setting if the first record of the input file contains the column headers. These names are used as field names in the database locator. If the database was referenced before you selected this setting, you can still select it afterward; the database entries are updated automatically. By default, this setting is selected.
- Database active after import
-
Select this setting for a newly configured database so that the database is activated automatically after the import. Note that this setting is displayed for new databases only and is hidden the next time you open the Import Settings window.
- Filtering is case sensitive
-
If selected, searches are case sensitive when filtering is used.
- Normalize half/full-width Katakana characters
-
When selected, half-width Katakana and Hangul Unicode characters are converted to their corresponding full-width characters. Similarly, full-width ASCII/Latin Unicode characters are replaced with their corresponding half-width characters, which is how these characters typically appear on Western documents.
This conversion occurs when the characters are found on a document or in a database.
This setting is cleared by default.
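The width conversion described above can be approximated with Unicode NFKC normalization. The following Python sketch is an illustration only and not the server's own implementation; the function name `normalize_width` is hypothetical, and NFKC also folds some compatibility characters beyond width variants.

```python
import unicodedata


def normalize_width(text):
    """Approximate width folding with NFKC (assumption: not the product's algorithm)."""
    # NFKC maps half-width Katakana to full-width Katakana and
    # full-width ASCII/Latin characters to their ordinary half-width forms.
    return unicodedata.normalize("NFKC", text)


print(normalize_width("ｶﾀｶﾅ"))        # half-width Katakana input
print(normalize_width("ＡＢＣ１２３"))  # full-width Latin letters and digits input
```

Applying the same function to both the document text and the database values keeps the two sides comparable, which mirrors the statement that the conversion applies in both places.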
- Handle CJK characters as single word
-
When selected, each Chinese, Japanese, or Korean character is handled as a separate word. This is because these languages do not use a delimiter, such as a space, to separate words.
This setting is cleared by default.
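As a rough illustration of per-character handling, the following Python sketch emits every CJK character as its own token while keeping other text grouped into whitespace-separated words. The function names and the Unicode ranges are assumptions chosen for the example; the ranges cover only a few common blocks and say nothing about how the server classifies characters.

```python
from typing import List


def is_cjk(ch: str) -> bool:
    """Rough CJK test over a few common blocks (assumed ranges, not exhaustive)."""
    code = ord(ch)
    return (
        0x4E00 <= code <= 0x9FFF      # CJK Unified Ideographs
        or 0x3040 <= code <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= code <= 0xD7A3   # Hangul syllables
    )


def tokenize(text: str) -> List[str]:
    """Split text into words, treating each CJK character as a separate word."""
    tokens, current = [], []
    for ch in text:
        if is_cjk(ch) or ch.isspace():
            # A CJK character or whitespace ends the word collected so far.
            if current:
                tokens.append("".join(current))
                current = []
            if not ch.isspace():
                tokens.append(ch)
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


print(tokenize("東京 Tower"))
```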
- Field delimiter
-
Type the characters that separate the import file content into individual fields. The default value is ; (semicolon).
- Tab
-
Select this check box to use a Tab as a delimiter in addition to the characters specified in the Field delimiter setting.
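To make the interaction between the Field delimiter characters and the Tab option concrete, here is a minimal Python sketch. It assumes that every character typed into the field acts as a delimiter on its own; `split_fields` and its parameters are hypothetical names, not part of the product.

```python
def split_fields(line, delimiters=";", use_tab=False):
    """Split one input line at any configured delimiter character."""
    separators = set(delimiters)
    if use_tab:
        separators.add("\t")  # models the Tab check box
    fields, current = [], []
    for ch in line:
        if ch in separators:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))  # last field has no trailing delimiter
    return fields


print(split_fields("4711;Acme Ltd\tLondon", use_tab=True))
print(split_fields("4711;Acme Ltd\tLondon", use_tab=False))
```

With the Tab option off, the tab character stays inside the second field, so the same line yields two fields instead of three.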
- Word separation characters
-
If fields in the database contain compound words, you can specify common separator characters so that each part of a compound word is searched and evaluated separately. The default value is " -," (space, hyphen, comma).
For example, using the default settings, the compound word "Diagon-Alley," is treated as two words, "diagon" and "alley", that are searched and evaluated separately.
The separation characters must correspond to the delimiter characters that are defined for OCR.
- Tab
-
Select this check box if you want to use a Tab as a word separation character in addition to the characters specified in the Word separation characters setting.
- Space
-
Select this check box if you want to use a Space as a word separation character in addition to the characters specified in the Word separation characters setting.
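The "Diagon-Alley," example can be reproduced with a short Python sketch. It assumes the defaults stated on this page (separators space, hyphen, and comma, with Ignore Case selected); `split_words` and its parameters are hypothetical names used only for illustration.

```python
def split_words(value, separators=" -,", use_tab=False, use_space=False,
                ignore_case=True):
    """Break a field value into the words that are evaluated separately."""
    seps = set(separators)
    if use_tab:
        seps.add("\t")   # models the Tab check box
    if use_space:
        seps.add(" ")    # models the Space check box
    if ignore_case:
        value = value.lower()
    words, current = [], []
    for ch in value:
        if ch in seps:
            if current:  # skip empty pieces between adjacent separators
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


print(split_words("Diagon-Alley,"))
```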
- Characters to ignore
-
Type a list of characters into this field to filter unwanted characters from the input record. If you use a field delimiter that can also occur in the input, such as a comma (,), you must enclose the input strings in quotation marks ("). However, you probably do not want to retain those quotation marks as part of the final results.
If you define the quotation marks as characters to ignore, they are removed. To define a tab or space as a character to ignore, select the corresponding check box. The default value is ."'! (period, double quotation mark, single quotation mark, and exclamation point).
- Space
-
Select this check box if you want to ignore a Space character in addition to the characters specified in the Characters to ignore setting.
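The effect of the ignore list on a single field value can be sketched in Python as follows. This is an assumption-based illustration using the default characters listed on this page; `strip_ignored` is a hypothetical helper, and the sketch does not model how the server parses quoted fields.

```python
def strip_ignored(value, ignored=".\"'!", ignore_space=False):
    """Remove every ignored character from a field value."""
    chars = ignored + (" " if ignore_space else "")  # Space check box adds " "
    # Mapping a code point to None deletes that character in str.translate.
    return value.translate({ord(c): None for c in chars})


print(strip_ignored('"Smith, John"'))
print(strip_ignored("O'Neil Inc.!"))
```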
- String Substitution
-
Use the following buttons to manage the string substitutions, which replace search text with replacement text in both the document and the database:
- Add
-
Click Add to add a new string substitution pair to the list of substitutions.
- Delete
-
Click Delete to delete the selected item in the substitution list.
- Modify
-
Click Modify to modify the selected item in the substitution list with the values from the two edit fields.
- Import
-
Click Import to import a file that contains a list of string substitutions. During the import, you can either add the imported string substitutions to your current list or replace the current list with the imported one. You can import string substitutions that were exported from a Kofax Search and Matching Server database, a Kofax Transformation Modules Project Builder database, or a Kofax TotalAgility Transformation Designer database.
- Export
-
Click Export to export your string substitution pairs to a text file. You can then import the string substitutions to be used for other Kofax Search and Matching Server or Project Builder databases.
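Because substitutions are applied to both the document text and the database values, the two sides can converge on the same spelling before they are compared. The Python sketch below illustrates that idea with an in-memory list of pairs; the function name, the sample pairs, and the in-order, case-sensitive replacement are all assumptions, and the export file format is deliberately not modeled.

```python
def apply_substitutions(text, pairs):
    """Apply each (search, replacement) pair in list order."""
    for search, replacement in pairs:
        text = text.replace(search, replacement)
    return text


# Hypothetical substitution pairs for illustration.
pairs = [("Street", "St"), ("Incorporated", "Inc")]

document_text = apply_substitutions("Main Street Incorporated", pairs)
database_text = apply_substitutions("Main St Inc", pairs)
print(document_text == database_text)
```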
- Table Records Preview
-
The Table Records Preview area provides a read-only preview of the first 20 records of the referenced table. Note that columns that are not selected for use are grayed out.
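A comparable preview can be produced outside the product, for example to check a source file before importing it. The following Python sketch reads at most 20 records from a semicolon-delimited source; `preview_records` and its parameters are hypothetical, and Python's `csv` module applies its own quoting rules, which may differ from the server's.

```python
import csv
import io
from itertools import islice


def preview_records(stream, delimiter=";", limit=20, first_line_is_caption=True):
    """Return the optional header row and up to `limit` data records."""
    reader = csv.reader(stream, delimiter=delimiter)
    header = next(reader, None) if first_line_is_caption else None
    rows = list(islice(reader, limit))  # stop reading after `limit` records
    return header, rows


# In-memory sample with one caption line and 25 records.
sample = io.StringIO("name;city\n" + "".join(f"n{i};c{i}\n" for i in range(25)))
header, rows = preview_records(sample)
print(header, len(rows))
```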
In addition to the common buttons, the following button is provided at the bottom of the window:
- Import Database
-
Creates the fuzzy database according to the defined import settings.