About Indexing PDF Documents

An index stores the content of many PDF files in a compact way, suited to easy search and retrieval.


Click Index under Advanced Processing > Process and choose Create Full Text Indexes from the drop-down list to build a new index or update an existing one.

You can index PDF documents written in languages that use Roman characters or Asian characters (Chinese, Japanese or Korean). You can index not only the document text, but also bookmarks, comments, attachments, digital signatures, form fields, metadata, and other custom document properties.

You can build an index from all the PDF files in a set of folders you define. Before starting, choose a folder where the index will be stored. Indexing proceeds in the background. A small index definition file is created, with the extension zpi. This file refers to the index files, which are stored in an automatically created sub-folder that has the same name as the zpi file, with the suffix _index.
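The naming convention for the index sub-folder can be sketched in Python. This is only an illustration: it assumes that "the same name as the zpi file" means the file name without its extension, and that the sub-folder sits next to the zpi file. The function name index_folder_for and the example path are hypothetical.

```python
from pathlib import Path


def index_folder_for(zpi_file: str) -> Path:
    """Return the sub-folder expected to hold the index files.

    Assumption (not confirmed by the product documentation): the
    sub-folder is named after the zpi file's base name, with the
    extension dropped and "_index" appended, in the same folder.
    """
    zpi = Path(zpi_file)
    return zpi.with_name(zpi.stem + "_index")


# Hypothetical example: an index definition file called manuals.zpi
print(index_folder_for("shared/manuals.zpi"))
```

A helper like this could be useful when copying an index to a shared location, since the zpi file and its sub-folder need to travel together.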


These search indexes are not embedded in the PDF files; to make them available to other users, you must save them to a shared location. Use a different command in the same drop-down list to create an embedded index for a single document so that it is truly portable.

Preparing for indexing

Collect all PDF documents to be indexed into one or more folders. If you just choose existing folders, be sure they include only PDF files you want indexed.
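Before choosing an existing folder, you may want to see what else it contains. The following Python sketch lists every file that is not a PDF. It is not part of the product; the function name non_pdf_files and the folder path are hypothetical, and it assumes that sub-folders should be checked as well.

```python
from pathlib import Path


def non_pdf_files(folder: str) -> list[Path]:
    """List files under `folder` whose extension is not .pdf.

    Assumption: sub-folders are searched too, and the extension
    check ignores upper/lower case (so REPORT.PDF counts as a PDF).
    """
    root = Path(folder)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() != ".pdf"
    )


if __name__ == "__main__":
    # Hypothetical folder name; replace with the folder you plan to index.
    for path in non_pdf_files("documents_to_index"):
        print(path)
```

The script only reports files; deciding which PDF files you actually want indexed remains a manual step.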

If you plan to migrate the PDF files with their index, it is better to store them in a single folder.

Add document properties to PDF documents so you can use them as search criteria.

Notes

Be aware that if you create a full text index before redacting a document to remove its sensitive information, that information is NOT removed from the index and can still be found easily. When redaction finishes, you are invited to also remove document elements. Accept the offer and remove the index. We advise performing redaction and inspection on a copy of the document, so that the original document retains its index.

Indexing hundreds of large PDF files can take considerable time and computing resources, so it is best started when you do not need the computer, for example over a lunch break.