Convert HTML to Text
This data converter converts HTML input to plain text and structures the output similarly to how it would appear in a browser.
Properties
The Convert HTML to Text data converter can be configured using the following properties:
- Include Aligned Tables and Images
-
Specifies that tables and images aligned to the left or right of the text are included in the output text. Disabling this option can sometimes remove content that you want to keep.
- Include URLs
-
Specifies that the actual URLs in link tags will be included in the output text.
- Include Image Text Alternatives
-
Specifies that the text representation of images will be included in the output text.
- Include Form Fields
-
Specifies that the text representation of form fields will be included in the output text.
- Insert This Before a Heading
-
Specifies that this data converter should guess at the location of headings and insert the specified text before them.
- Insert This After a Heading
-
Specifies that this data converter should guess at the location of headings and insert the specified text after them.
- Keep Ampersand Encodings
-
Specifies that ampersand encodings will not be decoded. Text in script and style sheet elements will be respected.
- Description
-
Type in a description to be shown in the list of data converters. If you do not type in a description, one will be generated.
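
To make the effect of some of these properties concrete, the following is a minimal Python sketch that uses the standard library's html.parser module. It is not the data converter's actual implementation. The class HtmlToText, the function html_to_text, and the option names include_urls, include_image_alt, before_heading, after_heading, and keep_ampersand_encodings are hypothetical names chosen to mirror the properties above. The sketch treats only h1 to h6 tags as headings, whereas the data converter guesses at heading locations, and it simply skips text inside script and style elements.

```python
from html.parser import HTMLParser


class HtmlToText(HTMLParser):
    """Hypothetical sketch of an HTML-to-text converter (not the product's code)."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
    SKIPPED = {"script", "style"}

    def __init__(self, include_urls=False, include_image_alt=False,
                 before_heading="", after_heading="",
                 keep_ampersand_encodings=False):
        # With convert_charrefs=False the parser reports entities separately,
        # which lets us write them back out undecoded.
        super().__init__(convert_charrefs=not keep_ampersand_encodings)
        self.include_urls = include_urls
        self.include_image_alt = include_image_alt
        self.before_heading = before_heading
        self.after_heading = after_heading
        self.parts = []
        self._hrefs = []   # stack of href values for open <a> tags
        self._skip = 0     # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.SKIPPED:
            self._skip += 1
        elif tag in self.HEADINGS:
            self.parts.append(self.before_heading)
        elif tag == "a":
            self._hrefs.append(attrs.get("href"))
        elif tag == "img" and self.include_image_alt and attrs.get("alt"):
            self.parts.append(attrs["alt"])
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self._skip = max(0, self._skip - 1)
        elif tag in self.HEADINGS:
            self.parts.append(self.after_heading + "\n")
        elif tag == "a" and self._hrefs:
            href = self._hrefs.pop()
            if self.include_urls and href:
                self.parts.append(f" ({href})")
        elif tag == "p":
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)

    # The next two methods are only called when convert_charrefs is False,
    # i.e. when ampersand encodings are kept.
    def handle_entityref(self, name):
        if not self._skip:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if not self._skip:
            self.parts.append(f"&#{name};")


def html_to_text(html, **options):
    parser = HtmlToText(**options)
    parser.feed(html)
    parser.close()  # flush any buffered text
    # Collapse whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


sample = ('<h1>Title</h1><script>var a = 1;</script>'
          '<p>Tom &amp; Jerry <a href="http://example.com">link</a></p>')
print(html_to_text(sample, include_urls=True, before_heading="== "))
```

In this sketch, passing keep_ampersand_encodings=True leaves "&amp;" in the output instead of decoding it to "&", and passing include_urls=True appends the link target in parentheses after the link text.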