PdfTextDocument Class
Namespace: Atalasoft.Pdf.TextExtract
The PdfTextDocument type exposes the following members.

Constructors

Name | Description
---|---
PdfTextDocument(Stream) | Initializes a new instance of the PdfTextDocument class.
PdfTextDocument(String) | Initializes a new instance of the PdfTextDocument class.
PdfTextDocument(Stream, String) | Initializes a new instance of the PdfTextDocument class.
PdfTextDocument(String, String) | Initializes a new instance of the PdfTextDocument class.

Properties

Name | Description
---|---
ExtractionGranularity | Gets the extraction granularity provided by this document.
OutputLineEnd | Gets or sets a flag indicating whether line end symbols should be returned by PdfTextReader.
OutputPageEnd | Gets or sets a flag indicating whether a page end symbol should be returned by PdfTextReader.
PageCount | Gets the document page count.

Methods

Name | Description
---|---
Dispose | Releases all resources used by the PdfTextDocument.
Dispose(Boolean) | Releases the unmanaged resources used by the PdfTextDocument and optionally releases the managed resources.
DisposePages | Disposes all pages in the cache.
Equals | Determines whether the specified object is equal to the current object. (Inherited from Object.)
Finalize | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
GetHashCode | Serves as the default hash function. (Inherited from Object.)
GetPage | Retrieves a PdfTextPage from the document.
GetPdfTextReader | Creates a new PdfTextReader for all pages in the document.
GetPdfTextReader(Int32) | Creates a new PdfTextReader for the specified page.
GetPdfTextReader(Int32, Int32) | Creates a new PdfTextReader for the specified page range.
GetTextPage | Gets the ITextPage at the specified index.
GetType | Gets the Type of the current instance. (Inherited from Object.)
Initialize | Obsolete.
MakePages | Initializes the page cache.
MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.)
ToString | Returns a string that represents the current object. (Inherited from Object.)

C#

    using Atalasoft.Pdf.TextExtract;
    using System.Drawing;
    using System.IO;

    // Some examples of using the classes in the
    // Atalasoft.Pdf.TextExtract namespace.

    // Get the number of pages in a PDF.
    public int GetPageCount(Stream s)
    {
        using (PdfTextDocument doc = new PdfTextDocument(s))
        {
            return doc.PageCount;
        }
    }

    // Get the number of characters on a page in a PDF.
    public int GetCharCount(Stream s, int pageNum)
    {
        using (PdfTextDocument doc = new PdfTextDocument(s))
        {
            PdfTextPage textPage = doc.GetPage(pageNum);
            return textPage.CharCount;
        }
    }

    // Extract text from a page in a PDF.
    public string GetText(Stream s, int pageNum, int index, int count)
    {
        using (PdfTextDocument doc = new PdfTextDocument(s))
        {
            PdfTextPage textPage = doc.GetPage(pageNum);
            return textPage.GetText(index, count);
        }
    }

    // Find out where a character is on a page in a PDF.
    public PointF GetCharPos(Stream s, int pageNum, int index)
    {
        using (PdfTextDocument doc = new PdfTextDocument(s))
        {
            PdfTextPage textPage = doc.GetPage(pageNum);
            return textPage.CharOrigin(index);
        }
    }

Visual Basic

    Imports System.Drawing
    Imports System.IO
    Imports Atalasoft.Pdf.TextExtract

    ' Some examples of using the classes in the
    ' Atalasoft.Pdf.TextExtract namespace.

    ' Get the number of pages in a PDF.
    Public Function GetPageCount(ByVal s As Stream) As Integer
        Using doc As New PdfTextDocument(s)
            Return doc.PageCount
        End Using
    End Function

    ' Get the number of characters on a page in a PDF.
    Public Function GetCharCount(ByVal s As Stream, ByVal pageNum As Integer) As Integer
        Using doc As New PdfTextDocument(s)
            Dim textPage As PdfTextPage = doc.GetPage(pageNum)
            Return textPage.CharCount
        End Using
    End Function

    ' Extract text from a page in a PDF.
    Public Function GetText(ByVal s As Stream, ByVal pageNum As Integer, ByVal index As Integer, ByVal count As Integer) As String
        Using doc As New PdfTextDocument(s)
            Dim textPage As PdfTextPage = doc.GetPage(pageNum)
            Return textPage.GetText(index, count)
        End Using
    End Function

    ' Find out where a character is on a page in a PDF.
    Public Function GetCharPos(ByVal s As Stream, ByVal pageNum As Integer, ByVal index As Integer) As PointF
        Using doc As New PdfTextDocument(s)
            Dim textPage As PdfTextPage = doc.GetPage(pageNum)
            Return textPage.CharOrigin(index)
        End Using
    End Function