public final class TextExtractor extends Object implements IPdfTypeExtractor
| Modifier and Type | Field and Description |
|---|---|
| com.aspose.ms.System.Collections.Generic.Dictionary<Integer,com.aspose.pdf.groupprocessor.Page> | _numberedPages |
| Constructor and Description |
|---|
| TextExtractor(): Creates a TextExtractor instance. |
| Modifier and Type | Method and Description |
|---|---|
| long | buildProperties(com.aspose.pdf.groupprocessor.ByteRange range, com.aspose.pdf.groupprocessor.PdfTreeNode parentNode): Builds a tree of nodes that contain all PDF parameters with their values. |
| long | buildProperties(com.aspose.pdf.groupprocessor.ByteRange range, com.aspose.pdf.groupprocessor.PdfTreeNode parentNode, boolean extractJustValue): Builds a tree of nodes that contain all PDF parameters with their values. |
| void | dispose(): Disposes the object. |
| String[] | extractAllText(): Extracts text from the document. |
| String[] | extractAllTextInternal() |
| String | extractPageText(int pageNumber): Extracts text from the page. |
| int | getPageCount(): Gets the number of pages in the document. |
| String | getVersion(): For internal use only. |
| void | initialize(String pdfDocumentPath, int bufferSize, boolean allowAsyncInitialization): Initializes the TextExtractor instance. |
| void | initializeAlternative(String pdfDocumentPath): Initializes the TextExtractor instance. |
| boolean | isFastExtractionUsed(): Returns true if fast extraction was used. |
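The methods listed above suggest a typical call order: initialize the extractor, read the page count, extract each page, then dispose. The following is a minimal Java sketch of that order, not the library's own sample. So that it runs without the Aspose library, it uses a hypothetical Extractor interface that copies four signatures from the table, plus an in-memory FakeExtractor. Those two names, readAllPages, and the sample path and buffer size are illustrative assumptions. The loop starts at 1 because extractPageText documents pageNumber as 1-based.

```java
import java.util.ArrayList;
import java.util.List;

class TextExtractorLifecycleSketch {

    /** Hypothetical stand-in that mirrors four TextExtractor signatures from the table. */
    interface Extractor {
        void initialize(String pdfDocumentPath, int bufferSize, boolean allowAsyncInitialization);
        int getPageCount();
        String extractPageText(int pageNumber); // pageNumber is 1-based
        void dispose();
    }

    /** Initializes, reads every page from 1 to getPageCount() inclusive, then disposes. */
    static List<String> readAllPages(Extractor extractor, String path, int bufferSize) {
        extractor.initialize(path, bufferSize, false);
        try {
            List<String> pages = new ArrayList<>();
            int pageCount = extractor.getPageCount();
            for (int pageNumber = 1; pageNumber <= pageCount; pageNumber++) {
                pages.add(extractor.extractPageText(pageNumber));
            }
            return pages;
        } finally {
            // Runs even if a page extraction throws.
            extractor.dispose();
        }
    }

    /** In-memory fake (an assumption for this sketch) so the code runs without the library. */
    static final class FakeExtractor implements Extractor {
        private final String[] pageTexts;
        boolean initialized;
        boolean disposed;

        FakeExtractor(String... pageTexts) {
            this.pageTexts = pageTexts;
        }

        @Override
        public void initialize(String pdfDocumentPath, int bufferSize, boolean allowAsyncInitialization) {
            initialized = true;
        }

        @Override
        public int getPageCount() {
            return pageTexts.length;
        }

        @Override
        public String extractPageText(int pageNumber) {
            if (!initialized) {
                throw new IllegalStateException("initialize was not called");
            }
            if (pageNumber < 1 || pageNumber > pageTexts.length) {
                throw new IllegalArgumentException("page out of range: " + pageNumber);
            }
            return pageTexts[pageNumber - 1];
        }

        @Override
        public void dispose() {
            disposed = true;
        }
    }

    public static void main(String[] args) {
        FakeExtractor fake = new FakeExtractor("first page text", "second page text");
        List<String> pages = readAllPages(fake, "sample.pdf", 1024 * 1024);
        for (int i = 0; i < pages.size(); i++) {
            System.out.println("Page " + (i + 1) + ": " + pages.get(i));
        }
    }
}
```

With the real class, a thin adapter over TextExtractor could implement the same four methods. Whether dispose() must also be called when initialize fails is not stated in this documentation, so the sketch only guards the extraction loop.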
public final com.aspose.ms.System.Collections.Generic.Dictionary<Integer,com.aspose.pdf.groupprocessor.Page> _numberedPages
public void initialize(String pdfDocumentPath, int bufferSize, boolean allowAsyncInitialization)
Initializes the TextExtractor instance.
Parameters:
pdfDocumentPath - Path to a PDF document.
bufferSize - Maximum size, in bytes, of content that can be kept in memory.
allowAsyncInitialization - Allows asynchronous initialization of resources.

public void initializeAlternative(String pdfDocumentPath)
Initializes the TextExtractor instance.
Parameters:
pdfDocumentPath - Path to a PDF document.

public long buildProperties(com.aspose.pdf.groupprocessor.ByteRange range, com.aspose.pdf.groupprocessor.PdfTreeNode parentNode)
Builds a tree of nodes that contain all PDF parameters with their values.
Parameters:
range - Byte range in which to parse parameters.
parentNode - Initial (root) node for building the tree.

public long buildProperties(com.aspose.pdf.groupprocessor.ByteRange range, com.aspose.pdf.groupprocessor.PdfTreeNode parentNode, boolean extractJustValue)
Builds a tree of nodes that contain all PDF parameters with their values.
Parameters:
range - Byte range in which to parse parameters.
parentNode - Initial (root) node for building the tree.
extractJustValue - Used for recursive calls. Indicates that the next recursive call should find the parameter value rather than the parameter itself.

public String[] extractAllText()
Extracts text from the document.
Specified by:
extractAllText in interface IDocumentTextExtractor
extractAllText in interface IPdfTypeExtractor

public String[] extractAllTextInternal()
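Because extractAllText() returns a String[] rather than a single String, callers that want one block of text have to combine the elements themselves. The sketch below shows one way to do that in plain Java. This page does not say what each array element represents (for example, one element per page), nor whether elements can be null or empty, so the helper name joinParts and its skipping of null and empty entries are assumptions made for illustration.

```java
class ExtractAllTextSketch {

    /**
     * Joins the parts of a String[] (such as the array returned by extractAllText())
     * into one string. Null and empty parts are skipped; this is a defensive
     * assumption, not documented behaviour of the library.
     */
    static String joinParts(String[] parts, String separator) {
        if (parts == null) {
            return "";
        }
        StringBuilder joined = new StringBuilder();
        for (String part : parts) {
            if (part == null || part.isEmpty()) {
                continue;
            }
            if (joined.length() > 0) {
                joined.append(separator);
            }
            joined.append(part);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the array a real extractAllText() call would return.
        String[] extracted = {"Text of the first part", "", "Text of the second part"};
        System.out.println(joinParts(extracted, System.lineSeparator()));
    }
}
```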
public String extractPageText(int pageNumber)
Extracts text from the page.
Specified by:
extractPageText in interface IDocumentPageTextExtractor
extractPageText in interface IPdfTypeExtractor
Parameters:
pageNumber - 1-based number of the page.

public int getPageCount()
Specified by:
getPageCount in interface IDocumentPageTextExtractor
getPageCount in interface IPdfTypeExtractor

public void dispose()
Specified by:
dispose in interface com.aspose.ms.System.IDisposable
dispose in interface IPdfTypeExtractor

public String getVersion()
Specified by:
getVersion in interface IPdfTypeExtractor

public boolean isFastExtractionUsed()
Specified by:
isFastExtractionUsed in interface IPdfTypeExtractor

Copyright © 2016 Aspose. All Rights Reserved.