Package org.apache.pdfbox.tools
Class ExtractText
java.lang.Object
org.apache.pdfbox.tools.ExtractText
This is the main program that parses a PDF document and transforms it into text.
Field Summary
Fields
Modifier and Type  Field  Description
private boolean  addFileName
private boolean  alwaysNext
private boolean  append
private boolean  debug
private String  encoding
private int  endPage
private boolean  ignoreBeads
private File  infile
private static final org.apache.commons.logging.Log  LOG
private File  outfile
private String  password
private boolean  rotationMagic
private boolean  sort
private int  startPage
private static final String  STD_ENCODING
private final PrintStream  SYSERR
private final PrintStream  SYSOUT
private boolean  toConsole
private boolean  toHTML
private boolean  toMD
Constructor Summary
Constructors
Constructor  Description
ExtractText()  Constructor.
Method Summary
Modifier and Type  Method  Description
call()  Starts the text extraction.
private Writer  createOutputWriter
private void  extractPages(int startPage, int endPage, PDFTextStripper stripper, PDDocument document, Writer output, boolean rotationMagic, boolean alwaysNext)
(package private) static int  getAngle(TextPosition text)
static void  main(String[] args)  Infamous main method.
private long  startProcessing(String message)
private void  stopProcessing(String message, long startTime)
Field Details
LOG
private static final org.apache.commons.logging.Log LOG
STD_ENCODING
private static final String STD_ENCODING
SYSOUT
private final PrintStream SYSOUT
SYSERR
private final PrintStream SYSERR
alwaysNext
private boolean alwaysNext
toConsole
private boolean toConsole
debug
private boolean debug
encoding
private String encoding
endPage
private int endPage
toHTML
private boolean toHTML
toMD
private boolean toMD
ignoreBeads
private boolean ignoreBeads
password
private String password
rotationMagic
private boolean rotationMagic
sort
private boolean sort
startPage
private int startPage
infile
private File infile
outfile
private File outfile
addFileName
private boolean addFileName
append
private boolean append
Constructor Details
ExtractText
public ExtractText()
Constructor.
Method Details
main
public static void main(String[] args)
Infamous main method.
Parameters:
args - Command line arguments; there should be exactly one, a reference to a file.
call
Starts the text extraction.
createOutputWriter
private Writer createOutputWriter
Throws:
IOException
extractPages
private void extractPages(int startPage, int endPage, PDFTextStripper stripper, PDDocument document, Writer output, boolean rotationMagic, boolean alwaysNext) throws IOException
Throws:
IOException
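startProcessing
private long startProcessing(String message)
stopProcessing
private void stopProcessing(String message, long startTime)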
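The signatures of startProcessing and stopProcessing suggest a start/stop timing pair: the first returns a long that the second receives as startTime. The following is a minimal sketch of that pattern, not the actual implementation. The class name TimingSketch, the constructor, the use of System.currentTimeMillis(), and the choice to print only when a debug flag is set are all assumptions made for illustration.

```java
import java.io.PrintStream;

// Hypothetical stand-in class; not part of org.apache.pdfbox.tools.
final class TimingSketch {
    private final PrintStream err;  // assumed destination for timing messages
    private final boolean debug;    // assumed switch that enables the messages

    TimingSketch(PrintStream err, boolean debug) {
        this.err = err;
        this.debug = debug;
    }

    // Prints the message when debugging and returns the current time in milliseconds.
    long startProcessing(String message) {
        if (debug) {
            err.println(message);
        }
        return System.currentTimeMillis();
    }

    // Prints the message followed by the elapsed time since startTime when debugging.
    void stopProcessing(String message, long startTime) {
        if (debug) {
            long elapsedMillis = System.currentTimeMillis() - startTime;
            err.println(message + elapsedMillis + " ms");
        }
    }

    public static void main(String[] args) {
        TimingSketch timing = new TimingSketch(System.err, true);
        long start = timing.startProcessing("Starting work");
        timing.stopProcessing("Time spent: ", start);
    }
}
```

Returning the start time to the caller, rather than storing it in a field, keeps the pair stateless, so several timed phases can overlap without interfering with each other.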
getAngle
static int getAngle(TextPosition text)
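The page gives only the signature of getAngle, so the sketch below shows one common way to derive a whole-degree rotation angle from a 2D transform, which is the kind of value a method with this name and return type might produce. It is an assumption, not the tool's confirmed logic. java.awt.geom.AffineTransform stands in for whatever matrix the TextPosition carries, and AngleSketch and angleOf are hypothetical names.

```java
import java.awt.geom.AffineTransform;

// Hypothetical stand-in class; not part of org.apache.pdfbox.tools.
final class AngleSketch {

    // Derives the rotation in whole degrees from the second column of the matrix:
    // atan2(shearY, scaleY) is 0 for upright text and 90 for a quarter turn.
    static int angleOf(AffineTransform m) {
        double radians = Math.atan2(m.getShearY(), m.getScaleY());
        return (int) Math.round(Math.toDegrees(radians));
    }

    public static void main(String[] args) {
        AffineTransform upright = new AffineTransform();
        AffineTransform quarterTurn = AffineTransform.getQuadrantRotateInstance(1);
        System.out.println(angleOf(upright));      // prints 0
        System.out.println(angleOf(quarterTurn));  // prints 90
    }
}
```

Rounding to an int makes the result convenient for grouping text by orientation, at the cost of discarding sub-degree skew.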