Package org.openpdf.renderer
Class PDFParser
java.lang.Object
org.openpdf.renderer.BaseWatchable
org.openpdf.renderer.PDFParser
PDFParser is the class that parses a PDF content stream and
produces PDFCmds for a PDFPage. You should never ever see it run:
it gets created by a PDFPage only if needed, and may even run in
its own thread.
-
Nested Class Summary
Nested Classes
Modifier and Type    Class    Description
(package private) class    PDFParser.ParserState    A class to store state needed while rendering.
(package private) static class    PDFParser.Tok    a token from a PDF Stream
Nested classes/interfaces inherited from class org.openpdf.renderer.BaseWatchable
BaseWatchable.Gate
-
Field Summary
Fields
Modifier and Type    Field    Description
private boolean    addAnnotation
private boolean    autoAdjustStroke
private boolean    catchexceptions
private int    clip
private PDFPage    cmds    the actual command, for use within a single iteration.
(package private) boolean    errorwritten
private boolean    fillOverprint
private int    fillOverprintMode
private int    loc
private int    mDebugCommandIndex
private final WeakReference<PDFPage>    pageRef    a weak reference to the page we render into.
private Stack<PDFParser.ParserState>    parserStates
private GeneralPath    path
private boolean    resend
private PDFParser.ParserState    state
(package private) byte[]    stream
private boolean    strokeOverprint
private int    strokeOverprintMode
private PDFParser.Tok    tok
Fields inherited from interface org.openpdf.renderer.Watchable
COMPLETED, ERROR, NEEDS_DATA, NOT_STARTED, PAUSED, RUNNING, STOPPED, UNKNOWN
-
Constructor Summary
Constructors
-
Method Summary
Modifier and Type    Method    Description
void    cleanup()    Cleanup when iteration is done
private void    doForm    Inject a stream of PDF commands onto the page.
private void    doImage    Parse image data into a Java BufferedImage and add the image command to the page.
private PDFPaint    doPattern(PatternSpace patternSpace)    Set the values into a PatternSpace
private void    doShader    build a shader from a dictionary.
private void    doXObject    Insert a PDF object into the command stream.
private PDFObject    findResource(String name, String inDict)    get a property from a named dictionary in the resources of this content stream.
private PDFFont    getFontFrom(String fontref)    get a PDFFont from the resources, given the resource name of the font.
int    iterate()    parse the stream.
private PDFParser.Tok    nextToken    get the next token.
private void    onNextObject
private PDFColorSpace    parseColorSpace(PDFObject csobj)    generate a PDFColorSpace description based on a PDFObject.
private void    parseInlineImage    Parse an inline image.
private Object    parseObject    Parse the next object out of the PDF stream.
private Object[]    popArray()    pop an array off the stack
private float    popFloat()    pop a single float value off the stack.
private float[]    popFloat(int count)    pop an array of float values off the stack.
private float[]    popFloatArray    pop an array of float values off the stack.
private int    popInt()    pop a single integer value off the stack.
private PDFObject    popObject    pop a PDFObject off the stack.
private String    popString    pop a String off the stack.
private void    processBTCmd()    abstracted command processing for BT command.
private void    processQCmd()    abstracted command processing for Q command.
private String    readByteArray    read a byte array from the stream.
private String    readName()    read a name (sequence of non-PDF-delimiting characters) from the stream.
private double    readNum()    read a floating point number from the stream
private String    readString    read a String from the stream.
private void    setGSState(String name)    add graphics state commands contained within a dictionary.
protected void    setStatus(int status)    Set the status of this watchable
void    setup()    Called to prepare for some iterations
private void    tryClosingPath()    Try to close a path but don't fail with exception if this is not working.
Methods inherited from class org.openpdf.renderer.BaseWatchable
execute, getErrorHandler, getException, getStatus, go, go, go, go, isExecutable, isFinished, isSuppressSetErrorStackTrace, run, setError, setErrorHandler, setSuppressSetErrorStackTrace, stop, waitForFinish
-
Field Details
-
mDebugCommandIndex
private int mDebugCommandIndex
-
stack
-
parserStates
private Stack<PDFParser.ParserState> parserStates
-
state
private PDFParser.ParserState state
-
path
private GeneralPath path
-
clip
private int clip
-
loc
private int loc
-
resend
private boolean resend
-
tok
private PDFParser.Tok tok
-
catchexceptions
private boolean catchexceptions
-
pageRef
private final WeakReference<PDFPage> pageRef
a weak reference to the page we render into. For the page to remain available, some other code must retain a strong reference to it.
-
cmds
private PDFPage cmds
the actual command, for use within a single iteration. Note that this must be released at the end of each iteration to ensure the page can be collected if not in use.
-
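The relationship between pageRef and cmds can be illustrated with a small sketch. It is not the PDFParser source: the class WeakTargetSketch, its method bodies, and the use of a StringBuilder as a stand-in for the page are all assumptions made for the example. It only shows the general pattern of holding a target weakly between iterations and strongly while working on it.

```java
import java.lang.ref.WeakReference;

// Hypothetical stand-in for a parser that holds its target weakly (not the PDFParser source).
class WeakTargetSketch {
    // Weak reference: on its own it does not keep the target alive.
    private final WeakReference<StringBuilder> targetRef;
    // Strong reference, held only while a batch of iterations is in progress.
    private StringBuilder target;

    WeakTargetSketch(StringBuilder page) {
        this.targetRef = new WeakReference<>(page);
    }

    // Take a strong reference for the coming iterations; returns false if the target is gone.
    boolean setup() {
        target = targetRef.get();
        return target != null;
    }

    // One unit of work against the strongly held target.
    void iterate(String command) {
        target.append(command).append('\n');
    }

    // Release the strong reference so the target can be collected if nobody else holds it.
    void cleanup() {
        target = null;
    }

    boolean holdsStrongReference() {
        return target != null;
    }

    public static void main(String[] args) {
        StringBuilder page = new StringBuilder(); // the caller's strong reference
        WeakTargetSketch parser = new WeakTargetSketch(page);
        if (parser.setup()) {
            parser.iterate("q");
            parser.iterate("Q");
            parser.cleanup();
        }
        System.out.print(page);
    }
}
```

Releasing the strong field in cleanup() is what lets the garbage collector reclaim the target once the caller drops its own reference.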
stream
byte[] stream
-
resources
-
errorwritten
boolean errorwritten
-
autoAdjustStroke
private boolean autoAdjustStroke
-
strokeOverprint
private boolean strokeOverprint
-
strokeOverprintMode
private int strokeOverprintMode
-
fillOverprint
private boolean fillOverprint
-
fillOverprintMode
private int fillOverprintMode
-
addAnnotation
private boolean addAnnotation
-
-
Constructor Details
-
PDFParser
Don't call this constructor directly. Instead, use PDFFile.getPage(int pagenum) to get a PDFPage. There should never be any reason for a user to create, access, or hold on to a PDFParser.
-
-
Method Details
-
nextToken
get the next token.
-
readName
read a name (sequence of non-PDF-delimiting characters) from the stream.
-
readNum
private double readNum()
read a floating point number from the stream
-
readString
read a String from the stream. Strings begin with a '(' character, which has already been read, and end with a balanced ')' character. A '\' character starts an escape sequence of up to three octal digits.
Parentheses inside a string must be balanced, so a string may itself enclose balanced pairs of parentheses.
- Returns:
- the string with escape sequences replaced with their values
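
The rules above can be sketched as a small standalone reader. This is an illustrative sketch, not the PDFParser implementation: the class ReadStringSketch and its signature are hypothetical, and treating any non-octal escaped character as a literal is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating balanced parentheses and octal escapes (not the PDFParser source).
class ReadStringSketch {
    // Reads a literal string from data starting at pos, the position just after the opening '('.
    static String readString(byte[] data, int pos) {
        StringBuilder out = new StringBuilder();
        int depth = 0; // number of unmatched '(' seen inside the string
        while (pos < data.length) {
            int c = data[pos++] & 0xff;
            if (c == ')') {
                if (depth == 0) {
                    break; // this ')' closes the string itself
                }
                depth--;
                out.append(')');
            } else if (c == '(') {
                depth++;
                out.append('(');
            } else if (c == '\\' && pos < data.length) {
                int next = data[pos] & 0xff;
                if (next >= '0' && next <= '7') {
                    // Escape sequence of one to three octal digits.
                    int value = 0;
                    int digits = 0;
                    while (digits < 3 && pos < data.length
                            && data[pos] >= '0' && data[pos] <= '7') {
                        value = value * 8 + (data[pos++] - '0');
                        digits++;
                    }
                    out.append((char) (value & 0xff));
                } else {
                    // Assumption: any other escaped character is taken literally.
                    out.append((char) next);
                    pos++;
                }
            } else {
                out.append((char) c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] data = "a(b)c\\101\\)) tail".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(readString(data, 0));
    }
}
```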
-
readByteArray
read a byte array from the stream. Byte arrays begin with a '<' character, which has already been read, and end with a '>' character. Each byte in the array is made up of two hex characters, the first being the high-order four bits. We translate the byte arrays into char arrays by combining two bytes into a character, and then translate the character array into a string. [JK FIXME this is probably a really bad idea!]
- Returns:
- the byte array
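
The hex decoding step can be sketched as follows. This is a minimal illustration rather than the PDFParser code: the class HexStringSketch is hypothetical and it returns raw bytes instead of a String. Skipping non-hex characters and padding an odd trailing digit with 0 follow the PDF specification's rules for hexadecimal strings; whether PDFParser does the same is not stated here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that decodes a PDF hexadecimal string (not the PDFParser source).
class HexStringSketch {
    // Decodes hex digits from data starting at pos, the position just after '<', up to '>'.
    static byte[] readHexString(byte[] data, int pos) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int high = -1; // pending high-order four bits, or -1 if none
        while (pos < data.length) {
            int c = data[pos++] & 0xff;
            if (c == '>') {
                break;
            }
            int nibble = Character.digit(c, 16);
            if (nibble < 0) {
                continue; // skip whitespace and any other non-hex character
            }
            if (high < 0) {
                high = nibble; // first hex character: high-order four bits
            } else {
                out.write((high << 4) | nibble); // second: low-order four bits
                high = -1;
            }
        }
        if (high >= 0) {
            out.write(high << 4); // odd digit count: missing final digit is treated as 0
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = "48 65 6C6c6F>".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(new String(readHexString(data, 0), StandardCharsets.ISO_8859_1));
    }
}
```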
-
setup
public void setup()
Called to prepare for some iterations
- Overrides:
- setup in class BaseWatchable
-
iterate
parse the stream. Commands are added to the PDFPage initialized in the constructor as they are encountered.
Page numbers in comments refer to the Adobe PDF specification.
Commands are listed in PDF spec 32000-1:2008 in Table A.1.
- Specified by:
- iterate in class BaseWatchable
- Returns:
- Watchable.RUNNING when there are commands to be processed
- Watchable.COMPLETED when the page is done and all the commands have been processed
- Watchable.STOPPED if the page we are rendering into is no longer available
- Throws:
- Exception
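
The return values above suggest a driver loop that keeps calling iterate() for as long as it reports RUNNING. The sketch below is an assumption about how such a loop could look, not the BaseWatchable implementation: the Step interface, the drive method, and the numeric values of the status constants are all hypothetical, and the checked exception is omitted for brevity.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical driver loop for an iterate()-style task (not the BaseWatchable source).
class IterateLoopSketch {
    // Names mirror the Watchable constants; the numeric values are placeholders.
    static final int RUNNING = 1;
    static final int COMPLETED = 2;
    static final int STOPPED = 3; // would be returned if the render target were gone

    interface Step {
        void setup();
        int iterate();
        void cleanup();
    }

    // Calls setup once, iterates while the step reports RUNNING, then always cleans up.
    static int drive(Step step) {
        step.setup();
        int status = RUNNING;
        try {
            while (status == RUNNING) {
                status = step.iterate();
            }
        } finally {
            step.cleanup();
        }
        return status;
    }

    // A toy step that consumes one queued command per iteration.
    static class QueueStep implements Step {
        private final Deque<String> pending = new ArrayDeque<>();
        final StringBuilder processed = new StringBuilder();

        QueueStep(List<String> commands) {
            pending.addAll(commands);
        }

        @Override
        public void setup() { }

        @Override
        public int iterate() {
            String cmd = pending.poll();
            if (cmd == null) {
                return COMPLETED; // nothing left to process
            }
            processed.append(cmd).append(' ');
            return RUNNING; // more commands may follow
        }

        @Override
        public void cleanup() { }
    }

    public static void main(String[] args) {
        QueueStep step = new QueueStep(Arrays.asList("q", "BT", "ET", "Q"));
        int status = drive(step);
        System.out.println((status == COMPLETED ? "completed: " : "status " + status + ": ") + step.processed);
    }
}
```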
-
tryClosingPath
private void tryClosingPath()
Try to close a path but don't fail with exception if this is not working. This is just a workaround for some PDFs with wrong content...
-
onNextObject
- Throws:
PDFDebugger.DebugStopException
-
processQCmd
private void processQCmd()
abstracted command processing for Q command. Used directly and as part of processing of mushed QBT command.
-
processBTCmd
private void processBTCmd()
abstracted command processing for BT command. Used directly and as part of processing of mushed QBT command.
-
cleanup
public void cleanup()
Cleanup when iteration is done
- Overrides:
- cleanup in class BaseWatchable
-
findResource
get a property from a named dictionary in the resources of this content stream.
- Parameters:
- name - the name of the property in the dictionary
- inDict - the name of the dictionary in the resources
- Returns:
- the value of the property in the dictionary
- Throws:
- IOException
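
The two-level lookup described here (first the named dictionary, then the property inside it) can be sketched with plain maps. The class ResourceLookupSketch, the use of java.util.Map in place of PDFObject dictionaries, and returning null for a missing entry are all assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical two-level resource lookup using plain maps (not the PDFParser source).
class ResourceLookupSketch {
    // Looks up 'name' inside the dictionary called 'inDict' within 'resources'.
    // Assumption: a missing dictionary or property yields null.
    static Object findResource(Map<String, Map<String, Object>> resources,
                               String name, String inDict) {
        Map<String, Object> dict = resources.get(inDict);
        if (dict == null) {
            return null;
        }
        return dict.get(name);
    }

    public static void main(String[] args) {
        Map<String, Object> fonts = new HashMap<>();
        fonts.put("F1", "Helvetica");
        Map<String, Map<String, Object>> resources = new HashMap<>();
        resources.put("Font", fonts);
        System.out.println(findResource(resources, "F1", "Font"));
    }
}
```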
-
doXObject
Insert a PDF object into the command stream. The object must either be an Image or a Form, which is a set of PDF commands in a stream.
- Parameters:
- obj - the object to insert, an Image or a Form.
- Throws:
- IOException
-
doImage
Parse image data into a Java BufferedImage and add the image command to the page.
- Parameters:
- obj - contains the image data, and a dictionary describing the width, height and color space of the image.
- Throws:
- IOException
-
doForm
Inject a stream of PDF commands onto the page. Optimized to cache a parsed stream of commands, so that each Form object only needs to be parsed once.
- Parameters:
- obj - a stream containing the PDF commands, a transformation matrix, bounding box, and resources.
- Throws:
- IOException
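
The parse-once behaviour can be sketched with a simple cache. This is only an illustration of the idea: the class FormCacheSketch, the identity-keyed map, and the whitespace tokenizer standing in for real parsing are assumptions, and the actual caching mechanism used by doForm may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parse-once cache for form streams (not the PDFParser source).
class FormCacheSketch {
    // Keyed by object identity, so the same form stream instance is parsed only once.
    private final Map<byte[], List<String>> cache = new IdentityHashMap<>();
    int parseCount = 0; // number of real parses, exposed for demonstration

    // Returns the cached command list, parsing the stream only on first use.
    List<String> commandsFor(byte[] formStream) {
        return cache.computeIfAbsent(formStream, this::parse);
    }

    // Stand-in for real parsing: splits the stream on whitespace.
    private List<String> parse(byte[] formStream) {
        parseCount++;
        List<String> cmds = new ArrayList<>();
        String text = new String(formStream, StandardCharsets.ISO_8859_1).trim();
        for (String token : text.split("\\s+")) {
            cmds.add(token);
        }
        return cmds;
    }

    public static void main(String[] args) {
        FormCacheSketch sketch = new FormCacheSketch();
        byte[] form = "q 1 0 0 1 10 10 cm Q".getBytes(StandardCharsets.ISO_8859_1);
        sketch.commandsFor(form);
        sketch.commandsFor(form); // served from the cache
        System.out.println("parses: " + sketch.parseCount);
    }
}
```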
-
doPattern
Set the values into a PatternSpace
- Throws:
- IOException
-
parseObject
Parse the next object out of the PDF stream. This could be a Double, a String, a HashMap (dictionary), Object[] array, or a Tok containing a PDF command.
-
parseInlineImage
Parse an inline image. An inline image starts with BI (already read), contains a dictionary until ID, and then image data until EI.
-
doShader
build a shader from a dictionary.
- Throws:
- IOException
-
getFontFrom
get a PDFFont from the resources, given the resource name of the font.
- Parameters:
- fontref - the resource key for the font
- Throws:
- IOException
-
setGSState
add graphics state commands contained within a dictionary.
- Parameters:
- name - the resource name of the graphics state dictionary
- Throws:
- IOException
-
parseColorSpace
generate a PDFColorSpace description based on a PDFObject. The object could be a standard name, or the name of a resource in the ColorSpace dictionary, or a color space name with a defining dictionary or stream.
- Throws:
- IOException
-
popFloat
pop a single float value off the stack.
- Returns:
- the float value of the top of the stack
- Throws:
- PDFParseException - if the value on the top of the stack isn't a number
-
popFloat
pop an array of float values off the stack. This is equivalent to filling an array from end to front by popping values off the stack.
- Parameters:
- count - the number of numbers to pop off the stack
- Returns:
- an array of length count
- Throws:
- PDFParseException - if any of the values popped off the stack are not numbers.
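
Filling the array from end to front means the result keeps the order in which the operands were pushed. A minimal sketch, assuming a java.util.Deque as the operand stack and IllegalArgumentException as a stand-in for PDFParseException (the class PopFloatSketch is hypothetical):

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical operand-stack helper (not the PDFParser source).
class PopFloatSketch {
    // Pops 'count' numbers and returns them in the order they were pushed.
    static float[] popFloats(Deque<Object> stack, int count) {
        float[] out = new float[count];
        // Fill from the last slot to the first: the top of the stack is the last operand.
        for (int i = count - 1; i >= 0; i--) {
            Object top = stack.pop();
            if (!(top instanceof Number)) {
                // Stand-in for PDFParseException in this sketch.
                throw new IllegalArgumentException("Expected a number, got: " + top);
            }
            out[i] = ((Number) top).floatValue();
        }
        return out;
    }

    public static void main(String[] args) {
        Deque<Object> stack = new ArrayDeque<>();
        stack.push(1.0);
        stack.push(2.0);
        stack.push(3.0);
        stack.push(4);
        System.out.println(Arrays.toString(popFloats(stack, 3)));
    }
}
```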
-
popInt
pop a single integer value off the stack.
- Returns:
- the integer value of the top of the stack
- Throws:
- PDFParseException - if the top of the stack isn't a number.
-
popFloatArray
pop an array of float values off the stack. This is equivalent to filling an array from end to front by popping values off the stack.
- Parameters:
- count - the number of numbers to pop off the stack
- Returns:
- an array of length count
- Throws:
- PDFParseException - if any of the values popped off the stack are not numbers.
-
popString
pop a String off the stack.
- Returns:
- the String from the top of the stack
- Throws:
- PDFParseException - if the top of the stack is not a NAME or STR.
-
popObject
pop a PDFObject off the stack.
- Returns:
- the PDFObject from the top of the stack
- Throws:
- PDFParseException - if the top of the stack does not contain a PDFObject.
-
popArray
pop an array off the stack
- Returns:
- the array of objects that is the top element of the stack
- Throws:
- PDFParseException - if the top element of the stack does not contain an array.
-
setStatus
protected void setStatus(int status)
Description copied from class: BaseWatchable
Set the status of this watchable
- Overrides:
- setStatus in class BaseWatchable
-