Package com.sun.pdfview
Class PDFParser
- java.lang.Object
-
- com.sun.pdfview.BaseWatchable
-
- com.sun.pdfview.PDFParser
-
- All Implemented Interfaces:
Watchable, java.lang.Runnable
public class PDFParser extends BaseWatchable
PDFParser is the class that parses a PDF content stream and produces PDFCmds for a PDFPage. You should never ever see it run: it gets created by a PDFPage only if needed, and may even run in its own thread.
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
(package private) class | PDFParser.ParserState | A class to store state needed while rendering.
(package private) class | PDFParser.Tok | a token from a PDF Stream
-
Nested classes/interfaces inherited from class com.sun.pdfview.BaseWatchable
BaseWatchable.Gate
-
-
Field Summary
Fields
Modifier and Type | Field | Description
private boolean | catchexceptions |
private int | clip |
private PDFPage | cmds | the actual command, for use within a single iteration.
static java.lang.String | DEBUG_DCTDECODE_DATA | emit a file of DCT stream data.
static int | debuglevel |
(package private) boolean | errorwritten |
private int | loc |
private java.lang.ref.WeakReference | pageRef | a weak reference to the page we render into.
private java.util.Stack<PDFParser.ParserState> | parserStates |
private java.awt.geom.GeneralPath | path |
private boolean | resend |
(package private) java.util.HashMap<java.lang.String,PDFObject> | resources |
private java.util.Stack<java.lang.Object> | stack |
private PDFParser.ParserState | state |
(package private) byte[] | stream |
private PDFParser.Tok | tok |
-
Fields inherited from interface com.sun.pdfview.Watchable
COMPLETED, ERROR, NEEDS_DATA, NOT_STARTED, PAUSED, RUNNING, STOPPED, UNKNOWN
-
-
Method Summary
Modifier and Type | Method | Description
void | cleanup() | Cleanup when iteration is done
static void | debug(java.lang.String msg, int level) |
private void | doForm(PDFObject obj) | Inject a stream of PDF commands onto the page.
private void | doImage(PDFObject obj) | Parse image data into a Java BufferedImage and add the image command to the page.
private PDFPaint | doPattern(PatternSpace patternSpace) | Set the values into a PatternSpace
private void | doShader(PDFObject shaderObj) | build a shader from a dictionary.
private void | doXObject(PDFObject obj) | Insert a PDF object into the command stream.
java.lang.String | dumpStream() |
void | dumpStreamToError() |
static void | emitDataFile(byte[] ary, java.lang.String name) | take a byte array and write a temporary file with its data.
static java.lang.String | escape(java.lang.String msg) |
private PDFObject | findResource(java.lang.String name, java.lang.String inDict) | get a property from a named dictionary in the resources of this content stream.
private PDFFont | getFontFrom(java.lang.String fontref) | get a PDFFont from the resources, given the resource name of the font.
int | iterate() | parse the stream.
private PDFParser.Tok | nextToken() | get the next token.
private PDFColorSpace | parseColorSpace(PDFObject csobj) | generate a PDFColorSpace description based on a PDFObject.
private void | parseInlineImage() | Parse an inline image.
private java.lang.Object | parseObject() | Parse the next object out of the PDF stream.
private java.lang.Object[] | popArray() | pop an array off the stack
private float | popFloat() | pop a single float value off the stack.
private float[] | popFloat(int count) | pop an array of float values off the stack.
private float[] | popFloatArray() | pop an array of float values off the stack.
private int | popInt() | pop a single integer value off the stack.
private PDFObject | popObject() | pop a PDFObject off the stack.
private java.lang.String | popString() | pop a String off the stack.
private void | processBTCmd() | abstracted command processing for BT command.
private void | processQCmd() | abstracted command processing for Q command.
private java.lang.String | readByteArray() | read a byte array from the stream.
private java.lang.String | readName() | read a name (sequence of non-PDF-delimiting characters) from the stream.
private double | readNum() | read a floating point number from the stream
private java.lang.String | readString() | read a String from the stream.
static void | setDebugLevel(int level) |
private void | setGSState(java.lang.String name) | add graphics state commands contained within a dictionary.
void | setup() | Called to prepare for some iterations
private void | throwback() | put the current token back so that it is returned again by nextToken().
-
Methods inherited from class com.sun.pdfview.BaseWatchable
execute, getStatus, go, go, go, go, isExecutable, isFinished, isSuppressSetErrorStackTrace, run, setError, setStatus, setSuppressSetErrorStackTrace, stop, waitForFinish
-
-
-
-
Field Detail
-
DEBUG_DCTDECODE_DATA
public static final java.lang.String DEBUG_DCTDECODE_DATA
emit a file of DCT stream data.
- See Also:
- Constant Field Values
-
stack
private java.util.Stack<java.lang.Object> stack
-
parserStates
private java.util.Stack<PDFParser.ParserState> parserStates
-
state
private PDFParser.ParserState state
-
path
private java.awt.geom.GeneralPath path
-
clip
private int clip
-
loc
private int loc
-
resend
private boolean resend
-
tok
private PDFParser.Tok tok
-
catchexceptions
private boolean catchexceptions
-
pageRef
private java.lang.ref.WeakReference pageRef
a weak reference to the page we render into. For the page to remain available, some other code must retain a strong reference to it.
-
cmds
private PDFPage cmds
the actual command, for use within a single iteration. Note that this must be released at the end of each iteration to ensure the page can be collected if not in use.
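The relationship between pageRef and cmds can be illustrated with a small sketch. It is not the PDFParser implementation: the class name WeakPageSketch, the methods beginIteration(), addCommand() and endIteration(), and the use of a List<String> in place of a PDFPage are all assumptions made for illustration. Only the field names pageRef and cmds come from the documentation above.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the weak/strong reference pattern described above. */
final class WeakPageSketch {
    // Weak reference: does not keep the page alive on its own.
    private final WeakReference<List<String>> pageRef;
    // Strong reference: held only while an iteration is in progress.
    private List<String> cmds;

    WeakPageSketch(List<String> page) {
        this.pageRef = new WeakReference<>(page);
    }

    /** Takes a strong reference for one iteration; returns false if the page is gone. */
    boolean beginIteration() {
        cmds = pageRef.get();
        return cmds != null;
    }

    /** Adds a command to the page; only valid between beginIteration and endIteration. */
    void addCommand(String command) {
        if (cmds == null) {
            throw new IllegalStateException("no iteration in progress");
        }
        cmds.add(command);
    }

    /** Releases the strong reference so the page can be collected if nobody else holds it. */
    void endIteration() {
        cmds = null;
    }

    public static void main(String[] args) {
        List<String> page = new ArrayList<>(); // the caller's strong reference
        WeakPageSketch sketch = new WeakPageSketch(page);
        if (sketch.beginIteration()) {
            sketch.addCommand("q");
            sketch.endIteration();
        }
        System.out.println(page);
    }
}
```

The point of the pattern is that the sketch never extends the lifetime of the page: once the caller drops its own reference, pageRef.get() may start returning null and beginIteration() reports that the page is no longer available.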
-
stream
byte[] stream
-
resources
java.util.HashMap<java.lang.String,PDFObject> resources
-
debuglevel
public static int debuglevel
-
errorwritten
boolean errorwritten
-
-
Constructor Detail
-
PDFParser
public PDFParser(PDFPage cmds, byte[] stream, java.util.HashMap<java.lang.String,PDFObject> resources)
Don't call this constructor directly. Instead, use PDFFile.getPage(int pagenum) to get a PDFPage. There should never be any reason for a user to create, access, or hold on to a PDFParser.
-
-
Method Detail
-
debug
public static void debug(java.lang.String msg, int level)
-
escape
public static java.lang.String escape(java.lang.String msg)
-
setDebugLevel
public static void setDebugLevel(int level)
-
throwback
private void throwback()
put the current token back so that it is returned again by nextToken().
-
nextToken
private PDFParser.Tok nextToken()
get the next token. TODO: this creates a new token each time. Is this strictly necessary?
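The throwback()/nextToken() pair describes one-token pushback. The following is a minimal sketch of that idea, assuming a whitespace-separated input in place of the real byte stream. The class PushbackTokenizerSketch is hypothetical; the field names tok, loc and resend are borrowed from the field summary, but how PDFParser actually uses them is not documented here.

```java
import java.util.NoSuchElementException;

/** Hypothetical one-token pushback tokenizer; not the PDFParser implementation. */
final class PushbackTokenizerSketch {
    private final String[] words;   // pre-split input, standing in for the byte stream
    private int loc = 0;            // index of the next unread word
    private String tok = null;      // most recently returned token
    private boolean resend = false; // true when tok should be returned again

    PushbackTokenizerSketch(String input) {
        String trimmed = input.trim();
        this.words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Returns the pushed-back token if there is one, otherwise reads a new token. */
    String nextToken() {
        if (resend) {
            resend = false;
            return tok;
        }
        if (loc >= words.length) {
            throw new NoSuchElementException("end of input");
        }
        tok = words[loc++];
        return tok;
    }

    /** Puts the current token back so the next nextToken() call returns it again. */
    void throwback() {
        if (tok == null) {
            throw new IllegalStateException("no token has been read yet");
        }
        resend = true;
    }

    public static void main(String[] args) {
        PushbackTokenizerSketch t = new PushbackTokenizerSketch("q 1 0 0 1 BT");
        String first = t.nextToken();
        t.throwback();
        System.out.println(first.equals(t.nextToken()));
    }
}
```

Because the pushed-back token is kept and returned as-is, a throwback() followed by nextToken() allocates nothing, which is one way to limit the per-call allocation mentioned in the TODO.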
-
readName
private java.lang.String readName()
read a name (sequence of non-PDF-delimiting characters) from the stream.
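As an illustration of "non-PDF-delimiting characters", the sketch below stops at the white-space and delimiter characters defined by the PDF specification. The class NameSketch and its method signature are assumptions, and the sketch does not decode #xx escapes inside names; whether readName() uses exactly this character set is not stated in this documentation.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical name reader; stops at PDF white-space or delimiter characters. */
final class NameSketch {
    // Delimiter characters from the PDF specification.
    private static final String DELIMITERS = "()<>[]{}/%";

    /** PDF white-space: NUL, tab, line feed, form feed, carriage return, space. */
    static boolean isPdfWhitespace(int c) {
        return c == 0 || c == 9 || c == 10 || c == 12 || c == 13 || c == 32;
    }

    /** Reads a name starting at index start, i.e. just after the leading '/'. */
    static String readName(byte[] data, int start) {
        int end = start;
        while (end < data.length) {
            int c = data[end] & 0xff;
            if (isPdfWhitespace(c) || DELIMITERS.indexOf(c) >= 0) {
                break; // the delimiter itself is not part of the name
            }
            end++;
        }
        return new String(data, start, end - start, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] data = "/Font1 12 Tf".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(readName(data, 1));
    }
}
```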
-
readNum
private double readNum()
read a floating point number from the stream
-
readString
private java.lang.String readString()
read a String from the stream. Strings begin with a '(' character, which has already been read, and end with a balanced ')' character. A '\' character starts an escape sequence of up to three octal digits.
Parentheses inside a string must be balanced, so a string may enclose balanced parentheses.
- Returns:
- the string with escape sequences replaced with their values
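The rules above (balanced parentheses, backslash escapes with up to three octal digits) can be sketched as follows. This is an illustrative reader under stated assumptions, not the PDFParser code: the class LiteralStringSketch and its signature are hypothetical, and the single-letter escapes (\n, \r, \t, \b, \f) are taken from the PDF specification rather than from this page. Line continuations are not handled.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical reader for PDF literal strings; the opening '(' is already consumed. */
final class LiteralStringSketch {

    /** Decodes the string that starts at index start and ends at the balancing ')'. */
    static String readString(byte[] data, int start) {
        StringBuilder sb = new StringBuilder();
        int depth = 0; // nesting level of unescaped parentheses inside the string
        int i = start;
        while (i < data.length) {
            int c = data[i++] & 0xff;
            if (c == ')') {
                if (depth == 0) {
                    return sb.toString(); // this ')' balances the opening '('
                }
                depth--;
                sb.append(')');
            } else if (c == '(') {
                depth++;
                sb.append('(');
            } else if (c == '\\' && i < data.length) {
                int e = data[i++] & 0xff;
                if (e >= '0' && e <= '7') {
                    int value = e - '0';
                    // consume up to two more octal digits
                    for (int n = 0; n < 2 && i < data.length
                            && data[i] >= '0' && data[i] <= '7'; n++) {
                        value = value * 8 + (data[i++] - '0');
                    }
                    sb.append((char) (value & 0xff));
                } else if (e == 'n') {
                    sb.append('\n');
                } else if (e == 'r') {
                    sb.append('\r');
                } else if (e == 't') {
                    sb.append('\t');
                } else if (e == 'b') {
                    sb.append('\b');
                } else if (e == 'f') {
                    sb.append('\f');
                } else {
                    sb.append((char) e); // covers \( \) and \\
                }
            } else {
                sb.append((char) c);
            }
        }
        throw new IllegalArgumentException("unterminated string");
    }

    public static void main(String[] args) {
        byte[] data = "a(b)c\\051\\101) Tj".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(readString(data, 0));
    }
}
```

In the main method, the inner "(b)" is kept because it is balanced, \051 decodes to ')' without affecting the nesting depth, and \101 decodes to 'A'.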
-
readByteArray
private java.lang.String readByteArray()
read a byte array from the stream. Byte arrays begin with a '<' character, which has already been read, and end with a '>' character. Each byte in the array is made up of two hex characters, the first being the high-order nibble. We translate the byte arrays into char arrays by combining two bytes into a character, and then translate the character array into a string. [JK FIXME this is probably a really bad idea!]
- Returns:
- the byte array
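A hex string of this form can be decoded as sketched below. Unlike readByteArray(), which returns a String, this hypothetical HexStringSketch returns the raw bytes, which sidesteps the char-packing step flagged in the FIXME. Skipping white-space and padding an odd final digit with 0 follow the PDF specification; whether readByteArray() does the same is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical hex-string decoder; the opening '<' is already consumed. */
final class HexStringSketch {

    /** Decodes hex digits from index start up to the closing '>'. */
    static byte[] readHexBytes(byte[] data, int start) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int high = -1; // pending high-order nibble, or -1 if none
        for (int i = start; i < data.length; i++) {
            int c = data[i] & 0xff;
            if (c == '>') {
                if (high >= 0) {
                    out.write(high << 4); // odd digit count: low nibble is 0
                }
                return out.toByteArray();
            }
            if (Character.isWhitespace(c)) {
                continue; // white-space between digits is ignored
            }
            int digit = Character.digit(c, 16);
            if (digit < 0) {
                throw new IllegalArgumentException("not a hex digit: " + (char) c);
            }
            if (high < 0) {
                high = digit;                  // first character: high-order nibble
            } else {
                out.write((high << 4) | digit); // second character completes the byte
                high = -1;
            }
        }
        throw new IllegalArgumentException("unterminated hex string");
    }

    public static void main(String[] args) {
        byte[] data = "48656C6C6F>".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(new String(readHexBytes(data, 0), StandardCharsets.ISO_8859_1));
    }
}
```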
-
setup
public void setup()
Called to prepare for some iterations
- Overrides:
setup in class BaseWatchable
-
iterate
public int iterate() throws java.lang.Exception
parse the stream. Commands are added to the PDFPage initialized in the constructor as they are encountered.
Page numbers in comments refer to the Adobe PDF specification.
Commands are listed in PDF spec 32000-1:2008 in Table A.1
- Specified by:
iterate in class BaseWatchable
- Returns:
- Watchable.RUNNING when there are commands to be processed
- Watchable.COMPLETED when the page is done and all the commands have been processed
- Watchable.STOPPED if the page we are rendering into is no longer available
- Throws:
java.lang.Exception
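The return-value contract above implies a driver loop that keeps calling iterate() while it reports RUNNING. The sketch below shows that loop in isolation. It is an assumption-laden illustration: the Status enum stands in for the Watchable constants (whose actual int values are not shown on this page), commands are plain strings, and one command is processed per call, which may not match how much work the real iterate() does per call.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical model of the iterate() status contract; not the PDFParser code. */
final class IterateSketch {
    // Stand-ins for Watchable.RUNNING, Watchable.COMPLETED and Watchable.STOPPED.
    enum Status { RUNNING, COMPLETED, STOPPED }

    private final Deque<String> pending; // commands not yet processed
    private final List<String> page;     // target page, or null if no longer available

    IterateSketch(List<String> commands, List<String> page) {
        this.pending = new ArrayDeque<>(commands);
        this.page = page;
    }

    /** Processes at most one command and reports what the caller should do next. */
    Status iterate() {
        if (page == null) {
            return Status.STOPPED;   // nothing to render into
        }
        if (pending.isEmpty()) {
            return Status.COMPLETED; // all commands have been processed
        }
        page.add(pending.poll());
        return Status.RUNNING;       // more work may remain
    }

    /** Calls iterate() until it stops reporting RUNNING and returns the final status. */
    static Status runToEnd(IterateSketch parser) {
        Status status;
        do {
            status = parser.iterate();
        } while (status == Status.RUNNING);
        return status;
    }

    public static void main(String[] args) {
        List<String> page = new ArrayList<>();
        Status end = runToEnd(new IterateSketch(Arrays.asList("q", "BT", "ET", "Q"), page));
        System.out.println(end + " " + page);
    }
}
```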
-
processQCmd
private void processQCmd()
abstracted command processing for Q command. Used directly and as part of processing of mushed QBT command.
-
processBTCmd
private void processBTCmd()
abstracted command processing for BT command. Used directly and as part of processing of mushed QBT command.
-
cleanup
public void cleanup()
Cleanup when iteration is done
- Overrides:
cleanup in class BaseWatchable
-
dumpStreamToError
public void dumpStreamToError()
-
dumpStream
public java.lang.String dumpStream()
-
emitDataFile
public static void emitDataFile(byte[] ary, java.lang.String name)
take a byte array and write a temporary file with its data. This is intended to capture data for analysis, such as the output of a decoder.
- Parameters:
ary -
name -
-
findResource
private PDFObject findResource(java.lang.String name, java.lang.String inDict) throws java.io.IOException
get a property from a named dictionary in the resources of this content stream.
- Parameters:
name - the name of the property in the dictionary
inDict - the name of the dictionary in the resources
- Returns:
- the value of the property in the dictionary
- Throws:
java.io.IOException
-
doXObject
private void doXObject(PDFObject obj) throws java.io.IOException
Insert a PDF object into the command stream. The object must either be an Image or a Form, which is a set of PDF commands in a stream.
- Parameters:
obj - the object to insert, an Image or a Form.
- Throws:
java.io.IOException
-
doImage
private void doImage(PDFObject obj) throws java.io.IOException
Parse image data into a Java BufferedImage and add the image command to the page.
- Parameters:
obj - contains the image data, and a dictionary describing the width, height and color space of the image.
- Throws:
java.io.IOException
-
doForm
private void doForm(PDFObject obj) throws java.io.IOException
Inject a stream of PDF commands onto the page. Optimized to cache a parsed stream of commands, so that each Form object only needs to be parsed once.
- Parameters:
obj - a stream containing the PDF commands, a transformation matrix, bounding box, and resources.
- Throws:
java.io.IOException
-
doPattern
private PDFPaint doPattern(PatternSpace patternSpace) throws java.io.IOException
Set the values into a PatternSpace
- Throws:
java.io.IOException
-
parseObject
private java.lang.Object parseObject() throws PDFParseException
Parse the next object out of the PDF stream. This could be a Double, a String, a HashMap (dictionary), Object[] array, or a Tok containing a PDF command.
- Throws:
PDFParseException
-
parseInlineImage
private void parseInlineImage() throws java.io.IOException
Parse an inline image. An inline image starts with BI (already read), contains a dictionary until ID, and then image data until EI.
- Throws:
java.io.IOException
-
doShader
private void doShader(PDFObject shaderObj) throws java.io.IOException
build a shader from a dictionary.
- Throws:
java.io.IOException
-
getFontFrom
private PDFFont getFontFrom(java.lang.String fontref) throws java.io.IOException
get a PDFFont from the resources, given the resource name of the font.
- Parameters:
fontref - the resource key for the font
- Throws:
java.io.IOException
-
setGSState
private void setGSState(java.lang.String name) throws java.io.IOException
add graphics state commands contained within a dictionary.
- Parameters:
name - the resource name of the graphics state dictionary
- Throws:
java.io.IOException
-
parseColorSpace
private PDFColorSpace parseColorSpace(PDFObject csobj) throws java.io.IOException
generate a PDFColorSpace description based on a PDFObject. The object could be a standard name, or the name of a resource in the ColorSpace dictionary, or a color space name with a defining dictionary or stream.
- Throws:
java.io.IOException
-
popFloat
private float popFloat() throws PDFParseException
pop a single float value off the stack.
- Returns:
- the float value of the top of the stack
- Throws:
PDFParseException - if the value on the top of the stack isn't a number
-
popFloat
private float[] popFloat(int count) throws PDFParseException
pop an array of float values off the stack. This is equivalent to filling an array from end to front by popping values off the stack.
- Parameters:
count - the number of numbers to pop off the stack
- Returns:
- an array of length count
- Throws:
PDFParseException - if any of the values popped off the stack are not numbers.
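Filling the array from end to front means the result keeps the operands in the order they were pushed. A minimal sketch of that behavior follows, assuming a java.util.Stack<Object> as in the field summary. The class PopFloatSketch is hypothetical, and it throws IllegalArgumentException where the real method throws PDFParseException.

```java
import java.util.Arrays;
import java.util.Stack;

/** Hypothetical illustration of popping count floats in push order. */
final class PopFloatSketch {

    /** Pops count numbers and returns them in the order they were pushed. */
    static float[] popFloats(Stack<Object> stack, int count) {
        float[] result = new float[count];
        // The top of the stack is the last operand, so fill from the end.
        for (int i = count - 1; i >= 0; i--) {
            Object top = stack.pop();
            if (!(top instanceof Number)) {
                throw new IllegalArgumentException("expected a number but found: " + top);
            }
            result[i] = ((Number) top).floatValue();
        }
        return result;
    }

    public static void main(String[] args) {
        Stack<Object> stack = new Stack<>();
        // Operands of a hypothetical "1 0 0 rg" sequence, pushed left to right.
        stack.push(1.0);
        stack.push(0.0);
        stack.push(0.0);
        System.out.println(Arrays.toString(popFloats(stack, 3)));
    }
}
```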
-
popInt
private int popInt() throws PDFParseException
pop a single integer value off the stack.
- Returns:
- the integer value of the top of the stack
- Throws:
PDFParseException - if the top of the stack isn't a number.
-
popFloatArray
private float[] popFloatArray() throws PDFParseException
pop an array of float values off the stack. This is equivalent to filling an array from end to front by popping values off the stack.
- Parameters:
count - the number of numbers to pop off the stack
- Returns:
- an array of length count
- Throws:
PDFParseException - if any of the values popped off the stack are not numbers.
-
popString
private java.lang.String popString() throws PDFParseException
pop a String off the stack.
- Returns:
- the String from the top of the stack
- Throws:
PDFParseException - if the top of the stack is not a NAME or STR.
-
popObject
private PDFObject popObject() throws PDFParseException
pop a PDFObject off the stack.
- Returns:
- the PDFObject from the top of the stack
- Throws:
PDFParseException - if the top of the stack does not contain a PDFObject.
-
popArray
private java.lang.Object[] popArray() throws PDFParseException
pop an array off the stack
- Returns:
- the array of objects that is the top element of the stack
- Throws:
PDFParseException - if the top element of the stack does not contain an array.
-
-