Package com.itextpdf.io.source
Class PdfTokenizer
- java.lang.Object
-
- com.itextpdf.io.source.PdfTokenizer
-
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable
public class PdfTokenizer extends java.lang.Object implements java.io.Closeable
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
static class PdfTokenizer.TokenType
-
Field Summary
Fields (Modifier and Type, Field, Description)
private boolean closeStream - Streams are closed automatically.
private static boolean[] delims
static byte[] F
static byte[] False
private RandomAccessFileOrArray file
protected int generation
protected boolean hexString
static byte[] N
static byte[] Null
static byte[] Obj
protected ByteBuffer outBuf
static byte[] R
protected int reference
static byte[] Startxref
static byte[] Stream
static byte[] Trailer
static byte[] True
protected PdfTokenizer.TokenType type
static byte[] Xref
-
Constructor Summary
Constructors (Constructor, Description)
PdfTokenizer(RandomAccessFileOrArray file) - Creates a PdfTokenizer for the specified RandomAccessFileOrArray.
-
Method Summary
Methods (Modifier and Type, Method, Description)
void backOnePosition(int ch)
void checkFdfHeader()
static int[] checkObjectStart(PdfTokenizer lineTokenizer) - Checks whether a line starts with an object declaration.
java.lang.String checkPdfHeader()
static boolean checkTrailer(ByteBuffer line) - Checks whether line equals 'trailer'.
void close()
static byte[] decodeStringContent(byte[] content, boolean hexWriting) - Resolves escape symbols or hexadecimal symbols.
protected static byte[] decodeStringContent(byte[] content, int from, int to, boolean hexWriting) - Resolves escape symbols or hexadecimal symbols.
byte[] getByteContent()
byte[] getDecodedStringContent()
int getGenNr()
int getHeaderOffset()
int getIntValue()
long getLongValue()
long getNextEof() - Gets the next %%EOF marker in the current PDF file.
int getObjNr()
long getPosition()
RandomAccessFileOrArray getSafeFile()
long getStartxref()
java.lang.String getStringValue()
PdfTokenizer.TokenType getTokenType()
boolean isCloseStream()
protected static boolean isDelimiter(int ch)
protected static boolean isDelimiterWhitespace(int ch)
boolean isHexString()
static boolean isWhitespace(int ch) - Checks whether a character is whitespace. Currently checks for the following character codes: 0, 9, 10, 12, 13, 32.
protected static boolean isWhitespace(int ch, boolean isWhitespace) - Checks whether a character is whitespace.
long length()
boolean nextToken()
void nextValidToken()
int peek() - Gets the next byte of the PDF source without moving the source position.
int peek(byte[] buffer) - Gets the next buffer.length bytes of the PDF source without moving the source position.
int read()
void readFully(byte[] bytes)
boolean readLineSegment(ByteBuffer buffer) - Reads data into the provided ByteBuffer.
boolean readLineSegment(ByteBuffer buffer, boolean isNullWhitespace) - Reads data into the provided ByteBuffer.
java.lang.String readString(int size)
void seek(long pos)
void setCloseStream(boolean closeStream)
void throwError(java.lang.String error, java.lang.Object... messageParams) - Helper method to handle content errors.
boolean tokenValueEqualsTo(byte[] cmp)
-
-
-
Field Detail
-
Obj
public static final byte[] Obj
-
R
public static final byte[] R
-
Xref
public static final byte[] Xref
-
Startxref
public static final byte[] Startxref
-
Stream
public static final byte[] Stream
-
Trailer
public static final byte[] Trailer
-
N
public static final byte[] N
-
F
public static final byte[] F
-
Null
public static final byte[] Null
-
True
public static final byte[] True
-
False
public static final byte[] False
-
type
protected PdfTokenizer.TokenType type
-
reference
protected int reference
-
generation
protected int generation
-
hexString
protected boolean hexString
-
outBuf
protected ByteBuffer outBuf
-
file
private final RandomAccessFileOrArray file
-
closeStream
private boolean closeStream
Streams are closed automatically.
-
delims
private static final boolean[] delims
-
-
Constructor Detail
-
PdfTokenizer
public PdfTokenizer(RandomAccessFileOrArray file)
Creates a PdfTokenizer for the specified RandomAccessFileOrArray. The beginning of the file is read to determine the location of the header, and the data source is adjusted as necessary to account for any junk that occurs in the byte source before the header.
Parameters:
file - the source
-
-
Method Detail
-
seek
public void seek(long pos)
-
readFully
public void readFully(byte[] bytes) throws java.io.IOException
Throws:
java.io.IOException
-
getPosition
public long getPosition()
-
close
public void close() throws java.io.IOException
Specified by:
close in interface java.lang.AutoCloseable
Specified by:
close in interface java.io.Closeable
Throws:
java.io.IOException
-
length
public long length()
-
read
public int read() throws java.io.IOException
Throws:
java.io.IOException
-
peek
public int peek() throws java.io.IOException
Gets the next byte of the PDF source without moving the source position.
Returns:
the byte, or -1 if EOF is reached
Throws:
java.io.IOException - in case of any reading error.
-
peek
public int peek(byte[] buffer) throws java.io.IOException
Gets the next buffer.length bytes of the PDF source without moving the source position.
Parameters:
buffer - buffer to store read bytes
Returns:
the number of bytes read. If it is less than buffer.length, EOF has been reached.
Throws:
java.io.IOException - in case of any reading error.
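The peek contract described above (look at upcoming bytes, leave the position unchanged) can be illustrated with a small stand-alone sketch. PeekSketch is a hypothetical class over an in-memory byte array, not the library's implementation, which reads from a RandomAccessFileOrArray.

```java
// Hypothetical illustration of peek semantics over an in-memory byte array.
public class PeekSketch {
    private final byte[] data;
    private int pos;

    public PeekSketch(byte[] data) {
        this.data = data;
    }

    // Consumes and returns the next byte as an unsigned value, or -1 at EOF.
    public int read() {
        return pos < data.length ? (data[pos++] & 0xFF) : -1;
    }

    // Returns the next byte without advancing the position, or -1 at EOF.
    public int peek() {
        return pos < data.length ? (data[pos] & 0xFF) : -1;
    }

    // Copies up to buffer.length upcoming bytes without advancing the position.
    // A return value smaller than buffer.length means EOF was reached.
    public int peek(byte[] buffer) {
        int n = Math.min(buffer.length, data.length - pos);
        System.arraycopy(data, pos, buffer, 0, n);
        return n;
    }

    public int getPosition() {
        return pos;
    }
}
```

Calling peek() twice in a row returns the same byte, whereas read() advances by one byte per call.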
-
readString
public java.lang.String readString(int size) throws java.io.IOException
Throws:
java.io.IOException
-
getTokenType
public PdfTokenizer.TokenType getTokenType()
-
getByteContent
public byte[] getByteContent()
-
getStringValue
public java.lang.String getStringValue()
-
getDecodedStringContent
public byte[] getDecodedStringContent()
-
tokenValueEqualsTo
public boolean tokenValueEqualsTo(byte[] cmp)
-
getObjNr
public int getObjNr()
-
getGenNr
public int getGenNr()
-
backOnePosition
public void backOnePosition(int ch)
-
getHeaderOffset
public int getHeaderOffset() throws java.io.IOException
Throws:
java.io.IOException
-
checkPdfHeader
public java.lang.String checkPdfHeader() throws java.io.IOException
Throws:
java.io.IOException
-
checkFdfHeader
public void checkFdfHeader() throws java.io.IOException
Throws:
java.io.IOException
-
getStartxref
public long getStartxref() throws java.io.IOException
Throws:
java.io.IOException
-
getNextEof
public long getNextEof() throws java.io.IOException
Gets the next %%EOF marker in the current PDF file.
Returns:
the position of the next %%EOF marker
Throws:
java.io.IOException - in case of input-output related exceptions during PDF document reading
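As a rough illustration of scanning for the %%EOF marker, the following stand-alone sketch searches a byte array. EofScanSketch and findNextEof are hypothetical names, and the sketch assumes the reported position is the index where the marker starts; the offset the library actually returns may be defined differently.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration: locate the next "%%EOF" marker in a byte array.
public class EofScanSketch {
    private static final byte[] MARKER = "%%EOF".getBytes(StandardCharsets.US_ASCII);

    // Returns the index at which the next marker starts at or after 'from',
    // or -1 if no marker is found. (Assumption: start-of-marker indexing.)
    public static long findNextEof(byte[] data, int from) {
        for (int i = Math.max(from, 0); i + MARKER.length <= data.length; i++) {
            int j = 0;
            while (j < MARKER.length && data[i + j] == MARKER[j]) {
                j++;
            }
            if (j == MARKER.length) {
                return i;
            }
        }
        return -1;
    }
}
```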
-
nextValidToken
public void nextValidToken() throws java.io.IOException
Throws:
java.io.IOException
-
nextToken
public boolean nextToken() throws java.io.IOException
Throws:
java.io.IOException
-
getLongValue
public long getLongValue()
-
getIntValue
public int getIntValue()
-
isHexString
public boolean isHexString()
-
isCloseStream
public boolean isCloseStream()
-
setCloseStream
public void setCloseStream(boolean closeStream)
-
getSafeFile
public RandomAccessFileOrArray getSafeFile()
-
decodeStringContent
protected static byte[] decodeStringContent(byte[] content, int from, int to, boolean hexWriting)
Resolves escape symbols or hexadecimal symbols.
NOTE: According to PDF Reference 1.7, section 3.2.3, string values contain ASCII characters, so they can be converted directly to a byte array.
Parameters:
content - string bytes to be decoded
from - given start index
to - given end index
hexWriting - true if the given string is hex-encoded, e.g. '<69546578…>'. False otherwise, e.g. '((iText( some version)…)'
Returns:
byte[] for decrypting or for creating a String.
-
decodeStringContent
public static byte[] decodeStringContent(byte[] content, boolean hexWriting)
Resolves escape symbols or hexadecimal symbols.
NOTE: According to PDF Reference 1.7, section 3.2.3, string values contain ASCII characters, so they can be converted directly to a byte array.
Parameters:
content - string bytes to be decoded
hexWriting - true if the given string is hex-encoded, e.g. '<69546578…>'. False otherwise, e.g. '((iText( some version)…)'
Returns:
byte[] for decrypting or for creating a String.
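To make the hex-encoded case concrete, here is a minimal stand-alone sketch of decoding a hexadecimal PDF string. HexStringSketch and decodeHex are hypothetical names and this is not the library's code. The sketch leniently skips every non-hex byte (including the angle brackets and whitespace) and pads a missing final digit with 0, as the PDF specification prescribes for hexadecimal strings.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of hexadecimal string decoding.
public class HexStringSketch {

    public static byte[] decodeHex(byte[] content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int high = -1; // pending high nibble, or -1 if none
        for (byte b : content) {
            int digit = Character.digit((char) (b & 0xFF), 16);
            if (digit < 0) {
                continue; // skip '<', '>', whitespace and any other non-hex byte
            }
            if (high < 0) {
                high = digit;
            } else {
                out.write((high << 4) | digit);
                high = -1;
            }
        }
        if (high >= 0) {
            out.write(high << 4); // odd digit count: last digit is assumed to be 0
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] decoded = decodeHex("<69546578>".getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(decoded, StandardCharsets.US_ASCII)); // prints "iTex"
    }
}
```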
-
isWhitespace
public static boolean isWhitespace(int ch)
Checks whether a character is whitespace. Currently checks for the following character codes: 0, 9, 10, 12, 13, 32.
The same as calling isWhitespace(ch, true).
Parameters:
ch - the character code to check
Returns:
true if the character is whitespace, false otherwise
-
isWhitespace
protected static boolean isWhitespace(int ch, boolean isWhitespace)
Checks whether a character is whitespace. Currently checks for the following character codes: 0, 9, 10, 12, 13, 32.
Parameters:
ch - the character code to check
isWhitespace - boolean
Returns:
true if the character is whitespace, false otherwise
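The whitespace check can be sketched as follows. WhitespaceSketch is a hypothetical stand-alone class, and the sketch assumes that the boolean flag controls whether character code 0 counts as whitespace, as the isNullWhitespace parameter of readLineSegment suggests; the library's actual logic may differ.

```java
// Hypothetical illustration of the whitespace test over character codes.
public class WhitespaceSketch {

    // Treats code 0 (NUL) as whitespace, mirroring isWhitespace(ch, true).
    public static boolean isWhitespace(int ch) {
        return isWhitespace(ch, true);
    }

    // Codes: 0 (NUL, optional), 9 (tab), 10 (LF), 12 (FF), 13 (CR), 32 (space).
    // Assumption: the flag only decides whether code 0 is included.
    public static boolean isWhitespace(int ch, boolean isNullWhitespace) {
        return (isNullWhitespace && ch == 0)
                || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32;
    }
}
```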
-
isDelimiter
protected static boolean isDelimiter(int ch)
-
isDelimiterWhitespace
protected static boolean isDelimiterWhitespace(int ch)
-
throwError
public void throwError(java.lang.String error, java.lang.Object... messageParams)
Helper method to handle content errors. Adds the file position to PdfRuntimeException.
Parameters:
error - message.
messageParams - error params.
Throws:
IOException - wraps the error message into PdfRuntimeException and adds the position in the file.
-
checkTrailer
public static boolean checkTrailer(ByteBuffer line)
Checks whether line equals 'trailer'.
Parameters:
line - the line to check
Returns:
true if line equals 'trailer', otherwise false
-
readLineSegment
public boolean readLineSegment(ByteBuffer buffer) throws java.io.IOException
Reads data into the provided ByteBuffer. Checks for leading whitespace. See isWhitespace(int) or isWhitespace(int, boolean) for a list of whitespace characters.
The same as calling readLineSegment(buffer, true).
Parameters:
buffer - a ByteBuffer to which the result of reading will be saved
Returns:
true, if something was read or if the end of the input stream is not reached
Throws:
java.io.IOException - in case of any reading error
-
readLineSegment
public boolean readLineSegment(ByteBuffer buffer, boolean isNullWhitespace) throws java.io.IOException
Reads data into the provided ByteBuffer. Checks for leading whitespace. See isWhitespace(int) or isWhitespace(int, boolean) for a list of whitespace characters.
Parameters:
buffer - a ByteBuffer to which the result of reading will be saved
isNullWhitespace - boolean to indicate whether character code 0 is whitespace or not. If in doubt, use true or the overloaded method readLineSegment(buffer)
Returns:
true, if something was read or if the end of the input stream is not reached
Throws:
java.io.IOException - in case of any reading error
-
checkObjectStart
public static int[] checkObjectStart(PdfTokenizer lineTokenizer)
Checks whether a line starts with an object declaration.
Parameters:
lineTokenizer - tokenizer built from a single line.
Returns:
the object number and generation if the check is successful, otherwise null.
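The return contract (object number and generation on success, null otherwise) can be sketched on a plain string. ObjectStartSketch is a hypothetical stand-alone class that splits on whitespace instead of driving a PdfTokenizer, so it only approximates the behaviour described above.

```java
// Hypothetical illustration: recognise a line such as "12 0 obj".
public class ObjectStartSketch {

    // Returns {objectNumber, generation} if the line starts with
    // "<number> <generation> obj", otherwise null.
    public static int[] checkObjectStart(String line) {
        String[] parts = line.trim().split("\\s+");
        // Leniently accept tokens that merely begin with "obj", e.g. "obj<<".
        if (parts.length < 3 || !parts[2].startsWith("obj")) {
            return null;
        }
        try {
            int number = Integer.parseInt(parts[0]);
            int generation = Integer.parseInt(parts[1]);
            if (number < 0 || generation < 0) {
                return null;
            }
            return new int[] {number, generation};
        } catch (NumberFormatException e) {
            return null; // first two tokens are not integers
        }
    }
}
```

For example, checkObjectStart("12 0 obj") yields the pair 12 and 0, while checkObjectStart("trailer") yields null.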
-
-