Package com.itextpdf.text.pdf.mc
Class MCParser
- java.lang.Object
-
- com.itextpdf.text.pdf.mc.MCParser
-
public class MCParser extends java.lang.Object
This class will parse page content streams and add Do operators in a marked-content sequence for every field that needs to be flattened.
-
-
Nested Class Summary
Nested Classes (modifier and type, class: description)
- private static class MCParser.BeginMarkedContentDictionaryOperator: Class that knows how to process marked content operators.
- private static class MCParser.BeginTextOperator: Class that knows how to process the BT operator.
- private static class MCParser.CopyContentOperator: Class that processes content by just printing the operator and its operands.
- private static class MCParser.DoOperator: Class that knows how to process Do operators.
- private static class MCParser.EndTextOperator: Class that knows how to process the ET operator.
- static interface MCParser.PdfOperator: PDF Operator interface.
- private static class MCParser.TextNewLineOperator: Class that knows how to process the text state operators that result in a newline.
- private static class MCParser.TextPositioningOperator: Class that knows how to process the text positioning operators.
- private static class MCParser.TextStateOperator: Class that knows how to process the text state operators.
-
Field Summary
Fields (modifier and type, field: description)
- protected PdfArray annots: the annotations of the page that is being processed.
- protected java.io.ByteArrayOutputStream baos: The contents of the new content stream of the page.
- protected boolean btWrite: Did we postpone writing a BT operator?
- static java.lang.String DEFAULTOPERATOR: Constant used for the default operator.
- protected boolean etExtra: Did we postpone writing a BT operator?
- protected boolean inText: Are we inside a BT/ET sequence?
- protected StructureItems items: The list with structure items.
- protected static Logger LOGGER: The Logger instance
- protected java.util.Map<java.lang.String,MCParser.PdfOperator> operators: A map with all supported operators (PDF syntax).
- protected PdfDictionary page: The page dictionary
- protected PdfIndirectReference pageref: The reference to the page dictionary
- protected static RandomAccessSourceFactory RASFACTORY: Factory that will help us build a RandomAccessSource.
- protected PdfNumber structParents: the StructParents of the page that is being processed.
- protected java.lang.StringBuffer text: A buffer containing text state.
- static PdfLiteral TSTAR: A new line operator
- protected PdfDictionary xobjects: the XObject dictionary of the page that is being processed.
-
Constructor Summary
Constructors (constructor: description)
- MCParser(StructureItems items): Creates an MCParser object.
-
Method Summary
All Methods, Instance Methods, Concrete Methods (modifier and type, method: description)
- protected void checkBT(): Checks if a BT operator is waiting to be added.
- protected void convertToXObject(StructureObject item): Converts an annotation structure item to a Form XObject annotation.
- protected void dealWithMcid(PdfNumber mcid): When an MCID is encountered, the parser will check the list of structure items and turn an annotation into an XObject if necessary.
- protected void dealWithXObj(PdfName xobj): When an XObject with a StructParent is encountered, we want to remove it from the stack.
- void parse(PdfDictionary page, PdfIndirectReference pageref): Parses the content of a page, inserting the normal (/N) appearances (/AP) of annotations into the content stream as Form XObjects.
- protected void populateOperators(): Populates the operators variable.
- protected void println(PdfObject o): Writes a PDF object to the OutputStream, followed by a newline character.
- protected void printOperator(PdfLiteral operator, java.util.List<PdfObject> operands): Adds an operator and its operands (if any) to baos.
- protected void printsp(PdfObject o): Writes a PDF object to the OutputStream, followed by a space character.
- protected void printTextOperator(PdfLiteral operator, java.util.List<PdfObject> operands): Adds an operator and its operands (if any) to baos, keeping track of the text state.
- protected void processOperator(PdfLiteral operator, java.util.List<PdfObject> operands): Processes an operator, for instance: write the operator and its operands to baos.
- protected void setInText(boolean inText): Informs the parser that we're inside or outside a text object.
-
-
-
Field Detail
-
LOGGER
protected static final Logger LOGGER
The Logger instance
-
RASFACTORY
protected static final RandomAccessSourceFactory RASFACTORY
Factory that will help us build a RandomAccessSource.
-
DEFAULTOPERATOR
public static final java.lang.String DEFAULTOPERATOR
Constant used for the default operator.
- See Also:
- Constant Field Values
-
TSTAR
public static final PdfLiteral TSTAR
A new line operator
-
operators
protected java.util.Map<java.lang.String,MCParser.PdfOperator> operators
A map with all supported operators (PDF syntax).
-
items
protected StructureItems items
The list with structure items.
-
baos
protected java.io.ByteArrayOutputStream baos
The contents of the new content stream of the page.
-
page
protected PdfDictionary page
The page dictionary
-
pageref
protected PdfIndirectReference pageref
The reference to the page dictionary
-
annots
protected PdfArray annots
the annotations of the page that is being processed.
-
structParents
protected PdfNumber structParents
the StructParents of the page that is being processed.
-
xobjects
protected PdfDictionary xobjects
the XObject dictionary of the page that is being processed.
-
btWrite
protected boolean btWrite
Did we postpone writing a BT operator?
-
etExtra
protected boolean etExtra
Did we postpone writing a BT operator?
-
inText
protected boolean inText
Are we inside a BT/ET sequence?
-
text
protected java.lang.StringBuffer text
A buffer containing text state.
-
-
Constructor Detail
-
MCParser
public MCParser(StructureItems items)
Creates an MCParser object.
- Parameters:
items- a list of StructureItem objects
-
-
Method Detail
-
populateOperators
protected void populateOperators()
Populates the operators variable.
-
parse
public void parse(PdfDictionary page, PdfIndirectReference pageref) throws java.io.IOException, DocumentException
Parses the content of a page, inserting the normal (/N) appearances (/AP) of annotations into the content stream as Form XObjects.
- Parameters:
page- a page dictionary
pageref- the reference to the page dictionary
- Throws:
java.io.IOException
DocumentException
-
dealWithXObj
protected void dealWithXObj(PdfName xobj)
When an XObject with a StructParent is encountered, we want to remove it from the stack.
- Parameters:
xobj- the name of an XObject
-
dealWithMcid
protected void dealWithMcid(PdfNumber mcid) throws java.io.IOException, DocumentException
When an MCID is encountered, the parser will check the list of structure items and turn an annotation into an XObject if necessary.
- Parameters:
mcid- the MCID that was encountered in the content stream
- Throws:
java.io.IOException
DocumentException
-
convertToXObject
protected void convertToXObject(StructureObject item) throws java.io.IOException, DocumentException
Converts an annotation structure item to a Form XObject annotation.
- Parameters:
item- the structure item
- Throws:
java.io.IOException
DocumentException
-
processOperator
protected void processOperator(PdfLiteral operator, java.util.List<PdfObject> operands) throws java.io.IOException, DocumentException
Processes an operator, for instance: write the operator and its operands to baos.
- Parameters:
operator- the operator
operands- the operator's operands
- Throws:
java.io.IOException
DocumentException
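
The relationship between processOperator, the operators map, and DEFAULTOPERATOR can be pictured with a small stand-alone sketch. It does not use iText types: OperatorDispatchSketch, Handler, the string operands, and the returned strings are hypothetical stand-ins. The fallback to the DEFAULTOPERATOR entry for unregistered operators, and the key value used for it, are assumptions for illustration rather than a description of the library's implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OperatorDispatchSketch {
    /** Hypothetical stand-in for MCParser.PdfOperator. */
    public interface Handler {
        String process(String operator, List<String> operands);
    }

    /** Assumed key under which the fallback handler is registered. */
    public static final String DEFAULTOPERATOR = "DefaultOperator";

    private final Map<String, Handler> operators = new HashMap<>();

    public OperatorDispatchSketch() {
        // Fallback handler: copy the operator through unchanged.
        operators.put(DEFAULTOPERATOR, (op, operands) -> "copy:" + op);
        // Dedicated handler for Do: inspect the XObject name operand.
        operators.put("Do", (op, operands) -> "xobject:" + operands.get(0));
    }

    /** Looks up a handler by operator name, falling back to the default. */
    public String processOperator(String operator, List<String> operands) {
        Handler handler = operators.get(operator);
        if (handler == null) {
            handler = operators.get(DEFAULTOPERATOR);
        }
        return handler.process(operator, operands);
    }

    public static void main(String[] args) {
        OperatorDispatchSketch sketch = new OperatorDispatchSketch();
        System.out.println(sketch.processOperator("Do", Arrays.asList("/Fm0")));
        System.out.println(sketch.processOperator("re", Arrays.asList("0", "0", "10", "10")));
    }
}
```

In this sketch, only operators that need special treatment get their own entry; everything else goes through a single copy handler.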
-
printOperator
protected void printOperator(PdfLiteral operator, java.util.List<PdfObject> operands) throws java.io.IOException
Adds an operator and its operands (if any) to baos.
- Parameters:
operator- the operator
operands- its operands
- Throws:
java.io.IOException
-
printTextOperator
protected void printTextOperator(PdfLiteral operator, java.util.List<PdfObject> operands) throws java.io.IOException
Adds an operator and its operands (if any) to baos, keeping track of the text state.
- Parameters:
operator- the operator
operands- its operands
- Throws:
java.io.IOException
-
printsp
protected void printsp(PdfObject o) throws java.io.IOException
Writes a PDF object to the OutputStream, followed by a space character.
- Parameters:
o- a PdfObject
- Throws:
java.io.IOException
-
println
protected void println(PdfObject o) throws java.io.IOException
Writes a PDF object to the OutputStream, followed by a newline character.
- Parameters:
o- a PdfObject
- Throws:
java.io.IOException
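
As an illustration of how printsp, println, and printOperator could fit together, the sketch below writes operands followed by spaces and then the operator followed by a newline, which is the usual postfix layout of a PDF content stream. It is a simplified stand-in, not the library code: OperatorWriterSketch and content() are hypothetical, plain strings replace PdfObject and PdfLiteral, and the checked IOException of the documented methods is left out to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class OperatorWriterSketch {
    /** Collects the bytes of the new content stream, like the baos field. */
    private final ByteArrayOutputStream baos = new ByteArrayOutputStream();

    /** Writes a token followed by a space character. */
    public void printsp(String token) {
        byte[] bytes = token.getBytes(StandardCharsets.ISO_8859_1);
        baos.write(bytes, 0, bytes.length);
        baos.write(' ');
    }

    /** Writes a token followed by a newline character. */
    public void println(String token) {
        byte[] bytes = token.getBytes(StandardCharsets.ISO_8859_1);
        baos.write(bytes, 0, bytes.length);
        baos.write('\n');
    }

    /** Writes every operand, then the operator that consumes them. */
    public void printOperator(String operator, List<String> operands) {
        for (String operand : operands) {
            printsp(operand);
        }
        println(operator);
    }

    /** Hypothetical helper that returns what has been written so far. */
    public String content() {
        return new String(baos.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        OperatorWriterSketch writer = new OperatorWriterSketch();
        writer.printOperator("Tf", Arrays.asList("/F1", "12"));
        System.out.print(writer.content()); // prints "/F1 12 Tf" and a newline
    }
}
```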
-
checkBT
protected void checkBT() throws java.io.IOException
Checks if a BT operator is waiting to be added.
- Throws:
java.io.IOException
-
setInText
protected void setInText(boolean inText)
Informs the parser that we're inside or outside a text object. Also sets a parameter indicating that BT needs to be written.
- Parameters:
inText- true if we're inside.
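
The interplay between setInText, checkBT, and the btWrite flag can be sketched as a postponed write: entering a text object only records that BT is pending, and the operator is emitted the first time something is actually written inside the text object. The code below is a minimal model under that assumption. DeferredBtSketch, printTextLine, and content() are hypothetical names, and the real class may handle more cases (for example the etExtra flag) that are not modelled here.

```java
public class DeferredBtSketch {
    private final StringBuilder out = new StringBuilder();
    private boolean inText;   // are we inside a BT/ET sequence?
    private boolean btWrite;  // is a BT operator waiting to be written?

    /** Records whether we are inside a text object; entering one postpones BT. */
    public void setInText(boolean inText) {
        if (inText) {
            btWrite = true;
        }
        this.inText = inText;
    }

    /** Writes the pending BT operator, if there is one. */
    public void checkBT() {
        if (btWrite) {
            out.append("BT\n");
            btWrite = false;
        }
    }

    /** Hypothetical helper: writes one line of text content, emitting BT first if needed. */
    public void printTextLine(String line) {
        checkBT();
        out.append(line).append('\n');
    }

    public boolean isInText() {
        return inText;
    }

    public String content() {
        return out.toString();
    }

    public static void main(String[] args) {
        DeferredBtSketch sketch = new DeferredBtSketch();
        sketch.setInText(true);              // nothing is written yet
        sketch.printTextLine("(Hello) Tj");  // BT is written just before this line
        sketch.printTextLine("(World) Tj");  // BT is not repeated
        System.out.print(sketch.content());
    }
}
```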
-
-