Package org.apache.xmpbox.xml
Class DomXmpParser
java.lang.Object
org.apache.xmpbox.xml.DomXmpParser
-
Nested Class Summary
Nested Classes -
Field Summary
Fields
Modifier and Type    Field    Description
private DocumentBuilder
private DomXmpParser.NamespaceFinder
private boolean
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type    Method    Description
private PropertyType checkPropertyDefinition(TypeMapping tm, QName qName, String parentTypeName)
private void createProperty(XMPMetadata xmp, Element property, PropertyType type, ComplexPropertyContainer container)
private void expectNaming(Element element, String ns, String prefix, String ln)
private Element
private AbstractStructuredType instanciateStructured(TypeMapping tm, Types type, String name, String structuredNamespace)
private boolean isSchemaExtensionProperty(Element element)
boolean
    Tell if strict parsing mode is enabled.
private void loadAttributes(AbstractField sp, Element element)
private void manageArray(XMPMetadata xmp, Element property, PropertyType type, ComplexPropertyContainer container)
private void manageDefinedType(XMPMetadata xmp, Element property, String prefix, ComplexPropertyContainer container)
private void manageLangAlt(XMPMetadata xmp, Element property, ComplexPropertyContainer container)
private void manageSimpleType(XMPMetadata xmp, Element property, Types type, ComplexPropertyContainer container)
private void manageStructuredType(XMPMetadata xmp, Element property, String prefix, ComplexPropertyContainer container)
private void maybeAddNonStandardNamespace(XMPMetadata xmp, Attr attr)
parse(byte[] xmp)
parse(InputStream input)
private void parseChildrenAsProperties(XMPMetadata xmp, List<Element> properties, TypeMapping tm, Element description)
private void parseDescriptionInner(XMPMetadata xmp, Element description, ComplexPropertyContainer parentContainer)
private void parseDescriptionRoot(XMPMetadata xmp, Element description)
private void parseDescriptionRootAttr(XMPMetadata xmp, Element description, Attr attr, TypeMapping tm)
private void parseEndPacket(XMPMetadata metadata, ProcessingInstruction pi)
private XMPMetadata
private AbstractStructuredType parseLiDescription(XMPMetadata xmp, QName parentQName, Element liDescriptionElement)
private AbstractField parseLiElement(XMPMetadata xmp, QName descriptor, Element liElement, Types type)
private void parseSchemaExtensions(XMPMetadata xmp, Element description)
private void removeCommentsAndBlanks(Node root)
    Remove all the comments and blank nodes in the parent element of the parameter.
void setStrictParsing(boolean strictParsing)
    Enable or disable strict parsing mode.
private AbstractStructuredType tryParseAttributesAsProperties(TypeMapping tm, Element liElement, AbstractStructuredType ast, PropertiesDescription pm, QName qName)
    This attempts to run the same logic as in parseLiDescription() but with simple attributes that will be treated like children.
-
Field Details
-
dBuilder
-
nsFinder
-
strictParsing
private boolean strictParsing
-
-
Constructor Details
-
DomXmpParser
- Throws:
XmpParsingException
-
-
Method Details
-
isStrictParsing
public boolean isStrictParsing()
Tell if strict parsing mode is enabled.
- Returns:
Whether strict parsing mode is enabled or not.
-
setStrictParsing
public void setStrictParsing(boolean strictParsing)
Enable or disable strict parsing mode.
- Parameters:
strictParsing - Whether to be strict or lenient when parsing XMP. True (the default) means that malformed XMP results in an exception. False (lenient) means that the parser continues its work, if possible, when it encounters malformed content. Use strict mode if you want to work with PDF/A files. Use lenient mode if you care more about getting the metadata.
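The strict/lenient contract described above can be illustrated without the library. The following is a minimal sketch, not the DomXmpParser implementation: the class StrictLenientSketch, its setStrict and parseKeys methods, and the "key=value" input format are all hypothetical and exist only to show the two behaviours (throw on malformed input, or skip it and continue).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a strict/lenient switch; not part of xmpbox.
class StrictLenientSketch {

    // Mirrors the documented default: strict unless told otherwise.
    private boolean strict = true;

    void setStrict(boolean strict) {
        this.strict = strict;
    }

    // Collects the key of every "key=value" entry.
    // A malformed entry throws in strict mode and is skipped in lenient mode.
    List<String> parseKeys(List<String> entries) {
        List<String> keys = new ArrayList<>();
        for (String entry : entries) {
            int eq = entry.indexOf('=');
            if (eq <= 0) {
                if (strict) {
                    throw new IllegalArgumentException("Malformed entry: " + entry);
                }
                continue; // lenient: ignore the entry and keep going
            }
            keys.add(entry.substring(0, eq));
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a=1", "bad", "b=2");

        StrictLenientSketch lenient = new StrictLenientSketch();
        lenient.setStrict(false);
        System.out.println("lenient keys: " + lenient.parseKeys(input));

        try {
            new StrictLenientSketch().parseKeys(input);
        } catch (IllegalArgumentException e) {
            System.out.println("strict mode rejected input: " + e.getMessage());
        }
    }
}
```

With the real class, the equivalent choice is made by calling setStrictParsing(false) before parse(...) when recovering as much metadata as possible matters more than validity.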
-
parse
- Throws:
XmpParsingException
-
parse
- Throws:
XmpParsingException
-
maybeAddNonStandardNamespace
-
isSchemaExtensionProperty
-
parseSchemaExtensions
- Throws:
XmpParsingException
-
parseDescriptionRoot
- Throws:
XmpParsingException
-
parseDescriptionRootAttr
private void parseDescriptionRootAttr(XMPMetadata xmp, Element description, Attr attr, TypeMapping tm) throws XmpSchemaException, XmpParsingException
-
parseChildrenAsProperties
private void parseChildrenAsProperties(XMPMetadata xmp, List<Element> properties, TypeMapping tm, Element description) throws XmpParsingException, XmpSchemaException
-
createProperty
private void createProperty(XMPMetadata xmp, Element property, PropertyType type, ComplexPropertyContainer container) throws XmpParsingException
- Throws:
XmpParsingException
-
manageDefinedType
private void manageDefinedType(XMPMetadata xmp, Element property, String prefix, ComplexPropertyContainer container) throws XmpParsingException
- Throws:
XmpParsingException
-
manageStructuredType
private void manageStructuredType(XMPMetadata xmp, Element property, String prefix, ComplexPropertyContainer container) throws XmpParsingException
- Throws:
XmpParsingException
-
manageSimpleType
private void manageSimpleType(XMPMetadata xmp, Element property, Types type, ComplexPropertyContainer container)
-
manageArray
private void manageArray(XMPMetadata xmp, Element property, PropertyType type, ComplexPropertyContainer container) throws XmpParsingException
- Throws:
XmpParsingException
-
manageLangAlt
private void manageLangAlt(XMPMetadata xmp, Element property, ComplexPropertyContainer container) throws XmpParsingException
- Throws:
XmpParsingException
-
parseDescriptionInner
private void parseDescriptionInner(XMPMetadata xmp, Element description, ComplexPropertyContainer parentContainer) throws XmpParsingException
- Throws:
XmpParsingException
-
parseLiElement
private AbstractField parseLiElement(XMPMetadata xmp, QName descriptor, Element liElement, Types type) throws XmpParsingException
- Throws:
XmpParsingException
-
loadAttributes
-
parseLiDescription
private AbstractStructuredType parseLiDescription(XMPMetadata xmp, QName parentQName, Element liDescriptionElement) throws XmpParsingException
- Throws:
XmpParsingException
-
parseInitialXpacket
- Throws:
XmpParsingException
-
parseEndPacket
private void parseEndPacket(XMPMetadata metadata, ProcessingInstruction pi) throws XmpParsingException
- Throws:
XmpParsingException
-
findDescriptionsParent
- Throws:
XmpParsingException
-
expectNaming
private void expectNaming(Element element, String ns, String prefix, String ln) throws XmpParsingException
- Throws:
XmpParsingException
-
removeCommentsAndBlanks
Remove all the comments and blank nodes in the parent element of the parameter.
- Parameters:
root - the first node of an element or document to clear
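The clean-up described for removeCommentsAndBlanks can be sketched with the JDK's own DOM API. This is an illustrative approximation, not the library's code: the class CommentAndBlankStripper and both of its methods are hypothetical names, and the choice to recurse into child elements is an assumption made for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Comment;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

// Hypothetical helper; not the xmpbox implementation.
class CommentAndBlankStripper {

    // Walks the sibling chain starting at 'first', removing comment nodes and
    // whitespace-only text nodes, and recursing into element children.
    static void strip(Node first) {
        Node node = first;
        while (node != null) {
            Node next = node.getNextSibling(); // read before any removal
            if (node instanceof Comment) {
                node.getParentNode().removeChild(node);
            } else if (node instanceof Text) {
                if (node.getTextContent().trim().isEmpty()) {
                    node.getParentNode().removeChild(node);
                }
            } else if (node.getNodeType() == Node.ELEMENT_NODE) {
                strip(node.getFirstChild());
            }
            node = next;
        }
    }

    // Parses the XML, strips it, and returns how many child nodes the root keeps.
    static int childCountAfterStrip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            strip(doc.getDocumentElement().getFirstChild());
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a> <!-- note --> <b>text</b>\n <c/> </a>";
        System.out.println("children left: " + childCountAfterStrip(xml));
    }
}
```

The next sibling is read before a node is removed, because a detached node no longer reports its former siblings.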
-
instanciateStructured
private AbstractStructuredType instanciateStructured(TypeMapping tm, Types type, String name, String structuredNamespace) throws XmpParsingException
- Throws:
XmpParsingException
-
checkPropertyDefinition
private PropertyType checkPropertyDefinition(TypeMapping tm, QName qName, String parentTypeName) throws XmpParsingException
- Throws:
XmpParsingException
-
tryParseAttributesAsProperties
private AbstractStructuredType tryParseAttributesAsProperties(TypeMapping tm, Element liElement, AbstractStructuredType ast, PropertiesDescription pm, QName qName) throws XmpParsingException
This attempts to run the same logic as in parseLiDescription() but with simple attributes that will be treated like children. This is inspired by loadAttributes() and parseDescriptionRootAttr(). This solves the problem in PDFBOX-3882 where properties appear as attributes in places lower than the descriptor root.
- Parameters:
tm -
liElement -
ast - An AbstractStructuredType object, can be null.
pm - A PropertiesDescription object, must be set if ast is not null.
qName - QName of the parent, will be used if instantiating an AbstractStructuredType object, must be set if ast is not null.
- Returns:
An AbstractStructuredType, possibly created here if it was null as parameter.
- Throws:
XmpParsingException
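The idea of treating simple attributes like child properties can be sketched with the JDK DOM API alone. The sketch below is an assumption-laden illustration, not the PDFBox code: the class AttributeProperties, its methods, and the example namespace http://example.org/ns# are hypothetical. It reads every attribute of an element except namespace declarations and records it as a local-name/value pair, as if each had been written as a child element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;

// Hypothetical helper; not the xmpbox implementation.
class AttributeProperties {

    // Returns local name -> value for every attribute that is not a
    // namespace declaration. A TreeMap keeps the result order deterministic.
    static Map<String, String> attributesAsProperties(Element element) {
        Map<String, String> props = new TreeMap<>();
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr attr = (Attr) attrs.item(i);
            boolean isNamespaceDeclaration =
                    XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attr.getNamespaceURI())
                    || "xmlns".equals(attr.getPrefix())
                    || "xmlns".equals(attr.getName());
            if (isNamespaceDeclaration) {
                continue; // xmlns and xmlns:prefix are not properties
            }
            props.put(attr.getLocalName(), attr.getValue());
        }
        return props;
    }

    // Parses the XML with namespace awareness and inspects the root element.
    static Map<String, String> fromXml(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return attributesAsProperties(doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<rdf:li xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
                + " xmlns:ex='http://example.org/ns#' ex:width='10' ex:height='20'/>";
        System.out.println(fromXml(xml));
    }
}
```

In the real parser the collected values would additionally be checked against the type mapping before a property is created; that step is omitted here.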
-