Class Document
- All Implemented Interfaces:
Cloneable
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
static class Document.OutputSettings: A Document's output settings control the form of the text() and html() methods.
static enum Document.QuirksMode
-
Field Summary
Fields
Modifier and Type, Field, Description
private final String location
private Document.OutputSettings outputSettings
private Parser parser
private Document.QuirksMode quirksMode
private static final Evaluator titleEval
private boolean updateMetaCharset
Fields inherited from class Element
childNodes
Fields inherited from class Node
EmptyNodes, EmptyString, parentNode, siblingIndex
-
Constructor Summary
Constructors
-
Method Summary
Modifier and Type, Method, Description
body(): Get this document's <body> or <frameset> element.
charset(): Returns the charset used in this document.
void charset(Charset charset): Sets the charset used in this document.
clone(): Create a stand-alone, deep copy of this node, and all of its children.
createElement(String tagName): Create a new Element, with this document's base uri.
static Document createShell(String baseUri): Create a valid, empty shell of a document, suitable for adding more elements to.
documentType(): Returns this Document's doctype.
private void ensureMetaCharsetElement(): Ensures a meta charset (html) or xml declaration (xml) with the current encoding used.
head(): Get this document's head element.
private Element htmlEl(): Find the root HTML element, or create it if it doesn't exist.
location(): Get the URL this Document was parsed from.
nodeName(): Get the node name of this node.
normalise(): Normalise the document.
private void normaliseStructure(String tag, Element htmlEl)
private void normaliseTextNodes(Element element)
outerHtml(): Get the outer HTML of this node.
outputSettings(): Get the document's current output settings.
outputSettings(Document.OutputSettings outputSettings): Set the document's output settings.
parser(): Get the parser that was used to parse this document.
parser(Parser parser): Set the parser used to create this document.
quirksMode()
quirksMode(Document.QuirksMode quirksMode)
text(String text): Set the text of the body of this document.
title(): Get the string contents of the document's title element.
void title(String title): Set the document's title element.
boolean updateMetaCharsetElement(): Returns whether the element with charset information in this document is updated on changes through Document.charset(Charset) or not.
void updateMetaCharsetElement(boolean update): Sets whether the element with charset information in this document is updated on changes through Document.charset(Charset) or not.
Methods inherited from class Element
addClass, after, after, append, appendChild, appendChildren, appendElement, appendText, appendTo, attr, attr, attributes, baseUri, before, before, child, childElementsList, childNodeSize, children, childrenSize, className, classNames, classNames, clearAttributes, closest, closest, cssSelector, data, dataNodes, dataset, doClone, doSetBaseUri, elementSiblingIndex, empty, ensureChildNodes, filter, firstElementSibling, getAllElements, getElementById, getElementsByAttribute, getElementsByAttributeStarting, getElementsByAttributeValue, getElementsByAttributeValueContaining, getElementsByAttributeValueEnding, getElementsByAttributeValueMatching, getElementsByAttributeValueMatching, getElementsByAttributeValueNot, getElementsByAttributeValueStarting, getElementsByClass, getElementsByIndexEquals, getElementsByIndexGreaterThan, getElementsByIndexLessThan, getElementsByTag, getElementsContainingOwnText, getElementsContainingText, getElementsMatchingOwnText, getElementsMatchingOwnText, getElementsMatchingText, getElementsMatchingText, hasAttributes, hasChildNodes, hasClass, hasText, html, html, html, id, id, insertChild, insertChildren, insertChildren, is, is, isBlock, lastElementSibling, nextElementSibling, nextElementSiblings, nodelistChanged, normalName, outerHtmlHead, outerHtmlTail, ownText, parent, parents, prepend, prependChild, prependChildren, prependElement, prependText, preserveWhitespace, previousElementSibling, previousElementSiblings, removeAttr, removeClass, root, select, select, selectFirst, selectFirst, shallowClone, siblingElements, tag, tagName, tagName, text, textNodes, toggleClass, traverse, val, val, wholeText, wrap
Methods inherited from class Node
absUrl, addChildren, addChildren, attr, childNode, childNodes, childNodesAsArray, childNodesCopy, equals, hasAttr, hasParent, hasSameValue, indent, nextSibling, outerHtml, ownerDocument, parentNode, previousSibling, remove, removeChild, reparentChild, replaceChild, replaceWith, setBaseUri, setParentNode, setSiblingIndex, siblingIndex, siblingNodes, toString, unwrap
-
Field Details
-
outputSettings
-
parser
-
quirksMode
-
location
-
updateMetaCharset
private boolean updateMetaCharset -
titleEval
-
-
Constructor Details
-
Document
Create a new, empty Document.
- Parameters:
baseUri - base URI of document
- See Also:
-
-
Method Details
-
createShell
-
location
Get the URL this Document was parsed from. If the starting URL is a redirect, this will return the final URL from which the document was served.
Will return an empty string if the location is unknown (e.g. if parsed from a String).
- Returns:
- location
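A minimal sketch of the two cases described above. It assumes that the base URI passed to Jsoup.parse(String, String) is what location() reports. The URL and markup are hypothetical, chosen only for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class LocationExample {
    // Hypothetical base URI, used only for illustration.
    static final String BASE_URI = "https://example.com/page";

    // Parse with an explicit base URI and read the location back.
    static String locationWithBaseUri() {
        Document doc = Jsoup.parse("<p>Hello</p>", BASE_URI);
        return doc.location();
    }

    // Parse from a plain String: the location is unknown.
    static String locationFromPlainString() {
        Document doc = Jsoup.parse("<p>Hello</p>");
        return doc.location();
    }

    public static void main(String[] args) {
        System.out.println(locationWithBaseUri());
        System.out.println(locationFromPlainString().isEmpty());
    }
}
```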
-
documentType
Returns this Document's doctype.
- Returns:
- document type, or null if not set
-
htmlEl
Find the root HTML element, or create it if it doesn't exist.
- Returns:
- the root HTML element.
-
head
Get this document's head element.
As a side-effect, if this Document does not already have an HTML structure, it will be created. If you do not want that, use selectFirst("head") instead.
- Returns:
- head element.
-
body
Get this document's <body> or <frameset> element.
As a side-effect, if this Document does not already have an HTML structure, it will be created with a <body> element. If you do not want that, use selectFirst("body") instead.
- Returns:
- body element for documents with a <body>, a new <body> element if the document had no contents, or the outermost <frameset> element for frameset documents.
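The side-effect described for head() and body() can be sketched as follows. This is an illustration under the behaviour stated above, not a definitive account of every jsoup version; the class and method names are hypothetical.

```java
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class StructureExample {
    // selectFirst only looks; it does not create anything in an empty Document.
    static boolean headAbsentWhenOnlySelecting() {
        Document doc = new Document("");
        return doc.selectFirst("head") == null;
    }

    // head() creates the missing HTML structure and returns the new element.
    static boolean headCreatedOnAccess() {
        Document doc = new Document("");
        Element head = doc.head();
        return head != null && doc.selectFirst("head") == head;
    }

    // body() likewise creates a <body> for a Document with no contents.
    static boolean bodyCreatedOnAccess() {
        Document doc = new Document("");
        Element body = doc.body();
        return body != null && doc.selectFirst("body") == body;
    }

    public static void main(String[] args) {
        System.out.println(headAbsentWhenOnlySelecting());
        System.out.println(headCreatedOnAccess());
        System.out.println(bodyCreatedOnAccess());
    }
}
```

Prefer selectFirst when the code must not modify the document, for example when inspecting a fragment.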
-
title
Get the string contents of the document's title element.
- Returns:
- Trimmed title, or empty string if none set.
-
title
Set the document's title element. Updates the existing element, or adds title to head if not present.
- Parameters:
title - string to set as title
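A short sketch of reading and setting the title, using hypothetical markup. It illustrates the trimmed return value, the empty string when no title is set, and the setter adding a title element to head when none exists.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class TitleExample {
    // The getter returns the title text with surrounding whitespace trimmed.
    static String readTitle() {
        Document doc = Jsoup.parse(
            "<html><head><title>  Release notes  </title></head><body></body></html>");
        return doc.title();
    }

    // With no title element, the getter returns an empty string.
    static String readMissingTitle() {
        Document doc = Jsoup.parse("<p>No title here</p>");
        return doc.title();
    }

    // The setter adds a title element to head when none is present.
    static String addTitle() {
        Document doc = Jsoup.parse("<p>No title here</p>");
        doc.title("Added title");
        return doc.selectFirst("head > title").text();
    }

    public static void main(String[] args) {
        System.out.println(readTitle());
        System.out.println(readMissingTitle().isEmpty());
        System.out.println(addTitle());
    }
}
```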
-
createElement
-
normalise
Normalise the document. This happens after the parse phase so generally does not need to be called. Moves any text content that is not in the body element into the body.
- Returns:
- this document after normalisation
-
normaliseTextNodes
-
normaliseStructure
-
outerHtml
-
text
-
nodeName
-
charset
Sets the charset used in this document. This method is equivalent to OutputSettings.charset(Charset), but in addition it updates the charset / encoding element within the document.
This enables meta charset update.
If there's no element with charset / encoding information yet, it will be created. Obsolete charset / encoding definitions are removed!
Elements used:
- Html: <meta charset="CHARSET">
- Xml: <?xml version="1.0" encoding="CHARSET"?>
- Parameters:
charset - Charset
- See Also:
-
charset
Returns the charset used in this document. This method is equivalent to Document.OutputSettings.charset().
- Returns:
- Current Charset
- See Also:
-
updateMetaCharsetElement
public void updateMetaCharsetElement(boolean update)
Sets whether the element with charset information in this document is updated on changes through Document.charset(Charset) or not.
If set to false (default), no elements are modified.
- Parameters:
update - If true, the element is updated on charset changes; false if not
- See Also:
-
updateMetaCharsetElement
public boolean updateMetaCharsetElement()
Returns whether the element with charset information in this document is updated on changes through Document.charset(Charset) or not.
- Returns:
- Returns true if the element is updated on charset changes, false if not
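The interaction between Document.charset(Charset) and the meta charset element can be sketched as below. This is an illustration of the behaviour described on this page, and later jsoup releases may differ. The markup, class name and helper names are hypothetical.

```java
import java.nio.charset.Charset;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class CharsetExample {
    // Hypothetical input with no charset information in its head.
    static Document parseSample() {
        return Jsoup.parse("<html><head></head><body><p>Text</p></body></html>");
    }

    // Set the charset, then read back the value of the meta charset element.
    static String metaCharsetAfterSetting(String charsetName) {
        Document doc = parseSample();
        doc.charset(Charset.forName(charsetName));
        Element meta = doc.selectFirst("meta[charset]");
        return meta == null ? null : meta.attr("charset");
    }

    // Setting the charset through the Document enables meta charset updates.
    static boolean updateFlagAfterSetting(String charsetName) {
        Document doc = parseSample();
        doc.charset(Charset.forName(charsetName));
        return doc.updateMetaCharsetElement();
    }

    public static void main(String[] args) {
        System.out.println(metaCharsetAfterSetting("ISO-8859-1"));
        System.out.println(updateFlagAfterSetting("ISO-8859-1"));
    }
}
```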
-
clone
Description copied from class: Node
Create a stand-alone, deep copy of this node, and all of its children. The cloned node will have no siblings or parent node. As a stand-alone object, any changes made to the clone or any of its children will not impact the original node.
The cloned node may be adopted into another Document or node structure using Element.appendChild(Node).
-
ensureMetaCharsetElement
private void ensureMetaCharsetElement()
Ensures a meta charset (html) or xml declaration (xml) with the current encoding used. This only applies with updateMetaCharset set to true, otherwise this method does nothing.
- An existing element gets updated with the current charset
- If there's no element yet it will be inserted
- Obsolete elements are removed
Elements used:
- Html: <meta charset="CHARSET">
- Xml: <?xml version="1.0" encoding="CHARSET"?>
-
outputSettings
Get the document's current output settings.
- Returns:
- the document's current output settings.
-
outputSettings
Set the document's output settings.
- Parameters:
outputSettings - new output settings.
- Returns:
- this document, for chaining.
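A brief sketch of both outputSettings variants, using hypothetical markup. It uses OutputSettings.prettyPrint(boolean) as one example of a setting that changes how html() renders. The getter returns the live settings object, and the setter returns the document for chaining.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class OutputSettingsExample {
    // Mutate the current settings in place, then render the body.
    static String compactBodyHtml() {
        Document doc = Jsoup.parse("<div><p>One</p><p>Two</p></div>");
        doc.outputSettings().prettyPrint(false);
        return doc.body().html();
    }

    // Replace the settings object; the setter returns the same Document.
    static boolean setterChainsAndStores() {
        Document doc = Jsoup.parse("<p>One</p>");
        Document.OutputSettings settings = new Document.OutputSettings().prettyPrint(false);
        Document returned = doc.outputSettings(settings);
        return returned == doc && doc.outputSettings() == settings;
    }

    public static void main(String[] args) {
        System.out.println(compactBodyHtml());
        System.out.println(setterChainsAndStores());
    }
}
```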
-
quirksMode
-
quirksMode
-
parser
-
parser
Set the parser used to create this document. This parser is then used when further parsing within this document is required.
- Parameters:
parser - the configured parser to use when further parsing is required for this document.
- Returns:
- this document, for chaining.
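A minimal sketch of attaching a parser to a document and reading it back. Using Parser.xmlParser() on a shell document is an arbitrary choice for illustration, and the class and method names are hypothetical.

```java
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

class ParserExample {
    // Set a parser, then confirm the setter chains and the getter returns it.
    static boolean parserIsStoredAndChained() {
        Document doc = Document.createShell("");
        Parser xml = Parser.xmlParser();
        Document returned = doc.parser(xml);
        return returned == doc && doc.parser() == xml;
    }

    public static void main(String[] args) {
        System.out.println(parserIsStoredAndChained());
    }
}
```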
-