Package org.w3c.tidy
Class Node
- java.lang.Object
-
- org.w3c.tidy.Node
-
public class Node extends java.lang.Object
Used for elements and text nodes. The element name is null for text nodes. start and end are offsets into lexbuf, which contains the textual content of all elements in the parse tree. parent and content allow traversal of the parse tree in any direction. Attributes are represented as a linked list of AttVal nodes, which hold the strings for attribute/value pairs.
- Version:
- $Revision$ ($Author$)
- Author:
- Dave Raggett dsr@w3.org , Andy Quick ac.quick@sympatico.ca (translation to Java), Fabrizio Giustina
-
-
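The class description above can be illustrated with a small sketch. `SketchNode` is a hypothetical stand-in, not the real `org.w3c.tidy.Node`; it only mirrors the documented field names. The sketch assumes that `end` is exclusive and that the buffer holds UTF-8 bytes, neither of which is stated on this page.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for org.w3c.tidy.Node; models only the fields named above. */
final class SketchNode {
    SketchNode parent, prev, next, content, last; // same roles as the documented fields
    final String element;                         // null for text nodes
    final byte[] textarray;                       // shared text buffer (the "lexbuf")
    final int start, end;                         // span in the buffer; end assumed exclusive

    SketchNode(String element, byte[] textarray, int start, int end) {
        this.element = element;
        this.textarray = textarray;
        this.start = start;
        this.end = end;
    }

    /** Appends a child, keeping content/last and prev/next consistent. */
    void append(SketchNode child) {
        child.parent = this;
        child.prev = last;
        if (last != null) last.next = child; else content = child;
        last = child;
    }

    /** Walks downwards via content/next and collects the text of all text nodes. */
    String text() {
        StringBuilder sb = new StringBuilder();
        collect(this, sb);
        return sb.toString();
    }

    private static void collect(SketchNode n, StringBuilder sb) {
        if (n.element == null) { // text node: decode its span of the shared buffer
            sb.append(new String(n.textarray, n.start, n.end - n.start, StandardCharsets.UTF_8));
        }
        for (SketchNode c = n.content; c != null; c = c.next) collect(c, sb);
    }

    /** Walks upwards via parent and counts the ancestors. */
    int depth() {
        int d = 0;
        for (SketchNode p = parent; p != null; p = p.parent) d++;
        return d;
    }

    public static void main(String[] args) {
        byte[] buf = "hello world".getBytes(StandardCharsets.UTF_8);
        SketchNode p = new SketchNode("p", buf, 0, 0);
        SketchNode em = new SketchNode("em", buf, 0, 0);
        p.append(new SketchNode(null, buf, 0, 6));   // "hello "
        p.append(em);
        em.append(new SketchNode(null, buf, 6, 11)); // "world"
        System.out.println(p.text());                // prints "hello world"
    }
}
```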
Field Summary
Modifier and Type | Field | Description
protected org.w3c.dom.Node | adapter | DOM adapter.
static short | ASP_TAG | node type: asp tag.
protected AttVal | attributes | Attribute/Value linked list.
static short | CDATA_TAG | node type: CDATA.
protected boolean | closed | true if closed by explicit end tag.
static short | COMMENT_TAG | node type: comment.
protected Node | content | Contained node.
static short | DOCTYPE_TAG | node type: doctype.
protected java.lang.String | element | Tag name.
protected int | end | end of span onto text array.
static short | END_TAG | End tag.
protected boolean | implicit | true if inferred.
static short | JSTE_TAG | node type: jste tag.
protected Node | last | last child node.
protected boolean | linebreak | true if followed by a line break.
protected Node | next | next node.
protected Node | parent | parent node.
static short | PHP_TAG | node type: php tag.
protected Node | prev | previous node.
static short | PROC_INS_TAG | node type: processing instruction.
static short | ROOT_NODE | node type: root.
static short | SECTION_TAG | node type: section tag.
protected int | start | start of span onto text array.
static short | START_END_TAG | Combined start/end tag (an element written as a single empty tag).
static short | START_TAG | Start tag.
protected Dict | tag | tag's dictionary definition.
static short | TEXT_NODE | node type: text.
protected byte[] | textarray | the text array.
protected short | type | TextNode, StartTag, EndTag etc.
protected Dict | was | old tag when it was changed.
static short | XML_DECL | node type: XML declaration.
-
Method Summary
Modifier and Type | Method | Description
void | addAttribute(java.lang.String name, java.lang.String value) | Adds an attribute to the node.
void | addClass(java.lang.String classname) | Add a css class to the node.
void | checkAttributes(Lexer lexer) | Default method for checking an element's attributes.
boolean | checkNodeIntegrity() | Checks for node integrity.
protected Node | cloneNode(boolean deep) | Clone this node.
static void | coerceNode(Lexer lexer, Node node, Dict tag) | Coerce a node.
void | discardDocType() | Discard the doctype node.
static Node | discardElement(Node element) | Remove node from markup tree and discard it.
protected static Node | escapeTag(Lexer lexer, Node element) | Escapes the given tag.
boolean | expectsContent() | Does the node expect contents?
Node | findBody(TagTable tt) | Find the body node.
Node | findDocType() | Find the doctype element.
Node | findHEAD(TagTable tt) | Find the head tag.
Node | findHTML(TagTable tt) | Find the "html" element.
Node | findTITLE(TagTable tt) |
static void | fixEmptyRow(Lexer lexer, Node row) | If a table row is empty then insert an empty cell. This practice is consistent with browser behavior and avoids potential problems with row spanning cells.
protected org.w3c.dom.Node | getAdapter() | Returns a DOM Node which wraps the current tidy Node.
AttVal | getAttrByName(java.lang.String name) | Returns an attribute with the given name in the current node.
boolean | hasOneChild() | Does the node have one (and only one) child?
static void | insertDocType(Lexer lexer, Node element, Node doctype) | The doctype has been found after other tags, and needs moving to before the html element.
static boolean | insertMisc(Node element, Node node) | Insert a node at the end.
void | insertNodeAfterElement(Node node) | Insert node into markup tree after element.
static void | insertNodeAsParent(Node element, Node node) | Insert node into markup tree in place of element, which is moved to become the child of the node.
void | insertNodeAtEnd(Node node) | Insert node into markup tree.
void | insertNodeAtStart(Node node) | Insert a node into markup tree.
static void | insertNodeBeforeElement(Node element, Node node) | Insert node into markup tree before element.
boolean | isBlank(Lexer lexer) | Is the node content empty or blank? Assumes node is a text node.
boolean | isDescendantOf(Dict tag) | Is this node contained in a given tag?
boolean | isElement() | Is the node an element?
boolean | isJavaScript() | Used to check script node for script language.
boolean | isNewNode() | Is this a new (user defined) node? Used to determine how attributes without values should be printed.
static void | moveBeforeTable(Node row, Node node, TagTable tt) | Unexpected content in table row is moved to just before the table in accordance with Netscape and IE.
void | removeAttribute(AttVal attr) | Remove an attribute from node and then free it.
void | removeNode() | Extract this node and its children from a markup tree.
void | repairDuplicateAttributes(Lexer lexer) | The same attribute name can't be used more than once in each element.
protected void | setType(short newType) | Setter for node type.
java.lang.String | toString() |
static void | trimEmptyElement(Lexer lexer, Node element) | Trim an empty element.
static void | trimInitialSpace(Lexer lexer, Node element, Node text) | This maps <p>hello<em> world</em> to <p>hello <em>world</em>.
static void | trimSpaces(Lexer lexer, Node element) | Move initial and trailing space out.
static void | trimTrailingSpace(Lexer lexer, Node element, Node last) | This maps <em>hello </em><strong>world</strong> to <em>hello</em> <strong>world</strong>.
-
-
-
Field Detail
-
ROOT_NODE
public static final short ROOT_NODE
node type: root.
- See Also:
- Constant Field Values
-
DOCTYPE_TAG
public static final short DOCTYPE_TAG
node type: doctype.
- See Also:
- Constant Field Values
-
COMMENT_TAG
public static final short COMMENT_TAG
node type: comment.
- See Also:
- Constant Field Values
-
PROC_INS_TAG
public static final short PROC_INS_TAG
node type: processing instruction.
- See Also:
- Constant Field Values
-
TEXT_NODE
public static final short TEXT_NODE
node type: text.
- See Also:
- Constant Field Values
-
START_TAG
public static final short START_TAG
Start tag.
- See Also:
- Constant Field Values
-
END_TAG
public static final short END_TAG
End tag.
- See Also:
- Constant Field Values
-
START_END_TAG
public static final short START_END_TAG
Combined start/end tag (an element written as a single empty tag).
- See Also:
- Constant Field Values
-
CDATA_TAG
public static final short CDATA_TAG
node type: CDATA.
- See Also:
- Constant Field Values
-
SECTION_TAG
public static final short SECTION_TAG
node type: section tag.
- See Also:
- Constant Field Values
-
ASP_TAG
public static final short ASP_TAG
node type: asp tag.
- See Also:
- Constant Field Values
-
JSTE_TAG
public static final short JSTE_TAG
node type: jste tag.
- See Also:
- Constant Field Values
-
PHP_TAG
public static final short PHP_TAG
node type: php tag.
- See Also:
- Constant Field Values
-
XML_DECL
public static final short XML_DECL
node type: XML declaration.
- See Also:
- Constant Field Values
-
parent
protected Node parent
parent node.
-
prev
protected Node prev
previous node.
-
next
protected Node next
next node.
-
last
protected Node last
last child node.
-
start
protected int start
start of span onto text array.
-
end
protected int end
end of span onto text array.
-
textarray
protected byte[] textarray
the text array.
-
type
protected short type
TextNode, StartTag, EndTag etc.
-
closed
protected boolean closed
true if closed by explicit end tag.
-
implicit
protected boolean implicit
true if inferred.
-
linebreak
protected boolean linebreak
true if followed by a line break.
-
was
protected Dict was
old tag when it was changed.
-
tag
protected Dict tag
tag's dictionary definition.
-
element
protected java.lang.String element
Tag name.
-
attributes
protected AttVal attributes
Attribute/Value linked list.
-
content
protected Node content
Contained node.
-
adapter
protected org.w3c.dom.Node adapter
DOM adapter.
-
-
Constructor Detail
-
Node
public Node()
Instantiates a new text node.
-
Node
public Node(short type, byte[] textarray, int start, int end)
Instantiates a new node.
- Parameters:
type - node type: Node.ROOT_NODE | Node.DOCTYPE_TAG | Node.COMMENT_TAG | Node.PROC_INS_TAG | Node.TEXT_NODE | Node.START_TAG | Node.END_TAG | Node.START_END_TAG | Node.CDATA_TAG | Node.SECTION_TAG | Node.ASP_TAG | Node.JSTE_TAG | Node.PHP_TAG | Node.XML_DECL
textarray - array of bytes contained in the Node
start - start position
end - end position
-
Node
public Node(short type, byte[] textarray, int start, int end, java.lang.String element, TagTable tt)
Instantiates a new node.
- Parameters:
type - node type: Node.ROOT_NODE | Node.DOCTYPE_TAG | Node.COMMENT_TAG | Node.PROC_INS_TAG | Node.TEXT_NODE | Node.START_TAG | Node.END_TAG | Node.START_END_TAG | Node.CDATA_TAG | Node.SECTION_TAG | Node.ASP_TAG | Node.JSTE_TAG | Node.PHP_TAG | Node.XML_DECL
textarray - array of bytes contained in the Node
start - start position
end - end position
element - tag name
tt - tag table instance
-
-
Method Detail
-
getAttrByName
public AttVal getAttrByName(java.lang.String name)
Returns the attribute with the given name in the current node.
- Parameters:
name - attribute name.
- Returns:
- AttVal instance, or null if no attribute with the given name is found
-
checkAttributes
public void checkAttributes(Lexer lexer)
Default method for checking an element's attributes.
- Parameters:
lexer - Lexer
-
repairDuplicateAttributes
public void repairDuplicateAttributes(Lexer lexer)
The same attribute name can't be used more than once in each element. Discard or join attributes according to configuration.
- Parameters:
lexer - Lexer
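The "discard or join" choice described above can be sketched as follows. This is an illustration only, not JTidy's implementation: the `join` flag, the keep-first rule and the space separator are all assumptions made for the example, and the real behaviour depends on the Tidy configuration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of resolving duplicate attribute names. */
final class DuplicateAttrSketch {
    /**
     * Collapses repeated attribute names.
     *
     * @param attrs name/value pairs in source order
     * @param join  if true, values of repeated names are joined with a space;
     *              if false, the first value is kept and later ones are discarded
     */
    static Map<String, String> repair(List<String[]> attrs, boolean join) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String[] a : attrs) {
            String name = a[0];
            String value = a[1];
            if (!out.containsKey(name)) {
                out.put(name, value);                        // first occurrence
            } else if (join) {
                out.put(name, out.get(name) + " " + value);  // join with the earlier value
            }
            // otherwise: discard the later duplicate
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> attrs = Arrays.asList(
            new String[] {"class", "a"},
            new String[] {"id", "x"},
            new String[] {"class", "b"});
        System.out.println(repair(attrs, true));   // prints {class=a b, id=x}
        System.out.println(repair(attrs, false));  // prints {class=a, id=x}
    }
}
```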
-
addAttribute
public void addAttribute(java.lang.String name, java.lang.String value)
Adds an attribute to the node.
- Parameters:
name - attribute name
value - attribute value
-
removeAttribute
public void removeAttribute(AttVal attr)
Remove an attribute from node and then free it.
- Parameters:
attr - attribute to remove
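Because attributes are held as a singly linked list, adding, looking up and removing an attribute all come down to walking that list. The sketch below uses hypothetical `Att` and `AttrListSketch` types in place of `AttVal` and `Node`. It assumes that new attributes are appended at the tail and that names are compared exactly; the real methods may differ on both points.

```java
/** Hypothetical stand-in for a node's attribute list. */
final class AttrListSketch {
    /** Hypothetical stand-in for AttVal: one name/value pair plus a link to the next. */
    static final class Att {
        final String name;
        String value;
        Att next;

        Att(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    Att attributes; // head of the list, like the attributes field

    /** Appends a new attribute at the tail of the list. */
    void add(String name, String value) {
        Att av = new Att(name, value);
        if (attributes == null) {
            attributes = av;
            return;
        }
        Att tail = attributes;
        while (tail.next != null) tail = tail.next;
        tail.next = av;
    }

    /** Returns the first attribute with the given name, or null. */
    Att get(String name) {
        for (Att a = attributes; a != null; a = a.next) {
            if (a.name.equals(name)) return a;
        }
        return null;
    }

    /** Unlinks the given attribute object from the list, if present. */
    void remove(Att attr) {
        Att prev = null;
        for (Att a = attributes; a != null; prev = a, a = a.next) {
            if (a == attr) {
                if (prev == null) attributes = a.next; else prev.next = a.next;
                a.next = null;
                return;
            }
        }
    }
}
```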
-
findDocType
public Node findDocType()
Find the doctype element.
- Returns:
- doctype node or null if not found
-
discardDocType
public void discardDocType()
Discard the doctype node.
-
discardElement
public static Node discardElement(Node element)
Remove node from markup tree and discard it.
- Parameters:
element - discarded node
- Returns:
- next node
-
insertNodeAtStart
public void insertNodeAtStart(Node node)
Insert a node into the markup tree as the first child of this node.
- Parameters:
node - Node to insert
-
insertNodeAtEnd
public void insertNodeAtEnd(Node node)
Insert a node into the markup tree as the last child of this node.
- Parameters:
node - Node to insert
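Inserting at the start or end of a node's children means updating both the sibling links (prev/next) and the parent's content/last pointers. A minimal sketch, using a hypothetical `ChildListSketch` class that mirrors the documented field names; it is not the JTidy source.

```java
/** Hypothetical stand-in showing insertion at either end of a child list. */
final class ChildListSketch {
    ChildListSketch parent, prev, next, content, last;
    final String name;

    ChildListSketch(String name) {
        this.name = name;
    }

    /** Makes node the first child of this node. */
    void insertAtStart(ChildListSketch node) {
        node.parent = this;
        node.prev = null;
        node.next = content;
        if (content != null) content.prev = node; else last = node; // empty list: node is also last
        content = node;
    }

    /** Makes node the last child of this node. */
    void insertAtEnd(ChildListSketch node) {
        node.parent = this;
        node.next = null;
        node.prev = last;
        if (last != null) last.next = node; else content = node;    // empty list: node is also first
        last = node;
    }

    /** Child names in order, separated by single spaces. */
    String children() {
        StringBuilder sb = new StringBuilder();
        for (ChildListSketch c = content; c != null; c = c.next) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(c.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ChildListSketch ul = new ChildListSketch("ul");
        ul.insertAtEnd(new ChildListSketch("b"));
        ul.insertAtStart(new ChildListSketch("a"));
        ul.insertAtEnd(new ChildListSketch("c"));
        System.out.println(ul.children()); // prints "a b c"
    }
}
```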
-
insertNodeAsParent
public static void insertNodeAsParent(Node element, Node node)
Insert node into markup tree in place of element, which is moved to become the child of the node.
- Parameters:
element - child node. Will be inserted as a child of node
node - parent node
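Wrapping an element in a new parent takes two steps: the new node takes over the element's position among its siblings, then the element becomes the new node's only child. The sketch below uses a hypothetical `WrapSketch` class and assumes the new node has no children of its own beforehand.

```java
/** Hypothetical stand-in showing how a node can be inserted as the parent of an element. */
final class WrapSketch {
    WrapSketch parent, prev, next, content, last;
    final String name;

    WrapSketch(String name) {
        this.name = name;
    }

    /** Helper for building a small tree. */
    void append(WrapSketch child) {
        child.parent = this;
        child.prev = last;
        if (last != null) last.next = child; else content = child;
        last = child;
    }

    /** Puts node where element was and makes element its only child. */
    static void insertAsParent(WrapSketch element, WrapSketch node) {
        // Step 1: node takes over element's position among the siblings.
        node.parent = element.parent;
        node.prev = element.prev;
        node.next = element.next;
        if (node.prev != null) node.prev.next = node;
        else if (node.parent != null) node.parent.content = node;
        if (node.next != null) node.next.prev = node;
        else if (node.parent != null) node.parent.last = node;

        // Step 2: element becomes the single child of node.
        node.content = element;
        node.last = element;
        element.parent = node;
        element.prev = null;
        element.next = null;
    }

    /** Child names in order, separated by single spaces. */
    String children() {
        StringBuilder sb = new StringBuilder();
        for (WrapSketch c = content; c != null; c = c.next) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(c.name);
        }
        return sb.toString();
    }
}
```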
-
insertNodeBeforeElement
public static void insertNodeBeforeElement(Node element, Node node)
Insert node into markup tree before element.
- Parameters:
element - existing node; node will be inserted before it
node - node to insert
-
insertNodeAfterElement
public void insertNodeAfterElement(Node node)
Insert node into markup tree after this element.
- Parameters:
node - new node to insert
-
trimEmptyElement
public static void trimEmptyElement(Lexer lexer, Node element)
Trim an empty element.
- Parameters:
lexer - Lexer
element - empty node to be removed
-
trimTrailingSpace
public static void trimTrailingSpace(Lexer lexer, Node element, Node last)
This maps <em>hello </em><strong>world</strong> to <em>hello</em> <strong>world</strong>. If the last child of element is a text node, then trim the trailing white space character, moving it to after the element's end tag.
- Parameters:
lexer - Lexer
element - node
last - last child of element
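The effect of moving a trailing space to after the end tag can be shown on plain strings. This is a string-level sketch under stated assumptions, not how JTidy operates on its byte buffer: `TrailingSpaceSketch` and its methods are hypothetical, and only a single trailing space character is handled.

```java
/** Hypothetical string-level illustration of moving a trailing space out of an element. */
final class TrailingSpaceSketch {
    /** Returns {text kept inside the element, text moved to after the end tag}. */
    static String[] split(String lastText) {
        if (lastText.endsWith(" ")) {
            return new String[] {lastText.substring(0, lastText.length() - 1), " "};
        }
        return new String[] {lastText, ""};
    }

    /** Renders the element with its trailing space moved to after the end tag. */
    static String render(String tag, String lastText) {
        String[] parts = split(lastText);
        return "<" + tag + ">" + parts[0] + "</" + tag + ">" + parts[1];
    }

    public static void main(String[] args) {
        // prints "<em>hello</em> <strong>world</strong>"
        System.out.println(render("em", "hello ") + "<strong>world</strong>");
    }
}
```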
-
escapeTag
protected static Node escapeTag(Lexer lexer, Node element)
Escapes the given tag.
- Parameters:
lexer - Lexer
element - node to be escaped
- Returns:
- escaped node
-
isBlank
public boolean isBlank(Lexer lexer)
Is the node content empty or blank? Assumes node is a text node.
- Parameters:
lexer - Lexer
- Returns:
true if the node content is empty or blank
-
trimInitialSpace
public static void trimInitialSpace(Lexer lexer, Node element, Node text)
This maps <p>hello<em> world</em> to <p>hello <em>world</em>. Trims initial space by moving it before the start tag or, if this element is the first in its parent's content, by discarding the space.
- Parameters:
lexer - Lexer
element - parent node
text - text node
-
trimSpaces
public static void trimSpaces(Lexer lexer, Node element)
Move initial and trailing space out. This routine maps hello<em> world</em> to hello <em>world</em>, and <em>hello </em><strong>world</strong> to <em>hello</em> <strong>world</strong>.
- Parameters:
lexer - Lexer
element - Node
-
isDescendantOf
public boolean isDescendantOf(Dict tag)
Is this node contained in a given tag?
- Parameters:
tag - tag to look for among this node's ancestors
- Returns:
true if node is contained in tag
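A containment check of this kind amounts to walking the parent chain. The sketch below uses a hypothetical `AncestorSketch` class with a plain string in place of the `Dict` reference. It assumes the search starts at the parent, so a node is not its own descendant, and that tags are compared by name.

```java
/** Hypothetical stand-in showing an ancestor lookup via the parent links. */
final class AncestorSketch {
    final AncestorSketch parent;
    final String tag; // stands in for the Dict reference

    AncestorSketch(AncestorSketch parent, String tag) {
        this.parent = parent;
        this.tag = tag;
    }

    /** Returns true if any ancestor of this node has the wanted tag. */
    boolean isDescendantOf(String wanted) {
        for (AncestorSketch p = parent; p != null; p = p.parent) {
            if (wanted.equals(p.tag)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        AncestorSketch table = new AncestorSketch(null, "table");
        AncestorSketch tr = new AncestorSketch(table, "tr");
        AncestorSketch td = new AncestorSketch(tr, "td");
        System.out.println(td.isDescendantOf("table")); // prints true
        System.out.println(table.isDescendantOf("td")); // prints false
    }
}
```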
-
insertDocType
public static void insertDocType(Lexer lexer, Node element, Node doctype)
The doctype has been found after other tags, and needs moving to before the html element.
- Parameters:
lexer - Lexer
element - document
doctype - doctype node to insert at the beginning of element
-
findBody
public Node findBody(TagTable tt)
Find the body node.
- Parameters:
tt - tag table
- Returns:
- body node
-
isElement
public boolean isElement()
Is the node an element?
- Returns:
true if type is START_TAG or START_END_TAG
-
moveBeforeTable
public static void moveBeforeTable(Node row, Node node, TagTable tt)
Unexpected content in table row is moved to just before the table in accordance with Netscape and IE. This code assumes that node hasn't been inserted into the row.
- Parameters:
row - Row node
node - Node which should be moved before the table
tt - tag table
-
fixEmptyRow
public static void fixEmptyRow(Lexer lexer, Node row)
If a table row is empty then insert an empty cell. This practice is consistent with browser behavior and avoids potential problems with row spanning cells.
- Parameters:
lexer - Lexer
row - row node
-
coerceNode
public static void coerceNode(Lexer lexer, Node node, Dict tag)
Coerce a node.
- Parameters:
lexer - Lexer
node - Node
tag - tag dictionary reference
-
removeNode
public void removeNode()
Extract this node and its children from a markup tree.
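Extracting a node means unlinking it from its siblings and parent while leaving its own children attached. A minimal sketch, using a hypothetical `UnlinkSketch` class that mirrors the documented field names:

```java
/** Hypothetical stand-in showing how a node can be unlinked from its tree. */
final class UnlinkSketch {
    UnlinkSketch parent, prev, next, content, last;
    final String name;

    UnlinkSketch(String name) {
        this.name = name;
    }

    /** Helper for building a small tree. */
    void append(UnlinkSketch child) {
        child.parent = this;
        child.prev = last;
        if (last != null) last.next = child; else content = child;
        last = child;
    }

    /** Detaches this node; its content list is untouched, so the subtree moves with it. */
    void remove() {
        if (prev != null) prev.next = next;
        if (next != null) next.prev = prev;
        if (parent != null) {
            if (parent.content == this) parent.content = next;
            if (parent.last == this) parent.last = prev;
        }
        parent = null;
        prev = null;
        next = null;
    }
}
```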
-
insertMisc
public static boolean insertMisc(Node element, Node node)
Insert a node at the end.
- Parameters:
element - parent node
node - will be inserted at the end of element
- Returns:
true if the node has been inserted
-
isNewNode
public boolean isNewNode()
Is this a new (user defined) node? Used to determine how attributes without values should be printed. This was introduced to deal with user defined tags e.g. Cold Fusion.
- Returns:
true if this node represents a user-defined tag.
-
hasOneChild
public boolean hasOneChild()
Does the node have one (and only one) child?
- Returns:
true if the node has exactly one child
-
findHTML
public Node findHTML(TagTable tt)
Find the "html" element.
- Parameters:
tt - tag table
- Returns:
- html node
-
findHEAD
public Node findHEAD(TagTable tt)
Find the head tag.
- Parameters:
tt - tag table
- Returns:
- head node
-
checkNodeIntegrity
public boolean checkNodeIntegrity()
Checks for node integrity.
- Returns:
- false if node is not consistent
-
addClass
public void addClass(java.lang.String classname)
Add a css class to the node. If a class attribute already exists, the value is added to the existing attribute.
- Parameters:
classname - css class name
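The "add to the existing attribute" behaviour can be sketched with a plain map standing in for the attribute list. `AddClassSketch` is hypothetical. The space separator follows the usual HTML class syntax, and the sketch makes no attempt to remove duplicate class names; whether JTidy does so is not stated here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in showing how a class name can be merged into a class attribute. */
final class AddClassSketch {
    final Map<String, String> attrs = new LinkedHashMap<>();

    /** Creates the class attribute, or appends to it if it already exists. */
    void addClass(String classname) {
        String existing = attrs.get("class");
        attrs.put("class", existing == null ? classname : existing + " " + classname);
    }

    public static void main(String[] args) {
        AddClassSketch node = new AddClassSketch();
        node.addClass("c1");
        node.addClass("c2");
        System.out.println(node.attrs.get("class")); // prints "c1 c2"
    }
}
```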
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
- See Also:
Object.toString()
-
getAdapter
protected org.w3c.dom.Node getAdapter()
Returns a DOM Node which wraps the current tidy Node.
- Returns:
- org.w3c.dom.Node instance
-
cloneNode
protected Node cloneNode(boolean deep)
Clone this node.
- Parameters:
deep - if true, deep clone the node (also clones all the contained nodes)
- Returns:
- cloned node
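The difference between a shallow and a deep clone can be sketched as below. `CloneSketch` is a hypothetical class that copies only a name; what the real `cloneNode` copies (attributes, text span, flags) is not spelled out here, so that part is left out.

```java
/** Hypothetical stand-in contrasting shallow and deep cloning of a node. */
final class CloneSketch {
    CloneSketch parent, prev, next, content, last;
    final String name;

    CloneSketch(String name) {
        this.name = name;
    }

    /** Helper for building a small tree. */
    void append(CloneSketch child) {
        child.parent = this;
        child.prev = last;
        if (last != null) last.next = child; else content = child;
        last = child;
    }

    /** Returns a detached copy; with deep == true the children are copied recursively. */
    CloneSketch cloneNode(boolean deep) {
        CloneSketch copy = new CloneSketch(name);
        if (deep) {
            for (CloneSketch c = content; c != null; c = c.next) {
                copy.append(c.cloneNode(true));
            }
        }
        return copy;
    }
}
```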
-
setType
protected void setType(short newType)
Setter for node type.
- Parameters:
newType - a valid node type constant
-
isJavaScript
public boolean isJavaScript()
Used to check script node for script language.
- Returns:
true if the script node contains javascript
-
expectsContent
public boolean expectsContent()
Does the node expect contents?
- Returns:
false if this node should be empty
-
-