Class Xhtml5BaseParser
- java.lang.Object
-
- org.apache.maven.doxia.parser.AbstractParser
-
- org.apache.maven.doxia.parser.AbstractXmlParser
-
- org.apache.maven.doxia.parser.Xhtml5BaseParser
-
- All Implemented Interfaces:
MacroExecutor,HtmlMarkup,Markup,XmlMarkup,Parser
- Direct Known Subclasses:
Xhtml1BaseParser
public class Xhtml5BaseParser extends AbstractXmlParser implements HtmlMarkup
Common base parser for XHTML5 (now HTML Living standard, XML syntax) elements and attributes.
- See Also:
- HTML Living standard, history
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.maven.doxia.parser.AbstractXmlParser
AbstractXmlParser.CachedFileEntityResolver
-
-
Field Summary
Fields
protected boolean isBeginningOfLineInsideBlock
If true, the next text event is at the beginning of a line inside a block element, i.e.
-
Fields inherited from interface org.apache.maven.doxia.markup.HtmlMarkup
A, ABBR, ADDRESS, AREA, ARTICLE, ASIDE, AUDIO, B, BASE, BDI, BDO, BLOCKQUOTE, BODY, BR, BUTTON, CANVAS, CAPTION, CDATA_TYPE, CITE, CODE, COL, COLGROUP, COMMAND, DATA, DATALIST, DD, DEL, DETAILS, DFN, DIALOG, DIV, DL, DT, EM, EMBED, ENTITY_TYPE, FIELDSET, FIGCAPTION, FIGURE, FOOTER, FORM, H1, H2, H3, H4, H5, H6, HEAD, HEADER, HGROUP, HR, HTML, I, IFRAME, IMG, INPUT, INS, KBD, KEYGEN, LABEL, LEGEND, LI, LINK, MAIN, MAP, MARK, MENU, MENUITEM, META, METER, NAV, NOSCRIPT, OBJECT, OL, OPTGROUP, OPTION, OUTPUT, P, PARAM, PICTURE, PRE, PROGRESS, Q, RB, RP, RT, RTC, RUBY, S, SAMP, SCRIPT, SECTION, SELECT, SMALL, SOURCE, SPAN, STRONG, STYLE, SUB, SUMMARY, SUP, SVG, TABLE, TAG_TYPE_END, TAG_TYPE_SIMPLE, TAG_TYPE_START, TBODY, TD, TEMPLATE, TEXTAREA, TFOOT, TH, THEAD, TIME, TITLE, TR, TRACK, U, UL, VAR, VIDEO, WBR
-
Fields inherited from interface org.apache.maven.doxia.markup.Markup
COLON, EOL, EQUAL, GREATER_THAN, LEFT_CURLY_BRACKET, LEFT_SQUARE_BRACKET, LESS_THAN, MINUS, PLUS, QUOTE, RIGHT_CURLY_BRACKET, RIGHT_SQUARE_BRACKET, SEMICOLON, SLASH, SPACE, STAR
-
Fields inherited from interface org.apache.maven.doxia.parser.Parser
TXT_TYPE, UNKNOWN_TYPE, XML_TYPE
-
Fields inherited from interface org.apache.maven.doxia.markup.XmlMarkup
BANG, CDATA, DOCTYPE_START, ENTITY_START, XML_NAMESPACE
-
-
Constructor Summary
Constructors
Xhtml5BaseParser()
-
Method Summary
Methods
protected boolean baseEndTag(java.lang.String elementName, SinkEventAttributeSet attribs, org.apache.maven.doxia.sink.Sink sink)
protected boolean baseEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink)
Goes through a common list of possible html end tags.
protected boolean baseStartTag(java.lang.String elementName, SinkEventAttributeSet attribs, org.apache.maven.doxia.sink.Sink sink)
protected boolean baseStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink)
Goes through a common list of possible html5 start tags.
protected void consecutiveSections(int newLevel, org.apache.maven.doxia.sink.Sink sink, SinkEventAttributeSet attribs)
Deprecated. Use emitHeadingSections(int, Sink, boolean) instead.
protected void emitHeadingSections(int newLevel, org.apache.maven.doxia.sink.Sink sink, boolean enforceNewSection)
Make sure sections are nested consecutively and correctly inserted for the given heading level.
protected int getSectionLevel()
Return the current section level.
protected void handleCdsect(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink)
Handles CDATA sections.
protected void handleComment(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink)
Handles comments.
protected void handleEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink)
Goes through the possible end tags.
protected void handleStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink)
Goes through the possible start tags.
protected void handleText(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink)
Handles text events.
protected void init()
Initialize the parser.
protected void initXmlParser(org.codehaus.plexus.util.xml.pull.XmlPullParser parser)
Initializes the parser with custom entities or other options.
protected boolean isScriptBlock()
Checks if we are currently inside a <script> tag.
protected boolean isVerbatim()
Checks if we are currently inside a <pre> tag.
void parse(java.io.Reader source, org.apache.maven.doxia.sink.Sink sink, java.lang.String reference)
Parses the given source model and emits Doxia events into the given sink.
protected void processInsignificantLineBreaks(org.apache.maven.doxia.sink.Sink sink, java.lang.String text)
Process all line-breaks in the given text which are not significant for the output, i.e.
protected void setSectionLevel(int newLevel)
Set the current section level.
protected java.lang.String validAnchor(java.lang.String id)
Checks if the given id is a valid Doxia id and if not, returns a transformed one.
protected void verbatim()
Start verbatim mode.
protected void verbatim_()
Stop verbatim mode.
-
Methods inherited from class org.apache.maven.doxia.parser.AbstractXmlParser
getAddDefaultEntities, getAttributesFromParser, getLocalEntities, getText, getType, handleEntity, handleUnknown, handleUnknown, isCollapsibleWhitespace, isIgnorableWhitespace, isTrimmableWhitespace, isValidate, setAddDefaultEntities, setCollapsibleWhitespace, setIgnorableWhitespace, setTrimmableWhitespace, setValidate
-
Methods inherited from class org.apache.maven.doxia.parser.AbstractParser
addSinkWrapperFactory, doxiaVersion, executeMacro, getBasedir, getMacroManager, getSinkWrapperFactories, getWrappedSink, isEmitAnchorsForIndexableEntries, isEmitComments, isSecondParsing, parse, parse, parse, setEmitAnchorsForIndexableEntries, setEmitComments, setMacroExecutor, setSecondParsing
-
-
-
-
Method Detail
-
parse
public void parse(java.io.Reader source, org.apache.maven.doxia.sink.Sink sink, java.lang.String reference) throws ParseException
Description copied from interface: Parser
Parses the given source model and emits Doxia events into the given sink.
- Specified by:
parse in interface Parser
- Overrides:
parse in class AbstractXmlParser
- Parameters:
source - not null reader that provides the source document.
sink - A sink that consumes the Doxia events.
reference - a string identifying the source (for file based documents the source file path)
- Throws:
ParseException - if the model could not be parsed.
-
initXmlParser
protected void initXmlParser(org.codehaus.plexus.util.xml.pull.XmlPullParser parser) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
Initializes the parser with custom entities or other options. Adds all XHTML (HTML 5.2) entities to the parser so that they can be recognized and resolved without an additional DTD.
- Overrides:
initXmlParser in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem initializing the parser
-
baseStartTag
protected boolean baseStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink)
Goes through a common list of possible html5 start tags. These include only tags that can go into the body of an xhtml5 document and so should be re-usable by different xhtml-based parsers.
The currently handled tags are:
<article>, <nav>, <aside>, <section>, <h1>, <h2>, <h3>, <h4>, <h5>, <header>, <main>, <footer>, <em>, <strong>, <small>, <s>, <cite>, <q>, <dfn>, <abbr>, <i>, <b>, <code>, <samp>, <kbd>, <sub>, <sup>, <u>, <mark>, <ruby>, <rb>, <rt>, <rtc>, <rp>, <bdi>, <bdo>, <span>, <ins>, <del>, <p>, <pre>, <ul>, <ol>, <li>, <dl>, <dt>, <dd>, <a>, <table>, <tr>, <th>, <td>, <caption>, <br/>, <wbr/>, <hr/>, <img/>.
- Parameters:
parser - A parser.
sink - the sink to receive the events.
- Returns:
- True if the event has been handled by this method, i.e. the tag was recognized, false otherwise.
-
baseStartTag
protected boolean baseStartTag(java.lang.String elementName, SinkEventAttributeSet attribs, org.apache.maven.doxia.sink.Sink sink)
-
baseEndTag
protected boolean baseEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink)
Goes through a common list of possible html end tags. These should be re-usable by different xhtml-based parsers. The tags handled here are the same as for baseStartTag(XmlPullParser,Sink), except for the empty elements (<br/>, <hr/>, <img/>).
- Parameters:
parser - A parser.
sink - the sink to receive the events.
- Returns:
- True if the event has been handled by this method, false otherwise.
-
baseEndTag
protected boolean baseEndTag(java.lang.String elementName, SinkEventAttributeSet attribs, org.apache.maven.doxia.sink.Sink sink)
-
handleStartTag
protected void handleStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException, MacroExecutionException
Goes through the possible start tags. Just calls baseStartTag(XmlPullParser,Sink); implementing parsers should override this to include additional tags.
- Specified by:
handleStartTag in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
MacroExecutionException - if there's a problem executing a macro
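The override pattern described here can be sketched without the Doxia classes. The following is a minimal, hypothetical model: BaseTagHandlerSketch and ExtendedTagHandlerSketch are stand-in names invented for illustration, element names are passed as plain strings instead of an XmlPullParser, and emitted events are collected in a list instead of a Sink. It only shows how a subclass might handle its own tags first and fall back to the boolean-returning base method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Stand-in for the base parser; NOT the Doxia class.
class BaseTagHandlerSketch {
    // A small, illustrative subset of the tags listed for baseStartTag.
    private static final Set<String> BASE_TAGS =
            new HashSet<>(Arrays.asList("p", "em", "strong", "ul", "li"));

    // Collected events stand in for calls on a sink.
    protected final List<String> events = new ArrayList<>();

    // Same contract as documented: true if the tag was recognized and handled.
    protected boolean baseStartTag(String elementName) {
        if (BASE_TAGS.contains(elementName)) {
            events.add("base:" + elementName);
            return true;
        }
        return false;
    }

    // Default behaviour: just delegate to the common handler.
    protected void handleStartTag(String elementName) {
        baseStartTag(elementName);
    }
}

// Hypothetical subclass that adds one extra tag on top of the common list.
class ExtendedTagHandlerSketch extends BaseTagHandlerSketch {
    @Override
    protected void handleStartTag(String elementName) {
        if ("video".equals(elementName)) {
            events.add("custom:" + elementName);   // tag handled by the subclass itself
        } else if (!baseStartTag(elementName)) {
            events.add("unknown:" + elementName);  // neither subclass nor base knows it
        }
    }

    static List<String> run(String... tags) {
        ExtendedTagHandlerSketch handler = new ExtendedTagHandlerSketch();
        for (String tag : tags) {
            handler.handleStartTag(tag);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        System.out.println(run("p", "video", "blink"));
    }
}
```

Checking the subclass's own tags before delegating keeps the common list reusable, and the boolean result tells the subclass when a tag still needs separate treatment.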
-
handleEndTag
protected void handleEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException, MacroExecutionException
Goes through the possible end tags. Just calls baseEndTag(XmlPullParser,Sink); implementing parsers should override this to include additional tags.
- Specified by:
handleEndTag in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
MacroExecutionException - if there's a problem executing a macro
-
handleText
protected void handleText(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
Description copied from class: AbstractXmlParser
Handles text events.
This is a default implementation: if the parser points to a non-empty text element, it is emitted as a text event into the specified sink.
- Overrides:
handleText in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events. Not null.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
-
processInsignificantLineBreaks
protected void processInsignificantLineBreaks(org.apache.maven.doxia.sink.Sink sink, java.lang.String text)
Process all line-breaks in the given text which are not significant for the output, i.e. all line-breaks which are not within a verbatim block and are at the beginning of the given text. In addition it emits information about the whitespace characters following the line-breaks as they may be relevant for the output (e.g. for indentation).
- Parameters:
sink - the sink to receive the events.
text - the text to process.
-
handleComment
protected void handleComment(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
Description copied from class: AbstractXmlParser
Handles comments.
This is a default implementation: all data are emitted as comment events into the specified sink.
- Overrides:
handleComment in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events. Not null.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
-
handleCdsect
protected void handleCdsect(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, org.apache.maven.doxia.sink.Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
Description copied from class: AbstractXmlParser
Handles CDATA sections.
This is a default implementation: all data are emitted as text events into the specified sink.
- Overrides:
handleCdsect in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events. Not null.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
-
consecutiveSections
@Deprecated protected void consecutiveSections(int newLevel, org.apache.maven.doxia.sink.Sink sink, SinkEventAttributeSet attribs)
Deprecated. Use emitHeadingSections(int, Sink, boolean) instead.
Shortcut for emitHeadingSections(int, Sink, boolean) with last argument being true.
- Parameters:
newLevel -
sink -
attribs -
-
emitHeadingSections
protected void emitHeadingSections(int newLevel, org.apache.maven.doxia.sink.Sink sink, boolean enforceNewSection)
Make sure sections are nested consecutively and correctly inserted for the given heading level.
HTML5 heading tags H1 to H5 imply same level sections in Sink API (compare with Sink.sectionTitle(int, SinkEventAttributes)). However, (X)HTML5 allows headings without explicit surrounding section elements and is also less strict with non-consecutive heading levels. This method closes open sections that were added for previous headings and/or opens the sections necessary for the new heading level. At least one section needs to be opened directly prior to the heading due to Sink API restrictions.
For instance, if the following sequence is parsed:
<h2></h2> <h5></h5>
we have to insert two section starts before we open the <h5>. In the following sequence
<h5></h5> <h2></h2>
we have to close two sections before we open the <h2>.
The current heading level is set to newLevel afterwards.
- Parameters:
newLevel - the new section level, all upper levels have to be closed.
sink - the sink to receive the events.
enforceNewSection - whether to enforce a new section or not
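The close-then-open behaviour can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not the Doxia implementation: HeadingSectionsSketch is a hypothetical class, events are recorded as strings (sectionN for a start, sectionN_ for an end, following the underscore naming seen in verbatim_()), and enforceNewSection is assumed to mean that an already open section of the same level is closed and reopened. The model also counts the section that directly wraps the heading, so the number of starts and ends it reports may differ from the counts in the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of heading-driven section nesting; NOT the Doxia code.
final class HeadingSectionsSketch {
    private int sectionLevel = 0;                       // 0 means no section is open
    private final List<String> events = new ArrayList<>();

    void emitHeadingSections(int newLevel, boolean enforceNewSection) {
        // Assumption: when a new section is enforced, an open section of the
        // same level is closed too, so that it can be reopened below.
        int closeDownTo = enforceNewSection ? newLevel - 1 : newLevel;
        while (sectionLevel > closeDownTo) {
            events.add("section" + sectionLevel + "_");  // close deeper sections first
            sectionLevel--;
        }
        while (sectionLevel < newLevel) {
            sectionLevel++;
            events.add("section" + sectionLevel);        // open missing levels in order
        }
        // sectionLevel now equals newLevel
    }

    void heading(int level) {
        emitHeadingSections(level, true);
        events.add("h" + level);
    }

    static List<String> run(int... headingLevels) {
        HeadingSectionsSketch sketch = new HeadingSectionsSketch();
        for (int level : headingLevels) {
            sketch.heading(level);
        }
        return sketch.events;
    }

    public static void main(String[] args) {
        System.out.println(run(2, 5));
        System.out.println(run(5, 2));
    }
}
```

Tracking a single integer level is enough here because sections are always opened and closed one level at a time, which keeps the emitted nesting consecutive even when the heading levels in the source are not.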
-
getSectionLevel
protected int getSectionLevel()
Return the current section level.
- Returns:
- the current section level.
-
setSectionLevel
protected void setSectionLevel(int newLevel)
Set the current section level.
- Parameters:
newLevel - the new section level.
-
verbatim_
protected void verbatim_()
Stop verbatim mode.
-
verbatim
protected void verbatim()
Start verbatim mode.
-
isVerbatim
protected boolean isVerbatim()
Checks if we are currently inside a <pre> tag.
- Returns:
- true if we are currently in verbatim mode.
-
isScriptBlock
protected boolean isScriptBlock()
Checks if we are currently inside a <script> tag.
- Returns:
- true if we are currently inside <script> tags.
- Since:
- 1.1.1.
-
validAnchor
protected java.lang.String validAnchor(java.lang.String id)
Checks if the given id is a valid Doxia id and if not, returns a transformed one.
- Parameters:
id - The id to validate.
- Returns:
- A transformed id or the original id if it was already valid.
- See Also:
DoxiaUtils.encodeId(String)
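The validate-or-transform contract can be sketched in isolation. Everything in the following example is an assumption made for illustration: ValidAnchorSketch is a hypothetical class, the validity rule (an ASCII letter followed by letters, digits, '_', '.', ':' or '-') and the transformation (replace other characters with '_' and prefix 'a' if needed) are placeholders, and the real rules are whatever DoxiaUtils defines. Only the shape of the method is the point: a valid id is returned unchanged, anything else is transformed.

```java
import java.util.regex.Pattern;

// Illustrative stand-in; the real validity rule and encoding live in DoxiaUtils.
final class ValidAnchorSketch {
    // ASSUMPTION: placeholder rule, chosen only to make the example concrete.
    private static final Pattern VALID_ID = Pattern.compile("[A-Za-z][A-Za-z0-9_.:-]*");

    static boolean isValidId(String id) {
        return id != null && VALID_ID.matcher(id).matches();
    }

    // Hypothetical transformation; expects a non-null id.
    static String encodeId(String id) {
        String cleaned = id.trim().replaceAll("[^A-Za-z0-9_.:-]", "_");
        if (cleaned.isEmpty() || !Character.isLetter(cleaned.charAt(0))) {
            cleaned = "a" + cleaned;   // make sure the id starts with a letter
        }
        return cleaned;
    }

    // Returns the original id if it is already valid, otherwise a transformed one.
    static String validAnchor(String id) {
        return isValidId(id) ? id : encodeId(id);
    }

    public static void main(String[] args) {
        System.out.println(validAnchor("intro"));
        System.out.println(validAnchor("my section"));
        System.out.println(validAnchor("1st"));
    }
}
```

Returning the input untouched when it is already valid means existing links to well-formed anchors keep working, while malformed ids are still normalized into something usable.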
-
init
protected void init()
Description copied from class: AbstractParser
Initialize the parser. This is called first by AbstractParser.parse(java.io.Reader, org.apache.maven.doxia.sink.Sink) and can be used to set the parser into a clear state so it can be re-used.
- Overrides:
init in class AbstractParser
-
-