Package org.w3c.tidy
Class Configuration
- java.lang.Object
-
- org.w3c.tidy.Configuration
-
- All Implemented Interfaces:
java.io.Serializable
public class Configuration extends java.lang.Object implements java.io.Serializable
Read configuration file and manage configuration properties. Configuration files associate a property name with a value. The format is that of a Java .properties file.
- Version:
- $Revision$ ($Author$)
- Author:
- Dave Raggett dsr@w3.org, Andy Quick ac.quick@sympatico.ca (translation to Java), Fabrizio Giustina
- See Also:
- Serialized Form
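Because the description says configuration files use the Java .properties format, a small sketch of that format may help. It uses only java.util.Properties from the JDK. The option names (indent-spaces, wrap, tidy-mark) are assumptions modelled on common HTML Tidy options; this page does not list the names Configuration accepts, and the class name TidyPropertiesSketch is purely illustrative.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.TreeSet;

final class TidyPropertiesSketch {

    // Sample file content in .properties format. The option names are
    // assumptions for illustration, not names confirmed by this page.
    public static final String SAMPLE =
        "# lines starting with '#' are comments\n"
        + "indent-spaces = 4\n"
        + "wrap: 72\n"
        + "tidy-mark=no\n";

    /** Parses text in .properties format into a Properties object. */
    public static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // A StringReader should not fail, but load() declares IOException.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load(SAMPLE);
        // Print the parsed name/value pairs in sorted order.
        for (String name : new TreeSet<>(props.stringPropertyNames())) {
            System.out.println(name + " -> " + props.getProperty(name));
        }
    }
}
```

The .properties format accepts '=', ':' or whitespace as the separator between name and value, so all three lines in the sample parse the same way. A Properties object built like this has the type that addProps(java.util.Properties) takes.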
-
-
Field Summary
Fields
Modifier and Type, Field: Description
- protected java.lang.String altText: default text for alt attribute.
- static int ASCII: Deprecated.
- protected boolean asciiChars: convert quotes and dashes to nearest ASCII char.
- static int BIG5: Deprecated.
- protected boolean bodyOnly: output BODY content only.
- protected boolean breakBeforeBR: output newline before br or not?
- protected boolean burstSlides: create slides on each h2 element.
- protected java.lang.String cssPrefix: CSS class naming for -clean option.
- protected int definedTags: track what types of tags user has defined to eliminate unnecessary searches.
- static int DOCTYPE_AUTO: treatment of doctype: auto.
- static int DOCTYPE_LOOSE: treatment of doctype: loose.
- static int DOCTYPE_OMIT: treatment of doctype: omit.
- static int DOCTYPE_STRICT: treatment of doctype: strict.
- static int DOCTYPE_USER: treatment of doctype: user.
- protected int docTypeMode: see doctype property.
- protected java.lang.String docTypeStr: user specified doctype.
- protected boolean dropEmptyParas: discard empty p elements.
- protected boolean dropFontTags: discard presentation tags.
- protected boolean dropProprietaryAttributes: discard proprietary attributes.
- protected boolean dropProprietaryTags: discard proprietary tags.
- protected int duplicateAttrs: Keep first or last duplicate attribute.
- protected boolean emacs: if true format error output for GNU Emacs.
- protected boolean encloseBlockText: if yes text in blocks is wrapped in p's.
- protected boolean encloseBodyText: if yes text at body is wrapped in p's.
- protected java.lang.String errfile: file name to write errors to.
- protected boolean escapeCdata: replace CDATA sections with escaped text.
- protected boolean fixBackslash: fix URLs by replacing \ with /.
- protected boolean fixComments: fix comments with adjacent hyphens.
- protected boolean fixUri: properly escape URLs.
- protected boolean forceOutput: output document even if errors were found.
- protected boolean hideComments: hides all (real) comments in output.
- protected boolean hideEndTags: suppress optional end tags.
- protected boolean htmlOut: output plain-old HTML, even for XHTML input.
- protected boolean indentAttributes: newline+indent before each attribute.
- protected boolean indentCdata: indent CDATA sections.
- protected boolean indentContent: indent content of appropriate tags.
- static int ISO2022: Deprecated.
- protected boolean joinClasses: join multiple class attributes.
- protected boolean joinStyles: join multiple style attributes.
- static int KEEP_FIRST: Keep first duplicate attribute.
- static int KEEP_LAST: Keep last duplicate attribute.
- protected boolean keepFileTimes: if yes last modified time is preserved.
- protected java.lang.String language: RJ language property.
- static int LATIN1: Deprecated.
- protected boolean literalAttribs: if true attributes may use newlines.
- protected boolean logicalEmphasis: replace i by em and b by strong.
- protected boolean lowerLiterals: folds known attribute values to lower case.
- static int MACROMAN: Deprecated.
- protected boolean makeBare: Make bare HTML: remove Microsoft cruft.
- protected boolean makeClean: remove presentational clutter.
- protected boolean ncr: allow numeric character references.
- protected char[] newline: bytes for the newline marker.
- protected boolean numEntities: use numeric entities.
- protected boolean onlyErrors: if true normal output is suppressed.
- protected boolean quiet: no 'Parsing X', guessed DTD or summary.
- protected boolean quoteAmpersand: output naked ampersand as &amp;.
- protected boolean quoteMarks: output " marks as &quot;.
- protected boolean quoteNbsp: output non-breaking space as entity.
- static int RAW: Deprecated. use Tidy.setRawOut(true) for raw output
- protected boolean rawOut: Avoid mapping values > 127 to entities.
- protected boolean replaceColor: replace hex color attribute values with names.
- protected java.lang.String replacementCharEncoding: char encoding used when replacing illegal SGML chars, regardless of specified encoding.
- protected Report report: Report instance.
- static int SHIFTJIS: Deprecated.
- protected int showErrors: number of errors to put out.
- protected boolean showWarnings: show warnings; errors are always shown regardless.
- protected java.lang.String slidestyle: Deprecated. does nothing
- protected boolean smartIndent: does text/block level content affect indentation.
- protected int spaces: default indentation.
- protected int tabsize: default tab size (8).
- protected boolean tidyMark: add meta element indicating tidied doc.
- protected boolean trimEmpty: trim empty elements.
- protected TagTable tt: TagTable associated with this Configuration.
- protected boolean upperCaseAttrs: output attributes in upper not lower case.
- protected boolean upperCaseTags: output tags in upper not lower case.
- static int UTF16: Deprecated.
- static int UTF16BE: Deprecated.
- static int UTF16LE: Deprecated.
- static int UTF8: Deprecated.
- static int WIN1252: Deprecated.
- protected boolean word2000: draconian cleaning for Word2000.
- protected boolean wrapAsp: wrap within ASP pseudo elements.
- protected boolean wrapAttVals: wrap within attribute values.
- protected boolean wrapJste: wrap within JSTE pseudo elements.
- protected int wraplen: default wrap margin (68).
- protected boolean wrapPhp: wrap within PHP pseudo elements.
- protected boolean wrapScriptlets: wrap within JavaScript string literals.
- protected boolean wrapSection: wrap within CDATA section tags.
- protected boolean writeback: if true then output tidied markup.
- protected boolean xHTML: output extensible HTML.
- protected boolean xmlOut: create output as XML.
- protected boolean xmlPi: add <?xml?> for XML docs.
- protected boolean xmlPIs: If set to yes PIs must end with ?>.
- protected boolean xmlSpace: if set to yes adds xml:space attr as needed.
- protected boolean xmlTags: treat input as XML.
-
Constructor Summary
Constructors
Modifier, Constructor: Description
- protected Configuration(Report report): Instantiates a new Configuration.
-
Method Summary
Modifier and Type, Method: Description
- void addProps(java.util.Properties p): adds configuration Properties.
- void adjust(): Ensure that config is self consistent.
- protected java.lang.String convertCharEncoding(int code): Convert a char encoding from the deprecated tidy constant to a standard java encoding name.
- protected java.lang.String getInCharEncodingName(): Getter for inCharEncodingName.
- protected java.lang.String getOutCharEncodingName(): Getter for outCharEncodingName.
- static boolean isKnownOption(java.lang.String name): Is the given String a valid configuration flag?
- void parseFile(java.lang.String filename): Parses a property file.
- void printConfigOptions(java.io.Writer errout, boolean showActualConfiguration): prints available configuration options.
- protected void setInCharEncoding(int encoding): Deprecated. use setInCharEncodingName(String)
- protected void setInCharEncodingName(java.lang.String encoding): Setter for inCharEncodingName.
- protected void setInOutEncodingName(java.lang.String encoding): Setter for inOutCharEncodingName.
- protected void setOutCharEncoding(int encoding): Deprecated. use setOutCharEncodingName(String)
- protected void setOutCharEncodingName(java.lang.String encoding): Setter for outCharEncodingName.
-
-
-
Field Detail
-
RAW
@Deprecated public static final int RAW
Deprecated. use Tidy.setRawOut(true) for raw output
character encoding = RAW.
- See Also:
- Constant Field Values
-
ASCII
@Deprecated public static final int ASCII
Deprecated.
character encoding = ASCII.
- See Also:
- Constant Field Values
-
LATIN1
@Deprecated public static final int LATIN1
Deprecated.
character encoding = LATIN1.
- See Also:
- Constant Field Values
-
UTF8
@Deprecated public static final int UTF8
Deprecated.
character encoding = UTF8.
- See Also:
- Constant Field Values
-
ISO2022
@Deprecated public static final int ISO2022
Deprecated.
character encoding = ISO2022.
- See Also:
- Constant Field Values
-
MACROMAN
@Deprecated public static final int MACROMAN
Deprecated.
character encoding = MACROMAN.
- See Also:
- Constant Field Values
-
UTF16LE
@Deprecated public static final int UTF16LE
Deprecated.
character encoding = UTF16LE.
- See Also:
- Constant Field Values
-
UTF16BE
@Deprecated public static final int UTF16BE
Deprecated.
character encoding = UTF16BE.
- See Also:
- Constant Field Values
-
UTF16
@Deprecated public static final int UTF16
Deprecated.
character encoding = UTF16.
- See Also:
- Constant Field Values
-
WIN1252
@Deprecated public static final int WIN1252
Deprecated.
character encoding = WIN1252.
- See Also:
- Constant Field Values
-
BIG5
@Deprecated public static final int BIG5
Deprecated.
character encoding = BIG5.
- See Also:
- Constant Field Values
-
SHIFTJIS
@Deprecated public static final int SHIFTJIS
Deprecated.
character encoding = SHIFTJIS.
- See Also:
- Constant Field Values
-
DOCTYPE_OMIT
public static final int DOCTYPE_OMIT
treatment of doctype: omit. TODO should be an enumeration DocTypeMode
- See Also:
- Constant Field Values
-
DOCTYPE_AUTO
public static final int DOCTYPE_AUTO
treatment of doctype: auto.
- See Also:
- Constant Field Values
-
DOCTYPE_STRICT
public static final int DOCTYPE_STRICT
treatment of doctype: strict.
- See Also:
- Constant Field Values
-
DOCTYPE_LOOSE
public static final int DOCTYPE_LOOSE
treatment of doctype: loose.
- See Also:
- Constant Field Values
-
DOCTYPE_USER
public static final int DOCTYPE_USER
treatment of doctype: user.
- See Also:
- Constant Field Values
-
KEEP_LAST
public static final int KEEP_LAST
Keep last duplicate attribute. TODO should be an enumeration DupAttrMode
- See Also:
- Constant Field Values
-
KEEP_FIRST
public static final int KEEP_FIRST
Keep first duplicate attribute.
- See Also:
- Constant Field Values
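The TODO notes on DOCTYPE_OMIT and KEEP_LAST suggest replacing these int constants with enumerations named DocTypeMode and DupAttrMode. A minimal sketch of what such enums might look like follows. Nothing here exists in the documented class: the enum members mirror the constant names, and parseDocType, including its treatment of "transitional" as a synonym for loose and of any other value as a user-specified doctype, is an assumption for illustration.

```java
import java.util.Locale;

final class ModeEnumsSketch {

    /** Hypothetical replacement for the DOCTYPE_* int constants. */
    public enum DocTypeMode { OMIT, AUTO, STRICT, LOOSE, USER }

    /** Hypothetical replacement for KEEP_LAST and KEEP_FIRST. */
    public enum DupAttrMode { KEEP_LAST, KEEP_FIRST }

    /**
     * Maps a doctype property value to a mode. Assumption: any value that is
     * not one of the recognised keywords is a user-specified doctype string.
     */
    public static DocTypeMode parseDocType(String value) {
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "omit":
                return DocTypeMode.OMIT;
            case "auto":
                return DocTypeMode.AUTO;
            case "strict":
                return DocTypeMode.STRICT;
            case "loose":
            case "transitional": // assumed synonym for loose
                return DocTypeMode.LOOSE;
            default:
                return DocTypeMode.USER;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseDocType("strict"));
        System.out.println(parseDocType("-//ACME//DTD HTML 3.14159//EN"));
        System.out.println(DupAttrMode.KEEP_FIRST);
    }
}
```

An enum gives compile-time checking that an int field such as docTypeMode or duplicateAttrs cannot, since an int field accepts any value, including ones that match no constant.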
-
spaces
protected int spaces
default indentation.
-
wraplen
protected int wraplen
default wrap margin (68).
-
tabsize
protected int tabsize
default tab size (8).
-
docTypeMode
protected int docTypeMode
see doctype property.
-
duplicateAttrs
protected int duplicateAttrs
Keep first or last duplicate attribute.
-
altText
protected java.lang.String altText
default text for alt attribute.
-
slidestyle
@Deprecated protected java.lang.String slidestyle
Deprecated. does nothing
style sheet for slides.
-
language
protected java.lang.String language
RJ language property.
-
docTypeStr
protected java.lang.String docTypeStr
user specified doctype.
-
errfile
protected java.lang.String errfile
file name to write errors to.
-
writeback
protected boolean writeback
if true then output tidied markup.
-
onlyErrors
protected boolean onlyErrors
if true normal output is suppressed.
-
showWarnings
protected boolean showWarnings
show warnings; errors are always shown regardless.
-
quiet
protected boolean quiet
no 'Parsing X', guessed DTD or summary.
-
indentContent
protected boolean indentContent
indent content of appropriate tags.
-
smartIndent
protected boolean smartIndent
does text/block level content affect indentation.
-
hideEndTags
protected boolean hideEndTags
suppress optional end tags.
-
xmlTags
protected boolean xmlTags
treat input as XML.
-
xmlOut
protected boolean xmlOut
create output as XML.
-
xHTML
protected boolean xHTML
output extensible HTML.
-
htmlOut
protected boolean htmlOut
output plain-old HTML, even for XHTML input. Yes means set explicitly.
-
xmlPi
protected boolean xmlPi
add <?xml?> for XML docs.
-
upperCaseTags
protected boolean upperCaseTags
output tags in upper not lower case.
-
upperCaseAttrs
protected boolean upperCaseAttrs
output attributes in upper not lower case.
-
makeClean
protected boolean makeClean
remove presentational clutter.
-
makeBare
protected boolean makeBare
Make bare HTML: remove Microsoft cruft.
-
logicalEmphasis
protected boolean logicalEmphasis
replace i by em and b by strong.
-
dropFontTags
protected boolean dropFontTags
discard presentation tags.
-
dropProprietaryAttributes
protected boolean dropProprietaryAttributes
discard proprietary attributes.
-
dropProprietaryTags
protected boolean dropProprietaryTags
discard proprietary tags.
-
dropEmptyParas
protected boolean dropEmptyParas
discard empty p elements.
-
fixComments
protected boolean fixComments
fix comments with adjacent hyphens.
-
trimEmpty
protected boolean trimEmpty
trim empty elements.
-
breakBeforeBR
protected boolean breakBeforeBR
output newline before br or not?
-
burstSlides
protected boolean burstSlides
create slides on each h2 element.
-
numEntities
protected boolean numEntities
use numeric entities.
-
quoteMarks
protected boolean quoteMarks
output " marks as &quot;.
-
quoteNbsp
protected boolean quoteNbsp
output non-breaking space as entity.
-
quoteAmpersand
protected boolean quoteAmpersand
output naked ampersand as &amp;.
-
wrapAttVals
protected boolean wrapAttVals
wrap within attribute values.
-
wrapScriptlets
protected boolean wrapScriptlets
wrap within JavaScript string literals.
-
wrapSection
protected boolean wrapSection
wrap within CDATA section tags.
-
wrapAsp
protected boolean wrapAsp
wrap within ASP pseudo elements.
-
wrapJste
protected boolean wrapJste
wrap within JSTE pseudo elements.
-
wrapPhp
protected boolean wrapPhp
wrap within PHP pseudo elements.
-
fixBackslash
protected boolean fixBackslash
fix URLs by replacing \ with /.
-
indentAttributes
protected boolean indentAttributes
newline+indent before each attribute.
-
xmlPIs
protected boolean xmlPIs
If set to yes PIs must end with ?>.
-
xmlSpace
protected boolean xmlSpace
if set to yes adds xml:space attr as needed.
-
encloseBodyText
protected boolean encloseBodyText
if yes text at body is wrapped in p's.
-
encloseBlockText
protected boolean encloseBlockText
if yes text in blocks is wrapped in p's.
-
keepFileTimes
protected boolean keepFileTimes
if yes last modified time is preserved.
-
word2000
protected boolean word2000
draconian cleaning for Word2000.
-
tidyMark
protected boolean tidyMark
add meta element indicating tidied doc.
-
emacs
protected boolean emacs
if true format error output for GNU Emacs.
-
literalAttribs
protected boolean literalAttribs
if true attributes may use newlines.
-
bodyOnly
protected boolean bodyOnly
output BODY content only.
-
fixUri
protected boolean fixUri
properly escape URLs.
-
lowerLiterals
protected boolean lowerLiterals
folds known attribute values to lower case.
-
replaceColor
protected boolean replaceColor
replace hex color attribute values with names.
-
hideComments
protected boolean hideComments
hides all (real) comments in output.
-
indentCdata
protected boolean indentCdata
indent CDATA sections.
-
forceOutput
protected boolean forceOutput
output document even if errors were found.
-
showErrors
protected int showErrors
number of errors to put out.
-
asciiChars
protected boolean asciiChars
convert quotes and dashes to nearest ASCII char.
-
joinClasses
protected boolean joinClasses
join multiple class attributes.
-
joinStyles
protected boolean joinStyles
join multiple style attributes.
-
escapeCdata
protected boolean escapeCdata
replace CDATA sections with escaped text.
-
ncr
protected boolean ncr
allow numeric character references.
-
cssPrefix
protected java.lang.String cssPrefix
CSS class naming for -clean option.
-
replacementCharEncoding
protected java.lang.String replacementCharEncoding
char encoding used when replacing illegal SGML chars, regardless of specified encoding.
-
tt
protected TagTable tt
TagTable associated with this Configuration.
-
report
protected Report report
Report instance. Used for messages.
-
definedTags
protected int definedTags
track what types of tags user has defined to eliminate unnecessary searches.
-
newline
protected char[] newline
bytes for the newline marker.
-
rawOut
protected boolean rawOut
Avoid mapping values > 127 to entities.
-
-
Constructor Detail
-
Configuration
protected Configuration(Report report)
Instantiates a new Configuration. This method should be called by Tidy only.
- Parameters:
report - Report instance
-
-
Method Detail
-
addProps
public void addProps(java.util.Properties p)
adds configuration Properties.
- Parameters:
p - Properties
-
parseFile
public void parseFile(java.lang.String filename)
Parses a property file.
- Parameters:
filename - file name
-
isKnownOption
public static boolean isKnownOption(java.lang.String name)
Is the given String a valid configuration flag?
- Parameters:
name - configuration parameter name
- Returns:
true if the given String is a valid config option
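A check like isKnownOption is useful for validating a Properties object before applying it. The sketch below shows one way such a lookup could work, using a fixed set of names. It is a stand-in, not the library's implementation: the class KnownOptionSketch, the helper unknownOptions, and the three option names in KNOWN are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

final class KnownOptionSketch {

    // Hypothetical subset of option names; the real set of valid names is
    // held inside Configuration and is not listed on this page.
    private static final Set<String> KNOWN =
        new HashSet<>(Arrays.asList("indent-spaces", "wrap", "tab-size"));

    /** Returns true if the name is in the (hypothetical) set of options. */
    public static boolean isKnownOption(String name) {
        return name != null && KNOWN.contains(name);
    }

    /** Collects, in sorted order, the property names that are not known options. */
    public static List<String> unknownOptions(Properties props) {
        List<String> unknown = new ArrayList<>();
        for (String name : new TreeSet<>(props.stringPropertyNames())) {
            if (!isKnownOption(name)) {
                unknown.add(name);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("wrap", "72");
        props.setProperty("wrapp", "72"); // deliberate typo in the name
        System.out.println("unknown options: " + unknownOptions(props));
    }
}
```

With JTidy on the classpath, the same filtering loop could call the static Configuration.isKnownOption(String) documented above instead of the local stand-in.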
-
adjust
public void adjust()
Ensure that config is self consistent.
-
printConfigOptions
public void printConfigOptions(java.io.Writer errout, boolean showActualConfiguration)
prints available configuration options.
- Parameters:
errout - where to write
showActualConfiguration - print actual configuration values
-
getInCharEncodingName
protected java.lang.String getInCharEncodingName()
Getter for inCharEncodingName.
- Returns:
- Returns the inCharEncodingName.
-
setInCharEncodingName
protected void setInCharEncodingName(java.lang.String encoding)
Setter for inCharEncodingName.
- Parameters:
encoding - The inCharEncodingName to set.
-
getOutCharEncodingName
protected java.lang.String getOutCharEncodingName()
Getter for outCharEncodingName.
- Returns:
- Returns the outCharEncodingName.
-
setOutCharEncodingName
protected void setOutCharEncodingName(java.lang.String encoding)
Setter for outCharEncodingName.
- Parameters:
encoding - The outCharEncodingName to set.
-
setInOutEncodingName
protected void setInOutEncodingName(java.lang.String encoding)
Setter for inOutCharEncodingName.
- Parameters:
encoding - The CharEncodingName to set.
-
setOutCharEncoding
@Deprecated protected void setOutCharEncoding(int encoding)
Deprecated. use setOutCharEncodingName(String)
Setter for outCharEncoding.
- Parameters:
encoding - The outCharEncoding to set.
-
setInCharEncoding
@Deprecated protected void setInCharEncoding(int encoding)
Deprecated. use setInCharEncodingName(String)
Setter for inCharEncoding.
- Parameters:
encoding - The inCharEncoding to set.
-
convertCharEncoding
protected java.lang.String convertCharEncoding(int code)
Convert a char encoding from the deprecated tidy constant to a standard java encoding name.
- Parameters:
code - encoding code
- Returns:
- encoding name
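To illustrate the idea of translating a deprecated int constant into an encoding name, here is a minimal sketch. It is not the library's method: the numeric codes are invented for the example (the actual values appear only under Constant Field Values, which this page does not reproduce), the returned names are the standard charset names every Java platform supports rather than whatever convertCharEncoding returns, and only a subset of the constants is covered.

```java
import java.nio.charset.Charset;

final class CharEncodingSketch {

    // Hypothetical numeric codes, chosen only for this example.
    public static final int ASCII = 1;
    public static final int LATIN1 = 2;
    public static final int UTF8 = 3;
    public static final int UTF16LE = 4;
    public static final int UTF16BE = 5;
    public static final int UTF16 = 6;

    /** Maps a (hypothetical) int code to a standard charset name, or null if unknown. */
    public static String toEncodingName(int code) {
        switch (code) {
            case ASCII:
                return "US-ASCII";
            case LATIN1:
                return "ISO-8859-1";
            case UTF8:
                return "UTF-8";
            case UTF16LE:
                return "UTF-16LE";
            case UTF16BE:
                return "UTF-16BE";
            case UTF16:
                return "UTF-16";
            default:
                return null;
        }
    }

    public static void main(String[] args) {
        for (int code = ASCII; code <= UTF16; code++) {
            String name = toEncodingName(code);
            // Charset.forName confirms the name resolves on this JVM.
            System.out.println(code + " -> " + Charset.forName(name).name());
        }
    }
}
```

Since the int-based setters are deprecated in favour of setInCharEncodingName(String) and setOutCharEncodingName(String), new code would pass the encoding name directly and skip this translation step.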
-
-