Class UTF8Writer
java.lang.Object
java.io.Writer
net.sf.saxon.serialize.UTF8Writer
- All Implemented Interfaces:
Closeable, Flushable, Appendable, AutoCloseable, UnicodeWriter
Specialized buffering UTF-8 writer.
The main reason for a custom version is to allow efficient
buffer recycling; a second benefit is that the encoder has less
overhead when encoding short content (compared to the JDK default
codecs).
- Author:
- Tatu Saloranta. Modified by Michael Kay to enable efficient output of Unicode strings.
-
Field Summary
Fields
Modifier and Type, Field, Description
(package private) int _surrogate - When outputting chars from BMP, surrogate pairs need to be coalesced.
(package private) static final int SURR1_FIRST
(package private) static final int SURR1_LAST
(package private) static final int SURR2_FIRST
(package private) static final int SURR2_LAST
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
void close() - Complete the writing of characters to the result.
void flush() - Flush the contents of any buffers.
void write(char[] cbuf)
void write(char[] cbuf, int off, int len)
void write(int c) - Write a single char.
void write(String str) - Process a supplied string
void write(String str, int off, int len)
void write(UnicodeString chars) - Process a supplied string
void writeAscii(byte[] content) - Write a sequence of ASCII characters.
void writeAscii(byte[] chars, int off, int len) - Write a sequence of ASCII characters.
void writeCodePoint(int codepoint) - Process a single character.
void writeLatin1(byte[] bytes, int off, int len)
void writeRepeatedAscii(byte ch, int repeat) - Write an ASCII character repeatedly.
Methods inherited from class Writer
append, append, append, nullWriter
-
Field Details
-
SURR1_FIRST
static final int SURR1_FIRST
-
SURR1_LAST
static final int SURR1_LAST
-
SURR2_FIRST
static final int SURR2_FIRST
-
SURR2_LAST
static final int SURR2_LAST
-
_surrogate
int _surrogate
When outputting chars from BMP, surrogate pairs need to be coalesced. To do this, both halves of a pair must be known first; since the two halves may arrive separately, temporary storage is needed for the first half.
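The coalescing described above can be illustrated with a minimal sketch. It is not the Saxon implementation: the class `SurrogateCoalescer`, its field `pendingHigh`, and the in-memory sink are hypothetical names chosen for illustration, and the sketch assumes that an unpaired surrogate should be rejected.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of surrogate coalescing; not the Saxon code. */
final class SurrogateCoalescer {
    // First half of a pair carried over from an earlier call, or 0 if none.
    private int pendingHigh = 0;
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    /** Accepts a chunk of UTF-16 chars; a pair may straddle two chunks. */
    void write(char[] chunk) {
        for (char c : chunk) {
            if (pendingHigh != 0) {
                if (!Character.isLowSurrogate(c)) {
                    throw new IllegalArgumentException("Unpaired high surrogate");
                }
                // Both halves are now known: combine them into one code point.
                emit(Character.toCodePoint((char) pendingHigh, c));
                pendingHigh = 0;
            } else if (Character.isHighSurrogate(c)) {
                pendingHigh = c; // hold the first half until the second arrives
            } else if (Character.isLowSurrogate(c)) {
                throw new IllegalArgumentException("Unpaired low surrogate");
            } else {
                emit(c);
            }
        }
    }

    // Encodes one code point as UTF-8 using the JDK encoder, for brevity.
    private void emit(int codePoint) {
        byte[] utf8 = new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8);
        out.write(utf8, 0, utf8.length);
    }

    byte[] toByteArray() {
        return out.toByteArray();
    }
}
```

Calling `write` once with a chunk that ends in a high surrogate and again with a chunk that starts with the matching low surrogate yields the same bytes as encoding the whole string in one step.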
-
-
Constructor Details
-
UTF8Writer
-
UTF8Writer
-
-
Method Details
-
close
Description copied from interface: UnicodeWriter
Complete the writing of characters to the result. The default implementation does nothing.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Specified by:
close in interface UnicodeWriter
- Specified by:
close in class Writer
- Throws:
IOException - if processing fails for any reason
-
flush
Description copied from interface: UnicodeWriter
Flush the contents of any buffers. The default implementation does nothing.
- Specified by:
flush in interface Flushable
- Specified by:
flush in interface UnicodeWriter
- Specified by:
flush in class Writer
- Throws:
IOException - if processing fails for any reason
-
write
- Overrides:
write in class Writer
- Throws:
IOException
-
write
- Specified by:
write in class Writer
- Throws:
IOException
-
writeLatin1
- Throws:
IOException
-
writeAscii
Write a sequence of ASCII characters. The caller is responsible for ensuring that each byte represents a character in the range 1-127.
- Specified by:
writeAscii in interface UnicodeWriter
- Parameters:
content - the content to be written
- Throws:
IOException - if processing fails for any reason
-
writeAscii
Write a sequence of ASCII characters. The caller is responsible for ensuring that each byte represents a character in the range 1-127.
- Parameters:
chars - the characters to be written
off - the offset of the first character to be included
len - the number of characters to be written
- Throws:
IOException
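Because every character in the range 1-127 occupies exactly one byte in UTF-8, with the same value as its ASCII code, ASCII input can be copied to the output without any encoding step. The sketch below illustrates that idea together with a check a caller might run first. The class `AsciiSketch` and the in-memory sink are hypothetical and do not reflect how UTF8Writer is implemented.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical illustration of ASCII pass-through; not the Saxon code. */
final class AsciiSketch {
    /** Returns true if every byte in the slice is in the range 1-127. */
    static boolean isAscii(byte[] bytes, int off, int len) {
        for (int i = off; i < off + len; i++) {
            // Java bytes are signed, so 0x80-0xFF appear as negative values.
            if (bytes[i] <= 0) {
                return false;
            }
        }
        return true;
    }

    /** Copies the slice unchanged: ASCII bytes are already valid UTF-8. */
    static void writeAscii(ByteArrayOutputStream out, byte[] chars, int off, int len) {
        out.write(chars, off, len);
    }
}
```

The check is kept separate from the write because the documented contract places responsibility for the range on the caller.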
-
writeRepeatedAscii
Write an ASCII character repeatedly. Used for serializing whitespace.
- Specified by:
writeRepeatedAscii in interface UnicodeWriter
- Parameters:
ch - the ASCII character to be serialized (must be less than 0x7f)
repeat - the number of occurrences to output
- Throws:
IOException - if it fails
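One plausible way to emit a repeated ASCII character, such as indentation whitespace, is to fill a small scratch buffer once and write it in chunks. The sketch below shows that approach under stated assumptions: the class `RepeatedAsciiSketch`, the chunk size of 64, and the in-memory sink are all hypothetical and are not taken from the Saxon source.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Hypothetical illustration of repeated-character output; not the Saxon code. */
final class RepeatedAsciiSketch {
    private static final int CHUNK = 64; // arbitrary scratch-buffer size

    static void writeRepeatedAscii(ByteArrayOutputStream out, byte ch, int repeat) {
        // Enforce the documented constraint that ch is less than 0x7f.
        if (ch < 0 || ch >= 0x7f) {
            throw new IllegalArgumentException("Not an ASCII character: " + ch);
        }
        if (repeat <= 0) {
            return;
        }
        byte[] chunk = new byte[Math.min(repeat, CHUNK)];
        Arrays.fill(chunk, ch);
        int remaining = repeat;
        while (remaining > 0) {
            int n = Math.min(remaining, chunk.length);
            out.write(chunk, 0, n); // write a full chunk, or the final partial one
            remaining -= n;
        }
    }
}
```

Filling the buffer once avoids a per-character call on the sink, which matters when a serializer writes deep indentation many times.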
-
writeCodePoint
Process a single character. Default implementation wraps the codepoint into a single-character UnicodeString.
- Specified by:
writeCodePoint in interface UnicodeWriter
- Parameters:
codepoint - the character to be processed. Must not be a surrogate
- Throws:
IOException - if processing fails for any reason
-
write
Write a single char.
Note (MHK): Although the Writer interface says that the top half of the int is ignored, this implementation appears to accept a Unicode codepoint which is output as a 4-byte UTF-8 sequence.
- Overrides:
write in class Writer
- Parameters:
c - the char to be written
- Throws:
IOException - If an I/O error occurs
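For reference, the standard UTF-8 encoding rules show why a code point above U+FFFF occupies four bytes. The sketch below applies those rules directly. The class `Utf8EncodeSketch` is hypothetical and written for illustration; it is not a copy of the encoder in UTF8Writer, and rejecting surrogates is an assumption of the sketch.

```java
/** Hypothetical illustration of UTF-8 encoding rules; not the Saxon code. */
final class Utf8EncodeSketch {
    /** Encodes a single Unicode code point as 1 to 4 UTF-8 bytes. */
    static byte[] encode(int cp) {
        if (cp < 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw new IllegalArgumentException("Not an encodable code point: " + cp);
        }
        if (cp < 0x80) {            // 0xxxxxxx
            return new byte[] {(byte) cp};
        }
        if (cp < 0x800) {           // 110xxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xC0 | (cp >> 6)),
                (byte) (0x80 | (cp & 0x3F))
            };
        }
        if (cp < 0x10000) {         // 1110xxxx 10xxxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xE0 | (cp >> 12)),
                (byte) (0x80 | ((cp >> 6) & 0x3F)),
                (byte) (0x80 | (cp & 0x3F))
            };
        }
        return new byte[] {         // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            (byte) (0xF0 | (cp >> 18)),
            (byte) (0x80 | ((cp >> 12) & 0x3F)),
            (byte) (0x80 | ((cp >> 6) & 0x3F)),
            (byte) (0x80 | (cp & 0x3F))
        };
    }
}
```

An argument such as 0x1F600 does not fit in a 16-bit char, so it falls into the last branch and produces a 4-byte sequence.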
-
write
Process a supplied string
- Specified by:
write in interface UnicodeWriter
- Parameters:
chars - the characters to be processed
- Throws:
IOException - if processing fails for any reason
-
write
Description copied from interface: UnicodeWriter
Process a supplied string
- Specified by:
write in interface UnicodeWriter
- Overrides:
write in class Writer
- Parameters:
str - the characters to be processed
- Throws:
IOException - if processing fails for any reason
-
write
- Overrides:
write in class Writer
- Throws:
IOException
-