Package jodd.io
Class UnicodeInputStream
java.lang.Object
java.io.InputStream
jodd.io.UnicodeInputStream
- All Implemented Interfaces:
Closeable, AutoCloseable
Unicode input stream for detecting UTF encodings and reading BOM characters.
Detects the following BOMs:
- UTF-8
- UTF-16BE
- UTF-16LE
- UTF-32BE
- UTF-32LE
-
Field Summary
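Fields (Modifier and Type, Field):
- static final byte[] BOM_UTF16_BE
- static final byte[] BOM_UTF16_LE
- static final byte[] BOM_UTF32_BE
- static final byte[] BOM_UTF32_LE
- static final byte[] BOM_UTF8
- private int BOMSize
- private Charset encoding
- private boolean initialized
- private final PushbackInputStream internalInputStream
- static final int MAX_BOM_SIZE
- private final Charset targetEncoding
-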
Constructor Summary
Constructors (Constructor, Description):
- UnicodeInputStream(InputStream in, Charset targetEncoding): Creates new unicode stream.
-
Method Summary
Methods inherited from class java.io.InputStream
available, mark, markSupported, nullInputStream, read, read, readAllBytes, readNBytes, readNBytes, reset, skip, skipNBytes, transferTo
-
Field Details
-
MAX_BOM_SIZE
public static final int MAX_BOM_SIZE
-
internalInputStream
-
initialized
private boolean initialized -
BOMSize
private int BOMSize -
encoding
-
targetEncoding
-
BOM_UTF32_BE
public static final byte[] BOM_UTF32_BE -
BOM_UTF32_LE
public static final byte[] BOM_UTF32_LE -
BOM_UTF8
public static final byte[] BOM_UTF8 -
BOM_UTF16_BE
public static final byte[] BOM_UTF16_BE -
BOM_UTF16_LE
public static final byte[] BOM_UTF16_LE
-
-
Constructor Details
-
UnicodeInputStream
Creates a new Unicode stream. It works in two modes: detect mode and read mode.
Detect mode is active when no target encoding is specified. In detect mode, the stream tries to detect the encoding from the BOM, if one exists. If there is no BOM, the encoding is not detected.
Read mode is active when a target encoding is set. The stream then reads the optional BOM for the given encoding. If there is no BOM, nothing is skipped.
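Read mode can be illustrated with plain JDK classes. The following is a minimal sketch, not the jodd implementation: the class `BomReadModeSketch` and its methods `bomFor`, `skipBom`, and `readText` are hypothetical names introduced here. It assumes Java 9+ (for `readNBytes` and `readAllBytes`) and a pushback buffer of at least four bytes, and it returns -1 when nothing was skipped, mirroring the convention documented for `getBOMSize`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical helper for illustration; this is not the jodd implementation.
class BomReadModeSketch {

    /** Returns the BOM bytes for a UTF charset, or null for any other charset. */
    static byte[] bomFor(Charset charset) {
        switch (charset.name()) {
            case "UTF-8":    return new byte[] {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
            case "UTF-16BE": return new byte[] {(byte) 0xFE, (byte) 0xFF};
            case "UTF-16LE": return new byte[] {(byte) 0xFF, (byte) 0xFE};
            case "UTF-32BE": return new byte[] {0x00, 0x00, (byte) 0xFE, (byte) 0xFF};
            case "UTF-32LE": return new byte[] {(byte) 0xFF, (byte) 0xFE, 0x00, 0x00};
            default:         return null;
        }
    }

    /**
     * Skips the BOM of the target charset if the stream starts with it.
     * Returns the number of BOM bytes skipped, or -1 if nothing was skipped.
     * The pushback buffer of the stream must hold at least 4 bytes.
     */
    static int skipBom(PushbackInputStream in, Charset target) throws IOException {
        byte[] bom = bomFor(target);
        if (bom == null) {
            return -1;
        }
        byte[] head = new byte[bom.length];
        int n = in.readNBytes(head, 0, head.length);
        if (n == bom.length && Arrays.equals(head, bom)) {
            return bom.length; // BOM consumed, stream now points at the content
        }
        if (n > 0) {
            in.unread(head, 0, n); // no BOM: push everything back
        }
        return -1;
    }

    /** Decodes a byte array with the target charset, dropping its optional BOM. */
    static String readText(byte[] data, Charset target) {
        try (PushbackInputStream in =
                 new PushbackInputStream(new ByteArrayInputStream(data), 4)) {
            skipBom(in, target);
            return new String(in.readAllBytes(), target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this sketch, `readText` returns `"hi"` for the UTF-8 bytes `EF BB BF 68 69` and also for the plain bytes `68 69`. Decoding the first array directly with `new String(bytes, StandardCharsets.UTF_8)` would instead keep the BOM as a leading U+FEFF character.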
-
-
Method Details
-
getDetectedEncoding
Returns the detected UTF encoding, or null if no UTF encoding has been detected (i.e. no BOM). If the stream has not been read yet, it will be initialized first.
-
init
Detects and decodes the encoding from the BOM character. Reads ahead four bytes and checks for BOM marks. Extra bytes are unread back to the stream, so only BOM bytes are skipped.
- Throws:
IOException
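The read-ahead and unread steps described for init can be sketched with a plain `PushbackInputStream`. This is an illustration under stated assumptions, not the jodd source: `BomDetectSketch`, `Result`, `startsWith`, `detect`, and `describe` are hypothetical names, and the code assumes Java 9+ for `readNBytes`. The UTF-32 marks are checked before the UTF-16 marks because the UTF-32LE BOM (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; this is not the jodd implementation.
class BomDetectSketch {

    static final int MAX_BOM_SIZE = 4;

    /** Detection outcome: charset is null and bomSize is -1 when no BOM was found. */
    static final class Result {
        final Charset charset;
        final int bomSize;

        Result(Charset charset, int bomSize) {
            this.charset = charset;
            this.bomSize = bomSize;
        }
    }

    /** True if the first n bytes of head begin with the given BOM (values 0..255). */
    static boolean startsWith(byte[] head, int n, int... bom) {
        if (n < bom.length) {
            return false;
        }
        for (int i = 0; i < bom.length; i++) {
            if ((head[i] & 0xFF) != bom[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads up to four bytes, matches a BOM, and unreads everything after it. */
    static Result detect(PushbackInputStream in) throws IOException {
        byte[] head = new byte[MAX_BOM_SIZE];
        int n = in.readNBytes(head, 0, MAX_BOM_SIZE);

        Charset charset = null;
        int bomSize = -1;
        if (startsWith(head, n, 0x00, 0x00, 0xFE, 0xFF)) {
            charset = Charset.forName("UTF-32BE");
            bomSize = 4;
        } else if (startsWith(head, n, 0xFF, 0xFE, 0x00, 0x00)) {
            charset = Charset.forName("UTF-32LE");
            bomSize = 4;
        } else if (startsWith(head, n, 0xEF, 0xBB, 0xBF)) {
            charset = StandardCharsets.UTF_8;
            bomSize = 3;
        } else if (startsWith(head, n, 0xFE, 0xFF)) {
            charset = StandardCharsets.UTF_16BE;
            bomSize = 2;
        } else if (startsWith(head, n, 0xFF, 0xFE)) {
            charset = StandardCharsets.UTF_16LE;
            bomSize = 2;
        }

        int skipped = Math.max(bomSize, 0);
        if (n > skipped) {
            in.unread(head, skipped, n - skipped); // only the BOM stays consumed
        }
        return new Result(charset, bomSize);
    }

    /** Convenience for experiments: reports charset, BOM size, and the next byte. */
    static String describe(byte[] data) {
        try (PushbackInputStream in =
                 new PushbackInputStream(new ByteArrayInputStream(data), MAX_BOM_SIZE)) {
            Result r = detect(in);
            return r.charset + " bom=" + r.bomSize + " next=" + in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The stream passed to `detect` needs a pushback buffer of at least `MAX_BOM_SIZE` bytes, otherwise `unread` fails. For the bytes `EF BB BF 68 69`, `describe` returns `UTF-8 bom=3 next=104`; for input without a BOM, the charset is reported as `null` and all bytes that were read ahead are pushed back.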
-
close
Closes the input stream. If the stream was not used, the encoding will be unavailable.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Overrides:
close in class InputStream
- Throws:
IOException
-
read
Reads a byte from the stream.
- Specified by:
read in class InputStream
- Throws:
IOException
-
getBOMSize
public int getBOMSize()
Returns the BOM size in bytes. Returns -1 if no BOM was found.
-