Class LZW
java.lang.Object
org.apache.sis.internal.storage.inflater.PixelChannel
org.apache.sis.internal.storage.inflater.CompressionChannel
org.apache.sis.internal.storage.inflater.LZW
- All Implemented Interfaces:
Closeable, AutoCloseable, Channel, ReadableByteChannel
Inflater for values encoded with the LZW compression.
This compression is described in section 13 of the TIFF 6 specification, "LZW Compression".
Each code is written using at least 9 bits and at most 12 bits.
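The dictionary logic of the algorithm can be illustrated with a minimal sketch that works on codes already unpacked from the bit stream. This is not the code of this class: the class name LzwCodeSketch, the decode and append helpers, and the list-of-arrays table are hypothetical, and the values 256 and 257 for the clear and end-of-information codes are taken from the TIFF 6 specification. The sketch also ignores the 12-bit limit on the dictionary size.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LzwCodeSketch {
    static final int CLEAR_CODE = 256;             // from the TIFF 6 specification
    static final int EOI_CODE = 257;               // from the TIFF 6 specification
    static final int FIRST_ADAPTATIVE_CODE = 258;  // first code after the predefined ones

    /** Decodes a sequence of LZW codes that were already extracted from the bit stream. */
    static byte[] decode(int[] codes) {
        List<byte[]> table = new ArrayList<>();    // hypothetical table, one array per code
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] previous = null;
        for (int code : codes) {
            if (code == EOI_CODE) break;
            if (code == CLEAR_CODE) {
                // Reset to the predefined single-byte entries (256 and 257 are placeholders).
                table.clear();
                for (int i = 0; i < FIRST_ADAPTATIVE_CODE; i++) {
                    table.add(new byte[] {(byte) i});
                }
                previous = null;
                continue;
            }
            byte[] entry;
            if (code >= 0 && code < table.size()) {
                // Known code: the new entry is the previous string plus the first byte of this one.
                entry = table.get(code);
                if (previous != null) table.add(append(previous, entry[0]));
            } else if (previous != null && code == table.size()) {
                // Code not yet in the table: it must be the previous string plus its own first byte.
                entry = append(previous, previous[0]);
                table.add(entry);
            } else {
                throw new IllegalArgumentException("Unexpected code: " + code);
            }
            out.write(entry, 0, entry.length);
            previous = entry;
        }
        return out.toByteArray();
    }

    private static byte[] append(byte[] prefix, byte last) {
        byte[] result = Arrays.copyOf(prefix, prefix.length + 1);
        result[prefix.length] = last;
        return result;
    }

    public static void main(String[] args) {
        int[] codes = {CLEAR_CODE, 'A', 258, EOI_CODE};
        System.out.println(new String(decode(codes), StandardCharsets.US_ASCII)); // prints "AAA"
    }
}
```

Allocating one array per code keeps the sketch short but copies every string; the fields documented below describe a more compact layout that avoids most of those copies.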
Legal note
Unisys's patent on the LZW algorithm expired in 2004.
- Since: 1.1
- Version: 1.3
-
Field Summary
Fields
Modifier and Type, Field: Description
private static final int CLEAR_CODE: A 12-bit code meaning that we have exhausted the 4093 available codes and must reset the table to the initial set of 9-bit codes.
private int codeSize: Number of bits to read for the next code.
private final int[] entriesForCodes: Pointers to byte sequences for a code in the entriesForCodes array.
private static final int EOI_CODE: End of information.
private static final int FIRST_ADAPTATIVE_CODE: First code which is not one of the predefined codes.
private int indexOfFreeEntry: Index of the next entry available in entriesForCodes.
private int indexOfFreeString: Index of the next byte available in stringsFromCode.
private static final int LENGTH_MASK: The mask to apply on an entriesForCodes element for getting the length.
private static final int LENGTH_MASK_FOR_ALLOCATE: A mask used for detecting when a new allocation is required.
private static final int LOWEST_OFFSET_BIT: Position of the lowest bit in an entriesForCodes element where the offset is stored.
private static final int MAX_CODE_SIZE: Maximum number of bits in a code, inclusive.
private static final int MIN_CODE_SIZE: Initial number of bits in a code.
private static final int OFFSET_LIMIT: Maximal value + 1 that the offset can take.
private static final int OFFSET_MASK: The mask to apply on an entriesForCodes element for getting the compressed offset (before shifting).
private static final int OFFSET_SHIFT: The shift to apply on a compressed offset (after application of OFFSET_MASK) for getting the uncompressed offset.
private static final int OFFSET_TO_MAXIMUM: For computing the value of indexOfFreeEntry when codeSize needs to be incremented.
private int pendingLength: If some bytes could not be written in the previous read(…) execution because the target buffer was full, offset and length of those bytes.
private int pendingOffset: If some bytes could not be written in the previous read(…) execution because the target buffer was full, offset and length of those bytes.
private static final int PREALLOCATED_SPACE_IS_USED_MASK: Mask for a bit in an entriesForCodes element for telling whether the extra space allocated in the stringsFromCode array has already been used by another entry.
private int previousCode: Last code found in previous iteration.
private static final int STRING_ALIGNMENT: Number of bits in an offset that are always 0 and consequently do not need to be stored.
private byte[] stringsFromCode: Sequences of bytes associated to codes.
Fields inherited from class org.apache.sis.internal.storage.inflater.CompressionChannel
input, listeners
-
Constructor Summary
Constructors
Constructor: Description
LZW(ChannelDataInput input, StoreListeners listeners): Creates a new channel which will decompress data from the given input.
-
Method Summary
Modifier and Type, Method: Description
private void clearTable(): Clears the entriesForCodes table.
private static int length(int element): Extracts the number of bytes of an entry stored in the stringsFromCode array.
private static boolean newEntryNeedsAllocation(int element): Returns true if all the space allocated for the given entry is already used.
private static int offset(int element): Extracts the index of the first byte of an entry stored in the stringsFromCode array.
private static int offsetAndLength(int offset, int length): Encodes an offset together with its length.
int read(ByteBuffer target): Decompresses some bytes from the input into the given destination buffer.
final int readNextCode(): Reads codeSize bits from the stream.
void setInputRegion(long start, long byteCount): Prepares this inflater for reading a new tile or a new band of a tile.
private IOException unexpectedData(): The exception to throw if the decompression process encounters data that it cannot process.
Methods inherited from class org.apache.sis.internal.storage.inflater.CompressionChannel
close, createDataInput, finished, isOpen, repeat, resources
-
Field Details
-
CLEAR_CODE
private static final int CLEAR_CODE
A 12-bit code meaning that we have exhausted the 4093 available codes and must reset the table to the initial set of 9-bit codes.
-
EOI_CODE
private static final int EOI_CODE
End of information. This code appears at the end of a strip.
-
FIRST_ADAPTATIVE_CODE
private static final int FIRST_ADAPTATIVE_CODE
First code which is not one of the predefined codes.
-
OFFSET_TO_MAXIMUM
private static final int OFFSET_TO_MAXIMUM
For computing the value of indexOfFreeEntry when codeSize needs to be incremented. The TIFF specification says that the size needs to be incremented after codes 510, 1022 and 2046 are added to the entriesForCodes table. Those values are slightly lower than what we would expect if the full integer ranges were used.
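The relation between the next free entry and the code size can be sketched as follows. The value 1 for OFFSET_TO_MAXIMUM is an assumption chosen so that the thresholds match the codes 510, 1022 and 2046 quoted above; the actual constant value is not shown on this page. The class name CodeSizeSketch and the method codeSizeFor are hypothetical.

```java
final class CodeSizeSketch {
    static final int MIN_CODE_SIZE = 9;        // initial number of bits in a code
    static final int MAX_CODE_SIZE = 12;       // maximum number of bits, inclusive
    static final int OFFSET_TO_MAXIMUM = 1;    // ASSUMED value, not confirmed by this page

    /**
     * Returns the number of bits of the next code, given the index of the next free entry.
     * After code 510 is added the next free index is 511 = (1 << 9) - 1, so 10 bits are used.
     */
    static int codeSizeFor(int indexOfFreeEntry) {
        int codeSize = MIN_CODE_SIZE;
        while (codeSize < MAX_CODE_SIZE
                && indexOfFreeEntry >= (1 << codeSize) - OFFSET_TO_MAXIMUM) {
            codeSize++;
        }
        return codeSize;
    }

    public static void main(String[] args) {
        for (int free : new int[] {258, 510, 511, 1023, 2047}) {
            System.out.println(free + " -> " + codeSizeFor(free) + " bits");
        }
    }
}
```

A real decoder would more likely increment codeSize once when the threshold is reached rather than recompute it from scratch; the loop form is used here only to make the thresholds explicit.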
-
MIN_CODE_SIZE
private static final int MIN_CODE_SIZE
Initial number of bits in a code. The TIFF specification says that the size needs to be incremented after codes 510, 1022 and 2046 are added to the entriesForCodes table.
-
MAX_CODE_SIZE
private static final int MAX_CODE_SIZE
Maximum number of bits in a code, inclusive.
-
codeSize
private int codeSize
Number of bits to read for the next code. This number starts at 9 and increases until 12. After 12 bits, a CLEAR_CODE should occur in the stream of LZW data.
-
LOWEST_OFFSET_BIT
private static final int LOWEST_OFFSET_BIT
Position of the lowest bit in an entriesForCodes element where the offset is stored. The position is chosen for leaving 12 bits for storing the length before the offset value.
Rationale: even in the worst case scenario where the same byte is always appended to the sequence, the maximal length cannot exceed the dictionary size because a CLEAR_CODE will be emitted when the dictionary is full.
-
LENGTH_MASK
private static final int LENGTH_MASK
The mask to apply on an entriesForCodes element for getting the length.
-
STRING_ALIGNMENT
private static final int STRING_ALIGNMENT
Number of bits in an offset that are always 0 and consequently do not need to be stored. An intentional consequence of this restriction is that the size of blocks allocated in the stringsFromCode array must be a multiple of (1 << STRING_ALIGNMENT). This makes it possible to use the extra size for growing a string up to that amount of bytes without copying it.
Note: doing allocations only by blocks of 2² = 4 bytes may seem a waste of memory, but it actually reduces memory usage a lot (almost a factor 4) because of the copies avoided. We tried alignment values 1, 2 and 3 and found that 2 seems optimal.
-
PREALLOCATED_SPACE_IS_USED_MASK
private static final int PREALLOCATED_SPACE_IS_USED_MASK
Mask for a bit in an entriesForCodes element for telling whether the extra space allocated in the stringsFromCode array has already been used by another entry. If yes (1), then that space cannot be used by a new entry. Instead, the new entry will need to allocate a new space.
Note: the newEntryNeedsAllocation(int) implementation assumes that this bit is the sign bit.
-
OFFSET_MASK
private static final int OFFSET_MASK
The mask to apply on an entriesForCodes element for getting the compressed offset (before shifting).
-
OFFSET_SHIFT
private static final int OFFSET_SHIFT
The shift to apply on a compressed offset (after application of OFFSET_MASK) for getting the uncompressed offset.
-
OFFSET_LIMIT
private static final int OFFSET_LIMIT
Maximal value + 1 that the offset can take. The compressed offset takes all the bits after the length, minus one bit that we keep for the PREALLOCATED_SPACE_IS_USED_MASK flag. Note that compressed offsets are multiplied by 1 << STRING_ALIGNMENT for getting the actual offset.
-
LENGTH_MASK_FOR_ALLOCATE
private static final int LENGTH_MASK_FOR_ALLOCATE
A mask used for detecting when a new allocation is required. If (length & LENGTH_MASK_FOR_ALLOCATE) == 0 and assuming that length is always incremented by 1, then a new allocation is necessary.
-
entriesForCodes
private final int[] entriesForCodes
Pointers to byte sequences for a code in the entriesForCodes array. Each element is a value encoded by the offsetAndLength(int, int) method. Elements are decoded by the offset(int) and length(int) methods.
-
previousCode
private int previousCode
Last code found in the previous iteration. This is a valid index in the entriesForCodes array. An EOI_CODE value means that the decompression is finished.
-
pendingOffset
private int pendingOffset
If some bytes could not be written in the previous read(…) execution because the target buffer was full, offset and length of those bytes. Otherwise 0.
-
pendingLength
private int pendingLength
If some bytes could not be written in the previous read(…) execution because the target buffer was full, offset and length of those bytes. Otherwise 0.
-
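The role of the pendingOffset and pendingLength pair can be illustrated with a small sketch: when a decoded string does not fit in the target buffer, the part that fits is written and the remainder is remembered for the next call. The class PendingBytesSketch and its write and flushPending methods are hypothetical names for illustration, not methods of this class.

```java
import java.nio.ByteBuffer;

final class PendingBytesSketch {
    private final byte[] strings;   // stands in for the stringsFromCode array
    int pendingOffset;              // offset of the bytes not yet written, or 0
    int pendingLength;              // number of bytes not yet written, or 0

    PendingBytesSketch(byte[] strings) {
        this.strings = strings;
    }

    /** Writes as many pending bytes as the target accepts; returns true if none remain. */
    boolean flushPending(ByteBuffer target) {
        int n = Math.min(pendingLength, target.remaining());
        target.put(strings, pendingOffset, n);
        pendingOffset += n;
        pendingLength -= n;
        if (pendingLength == 0) {
            pendingOffset = 0;      // back to the "nothing pending" state
        }
        return pendingLength == 0;
    }

    /** Requests the writing of a whole string; the part that does not fit stays pending. */
    boolean write(int offset, int length, ByteBuffer target) {
        pendingOffset = offset;
        pendingLength = length;
        return flushPending(target);
    }
}
```

In this sketch, a caller would invoke flushPending at the beginning of each read before decoding any new code, so that the output order is preserved across calls.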
indexOfFreeEntry
private int indexOfFreeEntry
Index of the next entry available in entriesForCodes. Shall not be lower than 258.
-
indexOfFreeString
private int indexOfFreeString
Index of the next byte available in stringsFromCode. Shall not be lower than 1 << Byte.SIZE.
-
stringsFromCode
private byte[] stringsFromCode
Sequences of bytes associated to codes. For a given code c read from the stream, the first uncompressed byte is stringsFromCode[offset(entriesForCodes[c])] and the number of bytes is length(entriesForCodes[c]).
-
-
Constructor Details
-
LZW
Creates a new channel which will decompress data from the given input. The setInputRegion(long, long) method must be invoked after construction before a reading process can start.
- Parameters:
input - the source of data to decompress.
listeners - object where to report warnings.
-
-
Method Details
-
length
private static int length(int element)
Extracts the number of bytes of an entry stored in the stringsFromCode array.
- Parameters:
element - an element of the entriesForCodes array.
- Returns:
number of consecutive bytes to read in the stringsFromCode array.
-
offset
private static int offset(int element)
Extracts the index of the first byte of an entry stored in the stringsFromCode array.
- Parameters:
element - an element of the entriesForCodes array.
- Returns:
index of the first byte to read in the stringsFromCode array.
-
offsetAndLength
private static int offsetAndLength(int offset, int length)
Encodes an offset together with its length.
-
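One possible bit layout consistent with the field descriptions above is sketched below. The 12 bits reserved for the length, the alignment of 2 and the use of the sign bit for the flag come from this page; the exact values of OFFSET_MASK, OFFSET_SHIFT and OFFSET_LIMIT are derived assumptions, and the argument checks are added for illustration only. How the predefined single-byte codes are stored is not covered by this sketch.

```java
final class EntryPackingSketch {
    static final int LOWEST_OFFSET_BIT = 12;                          // 12 bits left for the length
    static final int STRING_ALIGNMENT = 2;                            // offsets are multiples of 4
    static final int LENGTH_MASK = (1 << LOWEST_OFFSET_BIT) - 1;
    static final int PREALLOCATED_SPACE_IS_USED_MASK = Integer.MIN_VALUE;   // sign bit
    // ASSUMED derivations: every bit that is neither length nor flag holds the compressed offset.
    static final int OFFSET_MASK = ~(LENGTH_MASK | PREALLOCATED_SPACE_IS_USED_MASK);
    static final int OFFSET_SHIFT = LOWEST_OFFSET_BIT - STRING_ALIGNMENT;
    static final int OFFSET_LIMIT = (OFFSET_MASK >>> OFFSET_SHIFT) + (1 << STRING_ALIGNMENT);

    /** Packs an aligned offset and a length in a single integer. */
    static int offsetAndLength(int offset, int length) {
        if (offset < 0 || offset >= OFFSET_LIMIT || (offset & ((1 << STRING_ALIGNMENT) - 1)) != 0) {
            throw new IllegalArgumentException("Unaligned or out of range offset: " + offset);
        }
        if (length < 0 || length > LENGTH_MASK) {
            throw new IllegalArgumentException("Length out of range: " + length);
        }
        // The two low bits of an aligned offset are zero, so shifting by 10 puts bit 2 at bit 12.
        return (offset << OFFSET_SHIFT) | length;
    }

    /** Extracts the index of the first byte. */
    static int offset(int element) {
        return (element & OFFSET_MASK) >>> OFFSET_SHIFT;
    }

    /** Extracts the number of bytes. */
    static int length(int element) {
        return element & LENGTH_MASK;
    }

    public static void main(String[] args) {
        int element = offsetAndLength(1024, 5);
        System.out.println(offset(element) + ", " + length(element));   // prints "1024, 5"
    }
}
```

With this layout the largest storable offset is OFFSET_LIMIT - 4, and setting the flag bit does not disturb either of the two decoded values.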
newEntryNeedsAllocation
private static boolean newEntryNeedsAllocation(int element)
Returns true if all the space allocated for the given entry is already used. This is the case if at least one of the following conditions is true:
- The PREALLOCATED_SPACE_IS_USED_MASK bit is set, in which case the value is negative.
- All the extra space allowed by STRING_ALIGNMENT is used, in which case the lowest bits of the length are all zero.
- Parameters:
element - an element of the entriesForCodes array.
- Returns:
whether all the space for that entry is already used.
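The two conditions listed above can be tested with a single expression, as in the following sketch. It assumes that the length occupies the lowest bits of the element and that LENGTH_MASK_FOR_ALLOCATE equals (1 << STRING_ALIGNMENT) - 1; both are inferences from the field descriptions, not values confirmed by this page. The class name AllocationCheckSketch is hypothetical.

```java
final class AllocationCheckSketch {
    static final int STRING_ALIGNMENT = 2;                                  // blocks of 4 bytes
    static final int PREALLOCATED_SPACE_IS_USED_MASK = Integer.MIN_VALUE;   // sign bit
    static final int LENGTH_MASK_FOR_ALLOCATE = (1 << STRING_ALIGNMENT) - 1;   // ASSUMED value

    /**
     * Returns true if a string built by appending one byte to the given entry
     * cannot reuse the block of the given entry.
     */
    static boolean newEntryNeedsAllocation(int element) {
        // Negative value: the spare bytes were already taken by another entry.
        // Low length bits all zero: the length is a multiple of 4, so the block is full.
        return element < 0 || (element & LENGTH_MASK_FOR_ALLOCATE) == 0;
    }

    public static void main(String[] args) {
        int offsetBits = 256 << 12;   // arbitrary offset bits above the 12 length bits
        System.out.println(newEntryNeedsAllocation(offsetBits | 3));   // 1 spare byte left
        System.out.println(newEntryNeedsAllocation(offsetBits | 4));   // block of 4 is full
    }
}
```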
-
setInputRegion
Prepares this inflater for reading a new tile or a new band of a tile.
- Overrides:
setInputRegion in class CompressionChannel
- Parameters:
start - stream position where to start reading.
byteCount - number of bytes to read from the input.
- Throws:
IOException - if the stream cannot seek to the given start position.
-
clearTable
private void clearTable()
Clears the entriesForCodes table.
-
readNextCode
Reads codeSize bits from the stream.
- Returns:
the value of the next bits from the stream.
- Throws:
IOException - if an error occurred while reading.
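Reading a variable number of bits can be sketched with a small accumulator. The sketch below assumes that codes are packed with the most significant bit first, which is the usual convention for TIFF LZW data but is not stated on this page. The class BitReaderSketch and its fields are hypothetical, and it reads from a byte array rather than from a ChannelDataInput.

```java
import java.io.EOFException;

final class BitReaderSketch {
    private final byte[] data;
    private int position;    // index of the next byte to load
    private int buffer;      // loaded bits; only the lowest bitCount bits are meaningful
    private int bitCount;    // number of bits not yet consumed
    int codeSize = 9;        // number of bits of the next code, 9 to 12

    BitReaderSketch(byte[] data) {
        this.data = data;
    }

    /** Reads codeSize bits, most significant bit first (ASSUMED bit order). */
    int readNextCode() throws EOFException {
        while (bitCount < codeSize) {
            if (position >= data.length) {
                throw new EOFException("Unexpected end of LZW data.");
            }
            // Older bits shifted out of the 32-bit buffer were already consumed.
            buffer = (buffer << 8) | (data[position++] & 0xFF);
            bitCount += 8;
        }
        bitCount -= codeSize;
        return (buffer >>> bitCount) & ((1 << codeSize) - 1);
    }

    public static void main(String[] args) throws EOFException {
        // Four 9-bit codes (256, 65, 258, 257) packed in 5 bytes with 4 padding bits.
        byte[] packed = {(byte) 0x80, 0x10, 0x60, 0x50, 0x10};
        BitReaderSketch reader = new BitReaderSketch(packed);
        for (int i = 0; i < 4; i++) {
            System.out.println(reader.readNextCode());
        }
    }
}
```

Because at most 11 unconsumed bits remain before a byte is loaded, the accumulator never holds more than 19 meaningful bits and a 32-bit integer is sufficient.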
-
read
Decompresses some bytes from the input into the given destination buffer.
- Parameters:
target - the buffer into which bytes are to be transferred.
- Returns:
the number of bytes read, or -1 if end-of-stream.
- Throws:
IOException - if some other I/O error occurs.
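From the caller side, the contract above is the one of ReadableByteChannel.read(ByteBuffer), so a channel of this kind can be drained with the usual loop. The sketch below uses a channel over a byte array as a stand-in because constructing an LZW instance requires internal classes; the class ReadLoopSketch and its readFully method are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

final class ReadLoopSketch {
    /** Reads a blocking channel until read(…) returns -1 and returns all the bytes. */
    static byte[] readFully(ReadableByteChannel channel, int bufferSize) throws IOException {
        ByteBuffer target = ByteBuffer.allocate(bufferSize);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (channel.read(target) >= 0) {
            target.flip();          // switch from filling to draining
            out.write(target.array(), target.arrayOffset() + target.position(), target.remaining());
            target.clear();         // make the whole buffer available again
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] source = "hello".getBytes(StandardCharsets.US_ASCII);
        // Stand-in channel; a decompression channel would be used the same way.
        try (ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(source))) {
            System.out.println(new String(readFully(channel, 2), StandardCharsets.US_ASCII));
        }
    }
}
```

A small target buffer such as the one above is the situation in which a decompressor has to keep bytes pending between two calls.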
-
unexpectedData
The exception to throw if the decompression process encounters data that it cannot process.
-