Class RopeByteString
- java.lang.Object
-
- com.google.protobuf.ByteString
-
- com.google.protobuf.RopeByteString
-
- All Implemented Interfaces:
java.io.Serializable,java.lang.Iterable<java.lang.Byte>
final class RopeByteString extends ByteString
Class to representByteStringsformed by concatenation of other ByteStrings, without copying the data in the pieces. The concatenation is represented as a tree whose leaf nodes are each aByteString.LeafByteString.Most of the operation here is inspired by the now-famous paper BAP95 Ropes: an Alternative to Strings hans-j. boehm, russ atkinson and michael plass
The algorithms described in the paper have been implemented for character strings in
com.google.common.string.Ropeand in the c++ classcord.cc.Fundamentally the Rope algorithm represents the collection of pieces as a binary tree. BAP95 uses a Fibonacci bound relating depth to a minimum sequence length, sequences that are too short relative to their depth cause a tree rebalance. More precisely, a tree of depth d is "balanced" in the terminology of BAP95 if its length is at least F(d+2), where F(n) is the n-th Fibonacci number. Thus for depths 0, 1, 2, 3, 4, 5,... we have minimum lengths 1, 2, 3, 5, 8, 13,...
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description private static classRopeByteString.BalancerThis class implements the balancing algorithm of BAP95.private static classRopeByteString.PieceIteratorThis class is a continuable tree traversal, which keeps the state information which would exist on the stack in a recursive traversal instead on a stack of "Bread Crumbs".private classRopeByteString.RopeInputStreamThis class is theRopeByteStringequivalent forByteArrayInputStream.-
Nested classes/interfaces inherited from class com.google.protobuf.ByteString
ByteString.AbstractByteIterator, ByteString.ByteIterator, ByteString.CodedBuilder, ByteString.LeafByteString, ByteString.Output
-
-
Field Summary
Fields Modifier and Type Field Description private ByteStringleftprivate intleftLength(package private) static int[]minLengthByDepthBAP95.private ByteStringrightprivate static longserialVersionUIDprivate inttotalLengthprivate inttreeDepth-
Fields inherited from class com.google.protobuf.ByteString
CONCATENATE_BY_COPY_SIZE, EMPTY, MAX_READ_FROM_CHUNK_SIZE, MIN_READ_FROM_CHUNK_SIZE
-
-
Constructor Summary
Constructors Modifier Constructor Description privateRopeByteString(ByteString left, ByteString right)Create a new RopeByteString, which can be thought of as a new tree node, by recording references to the two given strings.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description java.nio.ByteBufferasReadOnlyByteBuffer()Constructs a read-onlyjava.nio.ByteBufferwhose content is equal to the contents of this byte string.java.util.List<java.nio.ByteBuffer>asReadOnlyByteBufferList()Constructs a list of read-onlyjava.nio.ByteBufferobjects such that the concatenation of their contents is equal to the contents of this byte string.bytebyteAt(int index)Gets the byte at the given index.(package private) static ByteStringconcatenate(ByteString left, ByteString right)Concatenate the given strings while performing various optimizations to slow the growth rate of tree depth and tree node count.private static ByteStringconcatenateBytes(ByteString left, ByteString right)Concatenates two strings by copying data values.voidcopyTo(java.nio.ByteBuffer target)Copies bytes into a ByteBuffer.protected voidcopyToInternal(byte[] target, int sourceOffset, int targetOffset, int numberToCopy)Internal (package private) implementation ofByteString.copyTo(byte[],int,int,int).booleanequals(java.lang.Object other)private booleanequalsFragments(ByteString other)Determines if this string is equal to another of the same length by iterating over the leaf nodes.protected intgetTreeDepth()Return the depth of the tree representing thisByteString, if any, whose root is this node.(package private) byteinternalByteAt(int index)Gets the byte at the given index, assumes bounds checking has already been performed.protected booleanisBalanced()Determines if the tree is balanced according to BAP95, which means the tree is flat-enough with respect to the bounds.booleanisValidUtf8()Tells whether thisByteStringrepresents a well-formed UTF-8 byte sequence, such that the original bytes can be converted to a String object and then round tripped back to bytes without loss.ByteString.ByteIteratoriterator()Return aByteString.ByteIteratorover the bytes in the ByteString.(package private) static intminLength(int depth)Returns the minimum length for which a tree of the given depth is considered balanced according to BAP95, which means the tree is flat-enough with respect to the bounds.CodedInputStreamnewCodedInput()Creates aCodedInputStreamwhich can be used to read the bytes.java.io.InputStreamnewInput()Creates anInputStreamwhich can be used to read the bytes.(package private) static RopeByteStringnewInstanceForTest(ByteString left, ByteString right)Create a new RopeByteString for testing only while bypassing all the defenses ofconcatenate(ByteString, ByteString).protected intpartialHash(int h, int offset, int length)Compute the hash across the value bytes starting with the given hash, and return the result.protected intpartialIsValidUtf8(int state, int offset, int length)Tells whether the given byte sequence is a well-formed, malformed, or incomplete UTF-8 byte sequence.private voidreadObject(java.io.ObjectInputStream in)intsize()Gets the number of bytes.ByteStringsubstring(int beginIndex, int endIndex)Takes a substring of this one.protected java.lang.StringtoStringInternal(java.nio.charset.Charset charset)Constructs a newStringby decoding the bytes using the specified charset.(package private) java.lang.ObjectwriteReplace()(package private) voidwriteTo(ByteOutput output)Writes thisByteStringto the providedByteOutput.voidwriteTo(java.io.OutputStream outputStream)Writes a copy of the contents of this byte string to the specified output stream argument.(package private) voidwriteToInternal(java.io.OutputStream out, int sourceOffset, int numberToWrite)Internal version ofByteString.writeTo(OutputStream,int,int)that assumes all error checking has already been done.(package private) voidwriteToReverse(ByteOutput output)This method behaves exactly the same asByteString.writeTo(ByteOutput)unless theByteStringis a rope.-
Methods inherited from class com.google.protobuf.ByteString
checkIndex, checkRange, concat, copyFrom, copyFrom, copyFrom, copyFrom, copyFrom, copyFrom, copyFrom, copyFromUtf8, copyTo, copyTo, empty, endsWith, fromHex, hashCode, isEmpty, newCodedBuilder, newOutput, newOutput, nioByteString, peekCachedHashCode, readFrom, readFrom, readFrom, startsWith, substring, toByteArray, toString, toString, toString, toStringUtf8, unsignedLexicographicalComparator, wrap, wrap, wrap, writeTo
-
-
-
-
Field Detail
-
minLengthByDepth
static final int[] minLengthByDepth
BAP95. Let Fn be the nth Fibonacci number. ARopeByteStringof depth n is "balanced", i.e flat enough, if its length is at least Fn+2, e.g. a "balanced"RopeByteStringof depth 1 must have length at least 2, of depth 4 must have length >= 8, etc.There's nothing special about using the Fibonacci numbers for this, but they are a reasonable sequence for encapsulating the idea that we are OK with longer strings being encoded in deeper binary trees.
For 32-bit integers, this array has length 46.
The correctness of this constant array is validated in tests.
-
totalLength
private final int totalLength
-
left
private final ByteString left
-
right
private final ByteString right
-
leftLength
private final int leftLength
-
treeDepth
private final int treeDepth
-
serialVersionUID
private static final long serialVersionUID
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
RopeByteString
private RopeByteString(ByteString left, ByteString right)
Create a new RopeByteString, which can be thought of as a new tree node, by recording references to the two given strings.- Parameters:
left- string on the left of this node, should havesize() > 0right- string on the right of this node, should havesize() > 0
-
-
Method Detail
-
concatenate
static ByteString concatenate(ByteString left, ByteString right)
Concatenate the given strings while performing various optimizations to slow the growth rate of tree depth and tree node count. The result is either aByteString.LeafByteStringor aRopeByteStringdepending on which optimizations, if any, were applied.Small pieces of length less than
ByteString.CONCATENATE_BY_COPY_SIZEmay be copied by value here, as in BAP95. Large pieces are referenced without copy.- Parameters:
left- string on the leftright- string on the right- Returns:
- concatenation representing the same sequence as the given strings
-
concatenateBytes
private static ByteString concatenateBytes(ByteString left, ByteString right)
Concatenates two strings by copying data values. This is called in a few cases in order to reduce the growth of the number of tree nodes.- Parameters:
left- string on the leftright- string on the right- Returns:
- string formed by copying data bytes
-
newInstanceForTest
static RopeByteString newInstanceForTest(ByteString left, ByteString right)
Create a new RopeByteString for testing only while bypassing all the defenses ofconcatenate(ByteString, ByteString). This allows testing trees of specific structure. We are also able to insert empty leaves, though these are dis-allowed, so that we can make sure the implementation can withstand their presence.- Parameters:
left- string on the left of this noderight- string on the right of this node- Returns:
- an unsafe instance for testing only
-
minLength
static int minLength(int depth)
Returns the minimum length for which a tree of the given depth is considered balanced according to BAP95, which means the tree is flat-enough with respect to the bounds. Defaults toInteger.MAX_VALUEifdepth >= minLengthByDepth.lengthin order to avoid anArrayIndexOutOfBoundsException.- Parameters:
depth- tree depth- Returns:
- minimum balanced length
-
byteAt
public byte byteAt(int index)
Gets the byte at the given index. ThrowsArrayIndexOutOfBoundsExceptionfor backwards-compatibility reasons although it would more properly beIndexOutOfBoundsException.- Specified by:
byteAtin classByteString- Parameters:
index- index of byte- Returns:
- the value
- Throws:
java.lang.ArrayIndexOutOfBoundsException-indexis < 0 or >= size
-
internalByteAt
byte internalByteAt(int index)
Description copied from class:ByteStringGets the byte at the given index, assumes bounds checking has already been performed.- Specified by:
internalByteAtin classByteString- Parameters:
index- index of byte- Returns:
- the value
-
size
public int size()
Description copied from class:ByteStringGets the number of bytes.- Specified by:
sizein classByteString- Returns:
- size in bytes
-
iterator
public ByteString.ByteIterator iterator()
Description copied from class:ByteStringReturn aByteString.ByteIteratorover the bytes in the ByteString. To avoid auto-boxing, you may get the iterator manually and callByteString.ByteIterator.nextByte().- Specified by:
iteratorin interfacejava.lang.Iterable<java.lang.Byte>- Overrides:
iteratorin classByteString- Returns:
- the iterator
-
getTreeDepth
protected int getTreeDepth()
Description copied from class:ByteStringReturn the depth of the tree representing thisByteString, if any, whose root is this node. If this is a leaf node, return 0.- Specified by:
getTreeDepthin classByteString- Returns:
- tree depth or zero
-
isBalanced
protected boolean isBalanced()
Determines if the tree is balanced according to BAP95, which means the tree is flat-enough with respect to the bounds. Note that this definition of balanced is one where sub-trees of balanced trees are not necessarily balanced.- Specified by:
isBalancedin classByteString- Returns:
- true if the tree is balanced
-
substring
public ByteString substring(int beginIndex, int endIndex)
Takes a substring of this one. This involves recursive descent along the left and right edges of the substring, and referencing any wholly contained segments in between. Any leaf nodes entirely uninvolved in the substring will not be referenced by the substring.Substrings of
length < 2should result in at most a single recursive call chain, terminating at a leaf node. Thus the result will be aByteString.LeafByteString.- Specified by:
substringin classByteString- Parameters:
beginIndex- start at this indexendIndex- the last character is the one before this index- Returns:
- substring leaf node or tree
-
copyToInternal
protected void copyToInternal(byte[] target, int sourceOffset, int targetOffset, int numberToCopy)Description copied from class:ByteStringInternal (package private) implementation ofByteString.copyTo(byte[],int,int,int). It assumes that all error checking has already been performed and thatnumberToCopy > 0.- Specified by:
copyToInternalin classByteString
-
copyTo
public void copyTo(java.nio.ByteBuffer target)
Description copied from class:ByteStringCopies bytes into a ByteBuffer.To copy a subset of bytes, you call this method on the return value of
ByteString.substring(int, int). Example:byteString.substring(start, end).copyTo(target)- Specified by:
copyToin classByteString- Parameters:
target- ByteBuffer to copy into.
-
asReadOnlyByteBuffer
public java.nio.ByteBuffer asReadOnlyByteBuffer()
Description copied from class:ByteStringConstructs a read-onlyjava.nio.ByteBufferwhose content is equal to the contents of this byte string. The result uses the same backing array as the byte string, if possible.- Specified by:
asReadOnlyByteBufferin classByteString- Returns:
- wrapped bytes
-
asReadOnlyByteBufferList
public java.util.List<java.nio.ByteBuffer> asReadOnlyByteBufferList()
Description copied from class:ByteStringConstructs a list of read-onlyjava.nio.ByteBufferobjects such that the concatenation of their contents is equal to the contents of this byte string. The result uses the same backing arrays as the byte string.By returning a list, implementations of this method may be able to avoid copying even when there are multiple backing arrays.
- Specified by:
asReadOnlyByteBufferListin classByteString- Returns:
- a list of wrapped bytes
-
writeTo
public void writeTo(java.io.OutputStream outputStream) throws java.io.IOExceptionDescription copied from class:ByteStringWrites a copy of the contents of this byte string to the specified output stream argument.- Specified by:
writeToin classByteString- Parameters:
outputStream- the output stream to which to write the data.- Throws:
java.io.IOException- if an I/O error occurs.
-
writeToInternal
void writeToInternal(java.io.OutputStream out, int sourceOffset, int numberToWrite) throws java.io.IOExceptionDescription copied from class:ByteStringInternal version ofByteString.writeTo(OutputStream,int,int)that assumes all error checking has already been done.- Specified by:
writeToInternalin classByteString- Throws:
java.io.IOException
-
writeTo
void writeTo(ByteOutput output) throws java.io.IOException
Description copied from class:ByteStringWrites thisByteStringto the providedByteOutput. Calling this method may result in multiple operations on the targetByteOutput.This method may expose internal backing buffers of the
ByteStringto theByteOutputin order to avoid additional copying overhead. It would be possible for a maliciousByteOutputto corrupt theByteString. Use with caution!- Specified by:
writeToin classByteString- Parameters:
output- the output target to receive the bytes- Throws:
java.io.IOException- if an I/O error occurs- See Also:
UnsafeByteOperations.unsafeWriteTo(ByteString, ByteOutput)
-
writeToReverse
void writeToReverse(ByteOutput output) throws java.io.IOException
Description copied from class:ByteStringThis method behaves exactly the same asByteString.writeTo(ByteOutput)unless theByteStringis a rope. For ropes, the leaf nodes are written in reverse order to thebyteOutput.- Specified by:
writeToReversein classByteString- Parameters:
output- the output target to receive the bytes- Throws:
java.io.IOException- if an I/O error occurs- See Also:
UnsafeByteOperations#unsafeWriteToReverse(ByteString, ByteOutput)
-
toStringInternal
protected java.lang.String toStringInternal(java.nio.charset.Charset charset)
Description copied from class:ByteStringConstructs a newStringby decoding the bytes using the specified charset.- Specified by:
toStringInternalin classByteString- Parameters:
charset- encode using this charset- Returns:
- new string
-
isValidUtf8
public boolean isValidUtf8()
Description copied from class:ByteStringTells whether thisByteStringrepresents a well-formed UTF-8 byte sequence, such that the original bytes can be converted to a String object and then round tripped back to bytes without loss.More precisely, returns
truewhenever:Arrays.equals(byteString.toByteArray(), new String(byteString.toByteArray(), "UTF-8").getBytes("UTF-8"))This method returns
falsefor "overlong" byte sequences, as well as for 3-byte sequences that would map to a surrogate character, in accordance with the restricted definition of UTF-8 introduced in Unicode 3.1. Note that the UTF-8 decoder included in Oracle's JDK has been modified to also reject "overlong" byte sequences, but (as of 2011) still accepts 3-byte surrogate character byte sequences.See the Unicode Standard,
Table 3-6. UTF-8 Bit Distribution,
Table 3-7. Well Formed UTF-8 Byte Sequences.- Specified by:
isValidUtf8in classByteString- Returns:
- whether the bytes in this
ByteStringare a well-formed UTF-8 byte sequence
-
partialIsValidUtf8
protected int partialIsValidUtf8(int state, int offset, int length)Description copied from class:ByteStringTells whether the given byte sequence is a well-formed, malformed, or incomplete UTF-8 byte sequence. This method accepts and returns a partial state result, allowing the bytes for a complete UTF-8 byte sequence to be composed from multipleByteStringsegments.- Specified by:
partialIsValidUtf8in classByteString- Parameters:
state- either0(if this is the initial decoding operation) or the value returned from a call to a partial decoding method for the previous bytesoffset- offset of the first byte to checklength- number of bytes to check- Returns:
-1if the partial byte sequence is definitely malformed,0if it is well-formed (no additional input needed), or, if the byte sequence is "incomplete", i.e. apparently terminated in the middle of a character, an opaque integer "state" value containing enough information to decode the character when passed to a subsequent invocation of a partial decoding method.
-
equals
public boolean equals(java.lang.Object other)
- Specified by:
equalsin classByteString
-
equalsFragments
private boolean equalsFragments(ByteString other)
Determines if this string is equal to another of the same length by iterating over the leaf nodes. On each step of the iteration, the overlapping segments of the leaves are compared.- Parameters:
other- string of the same length as this one- Returns:
- true if the values of this string equals the value of the given one
-
partialHash
protected int partialHash(int h, int offset, int length)Description copied from class:ByteStringCompute the hash across the value bytes starting with the given hash, and return the result. This is used to compute the hash across strings represented as a set of pieces by allowing the hash computation to be continued from piece to piece.- Specified by:
partialHashin classByteString- Parameters:
h- starting hash valueoffset- offset into this value to start looking at data valueslength- number of data values to include in the hash computation- Returns:
- ending hash value
-
newCodedInput
public CodedInputStream newCodedInput()
Description copied from class:ByteStringCreates aCodedInputStreamwhich can be used to read the bytes. Using this is often more efficient than creating aCodedInputStreamthat wraps the result ofByteString.newInput().- Specified by:
newCodedInputin classByteString- Returns:
- stream based on wrapped data
-
newInput
public java.io.InputStream newInput()
Description copied from class:ByteStringCreates anInputStreamwhich can be used to read the bytes.The
InputStreamreturned by this method is guaranteed to be completely non-blocking. The methodInputStream.available()returns the number of bytes remaining in the stream. The methodsInputStream.read(byte[]),InputStream.read(byte[],int,int)andInputStream.skip(long)will read/skip as many bytes as are available. The methodInputStream.markSupported()returnstrue.The methods in the returned
InputStreammight not be thread safe.- Specified by:
newInputin classByteString- Returns:
- an input stream that returns the bytes of this byte string.
-
writeReplace
java.lang.Object writeReplace()
-
readObject
private void readObject(java.io.ObjectInputStream in) throws java.io.IOException- Throws:
java.io.IOException
-
-