Class ImmutableExternalPrefixMap
- All Implemented Interfaces:
PrefixMap<MutableString>, StringMap<MutableString>, it.unimi.dsi.fastutil.Function<CharSequence, Long>, it.unimi.dsi.fastutil.objects.Object2LongFunction<CharSequence>, it.unimi.dsi.fastutil.Size64, Serializable, Function<CharSequence, Long>, ToLongFunction<CharSequence>
- Since:
- 2.0
- Author:
- Sebastiano Vigna
-
Field Summary
Fields
Modifier and Type | Field | Description
protected final long[][] | blockOffset | A big array parallel to blockStart giving the offset in blocks in the dump file of the corresponding word in blockStart.
protected final long | blockSize | The block size of this map (in bits).
protected final long[][] | blockStart | The index of the first word in each block, plus an additional entry containing Function.size().
protected final it.unimi.dsi.fastutil.chars.Char2IntOpenHashMap | char2symbol | A map from characters to symbols of the coder.
protected final Decoder | decoder | A decoder used to read data from the dump stream.
protected InputBitStream | dumpStream | A reference to the dump stream.
protected final ImmutableBinaryTrie<CharSequence> | intervalApproximator | The in-memory data structure used to approximate intervals.
protected boolean | iteratorIsUsable | If true, the creation of the last DumpStreamIterator was not followed by a call to any get method.
protected final boolean | selfContained | Whether this map is self-contained.
static final long | serialVersionUID |
protected final long | size | The number of terms in this map.
static final int | STD_BLOCK_SIZE | The standard block size (in bytes).
protected final char[] | symbol2char | A map (given by an array) from symbols in the coder to characters.
Fields inherited from class AbstractPrefixMap
list, prefixMap, rangeMap
Fields inherited from class it.unimi.dsi.fastutil.objects.AbstractObject2LongFunction
defRetValue
-
Constructor Summary
Constructors
Constructor | Description
ImmutableExternalPrefixMap(Iterable<? extends CharSequence> terms) | Creates an external prefix map with block size STD_BLOCK_SIZE.
ImmutableExternalPrefixMap(Iterable<? extends CharSequence> terms, int blockSizeInBytes) | Creates an external prefix map with specified block size.
ImmutableExternalPrefixMap(Iterable<? extends CharSequence> terms, int blockSizeInBytes, CharSequence dumpStreamFilename) | Creates an external prefix map with specified block size and dump stream.
ImmutableExternalPrefixMap(Iterable<? extends CharSequence> terms, CharSequence dumpStreamFilename) | Creates an external prefix map with block size STD_BLOCK_SIZE and specified dump stream.
-
Method Summary
Modifier and Type | Method | Description
boolean | containsKey(Object term) |
 | getInterval(CharSequence prefix) | Returns the range of strings having a given prefix.
long | getLong |
protected MutableString | getTerm(long index, MutableString s) | Writes a string specified by index into a MutableString.
it.unimi.dsi.fastutil.objects.ObjectIterator<CharSequence> | iterator() | Returns an iterator over the map.
static void | main(String[] arg) |
void | setDumpStream(InputBitStream dumpStream) | Sets the dump stream of this external prefix map to a given input bit stream.
void | setDumpStream(CharSequence dumpStreamFilename) | Sets the dump stream of this external prefix map to a given filename.
long | size64() | Returns the intended number of keys in this function, or -1 if no such number exists.
Methods inherited from class AbstractPrefixMap
list, prefixMap, rangeMap
Methods inherited from class it.unimi.dsi.fastutil.objects.AbstractObject2LongFunction
defaultReturnValue, defaultReturnValue
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface it.unimi.dsi.fastutil.Function
apply, clear
Methods inherited from interface it.unimi.dsi.fastutil.objects.Object2LongFunction
andThen, andThenByte, andThenChar, andThenDouble, andThenFloat, andThenInt, andThenLong, andThenObject, andThenReference, andThenShort, applyAsLong, composeByte, composeChar, composeDouble, composeFloat, composeInt, composeLong, composeObject, composeReference, composeShort, defaultReturnValue, defaultReturnValue, get, getOrDefault, getOrDefault, put, put, remove, removeLong
Methods inherited from interface it.unimi.dsi.fastutil.Size64
size
-
Field Details
-
serialVersionUID
public static final long serialVersionUID
-
STD_BLOCK_SIZE
public static final int STD_BLOCK_SIZE
The standard block size (in bytes).
-
intervalApproximator
The in-memory data structure used to approximate intervals.
-
blockSize
protected final long blockSize
The block size of this map (in bits).
-
decoder
A decoder used to read data from the dump stream.
-
symbol2char
protected final char[] symbol2char
A map (given by an array) from symbols in the coder to characters.
-
char2symbol
protected final it.unimi.dsi.fastutil.chars.Char2IntOpenHashMap char2symbol
A map from characters to symbols of the coder.
-
size
protected final long size
The number of terms in this map.
-
blockStart
protected final long[][] blockStart
The index of the first word in each block, plus an additional entry containing Function.size().
-
blockOffset
protected final long[][] blockOffset
A big array parallel to blockStart giving the offset in blocks in the dump file of the corresponding word in blockStart. If there are no overflows, this will just be an initial segment of the natural numbers, but overflows cause jumps.
-
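The relationship between blockStart and blockOffset can be illustrated with a minimal sketch. It is not the implementation of this class: it assumes flat long[] arrays instead of big arrays, and it assumes that the bit position of a block in the dump file is its blockOffset entry multiplied by the block size in bits. The names BlockLookupSketch, blockOf and bitOffset are hypothetical and not part of this API.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of the documented API.
final class BlockLookupSketch {

    /**
     * Returns the position of the block that contains the term with the given index.
     * blockStart holds the index of the first word of each block, plus one final
     * entry equal to the number of terms (which is excluded from the search).
     */
    static int blockOf(long[] blockStart, long index) {
        int pos = Arrays.binarySearch(blockStart, 0, blockStart.length - 1, index);
        // Exact hit: the term is the first word of a block.
        // Otherwise binarySearch returns -(insertionPoint) - 1, and the
        // containing block is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Assumed bit position in the dump file of the block containing the term. */
    static long bitOffset(long[] blockStart, long[] blockOffset, long blockSizeInBits, long index) {
        return blockOffset[blockOf(blockStart, index)] * blockSizeInBits;
    }

    public static void main(String[] args) {
        long[] blockStart = {0, 3, 5, 9, 12};  // 4 blocks, 12 terms in total
        long[] blockOffset = {0, 1, 3, 4};     // the second block overflowed, so the offsets jump from 1 to 3
        long blockSizeInBits = 1024L * 8;
        System.out.println(blockOf(blockStart, 6));
        System.out.println(bitOffset(blockStart, blockOffset, blockSizeInBits, 6));
    }
}
```

Without overflows the blockOffset array in this sketch would be 0, 1, 2, 3; the jump from 1 to 3 models a block whose content spilled into the following block.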
selfContained
protected final boolean selfContained
Whether this map is self-contained.
-
iteratorIsUsable
protected transient boolean iteratorIsUsable
If true, the creation of the last DumpStreamIterator was not followed by a call to any get method.
-
dumpStream
A reference to the dump stream.
-
-
Constructor Details
-
ImmutableExternalPrefixMap
public ImmutableExternalPrefixMap(Iterable<? extends CharSequence> terms, int blockSizeInBytes, CharSequence dumpStreamFilename) throws IOException
Creates an external prefix map with specified block size and dump stream.
This constructor does not assume that CharSequence instances returned by terms.iterator() will be distinct. Thus, it can be safely used with FileLinesMutableStringIterable.
- Parameters:
terms - an iterable whose iterator will enumerate in lexicographical order the terms for the map.
blockSizeInBytes - the block size (in bytes).
dumpStreamFilename - the name of the dump stream, or null for a self-contained map.
- Throws:
IOException
-
ImmutableExternalPrefixMap
public ImmutableExternalPrefixMap(Iterable<? extends CharSequence> terms, CharSequence dumpStreamFilename) throws IOException
Creates an external prefix map with block size STD_BLOCK_SIZE and specified dump stream.
This constructor does not assume that CharSequence instances returned by terms.iterator() will be distinct. Thus, it can be safely used with FileLinesMutableStringIterable.
- Parameters:
terms - a collection whose iterator will enumerate in lexicographical order the terms for the map.
dumpStreamFilename - the name of the dump stream, or null for a self-contained map.
- Throws:
IOException
-
ImmutableExternalPrefixMap
public ImmutableExternalPrefixMap(Iterable<? extends CharSequence> terms, int blockSizeInBytes) throws IOException
Creates an external prefix map with specified block size.
This constructor does not assume that CharSequence instances returned by terms.iterator() will be distinct. Thus, it can be safely used with FileLinesMutableStringIterable.
- Parameters:
terms - a collection whose iterator will enumerate in lexicographical order the terms for the map.
blockSizeInBytes - the block size (in bytes).
- Throws:
IOException
-
ImmutableExternalPrefixMap
Creates an external prefix map with block size STD_BLOCK_SIZE.
This constructor does not assume that strings returned by terms.iterator() will be distinct. Thus, it can be safely used with FileLinesMutableStringIterable.
- Parameters:
terms - a collection whose iterator will enumerate in lexicographical order the terms for the map.
- Throws:
IOException
-
-
Method Details
-
setDumpStream
Sets the dump stream of this external prefix map to a given filename.
This method sets the dump file used by this map, and should only be called after deserialisation, providing exactly the file generated at creation time. Essentially anything can happen if you do not follow the rules.
Note that this method will attempt to close the old stream, if present.
- Parameters:
dumpStreamFilename - the name of the dump file.
- Throws:
FileNotFoundException
-
setDumpStream
Sets the dump stream of this external prefix map to a given input bit stream.
This method sets the dump file used by this map, and should only be called after deserialisation, providing a repositionable stream containing exactly the file generated at creation time. Essentially anything can happen if you do not follow the rules.
Using this method you can load an external prefix map in core memory, enjoying the compactness of the data structure, but getting much more speed.
Note that this method will attempt to close the old stream, if present.
- Parameters:
dumpStream - a repositionable input bit stream containing exactly the dump stream generated at creation time.
-
getInterval
Description copied from class: AbstractPrefixMap
Returns the range of strings having a given prefix.
- Specified by:
getInterval in class AbstractPrefixMap
- Parameters:
prefix - a prefix.
- Returns:
- the corresponding range of strings as an interval.
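The idea of a prefix interval can be shown on a plain sorted array: because the terms are in lexicographical order, all terms sharing a prefix are contiguous, so two binary searches find the range. This is only a conceptual sketch under that assumption; it does not use the interval approximator or the dump stream of this class, and the names PrefixIntervalSketch and interval are hypothetical.

```java
// Hypothetical illustration only; not part of the documented API.
final class PrefixIntervalSketch {

    /**
     * Returns {from, to}, a half-open range of positions in sortedTerms whose
     * elements start with the given prefix. The range is empty (from == to)
     * when no term has that prefix. sortedTerms must be sorted by String.compareTo.
     */
    static int[] interval(String[] sortedTerms, String prefix) {
        // First binary search: smallest position whose term is >= prefix.
        int lo = 0, hi = sortedTerms.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedTerms[mid].compareTo(prefix) < 0) lo = mid + 1;
            else hi = mid;
        }
        int from = lo;
        // Second binary search: from position 'from' on, terms having the prefix
        // come first, so look for the first term that does not start with it.
        hi = sortedTerms.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedTerms[mid].startsWith(prefix)) lo = mid + 1;
            else hi = mid;
        }
        return new int[] { from, lo };
    }

    public static void main(String[] args) {
        String[] terms = { "car", "card", "care", "cat", "dog" };
        int[] range = interval(terms, "car");
        System.out.println(range[0] + ".." + range[1]);
    }
}
```

The sketch returns a half-open pair of array positions for simplicity; how this class represents empty intervals is not covered here.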
-
getTerm
Description copied from class: AbstractPrefixMap
Writes a string specified by index into a MutableString.
- Specified by:
getTerm in class AbstractPrefixMap
- Parameters:
index - the index of a string.
s - a mutable string.
- Returns:
string.
-
containsKey
- Specified by:
containsKey in interface it.unimi.dsi.fastutil.Function<CharSequence, Long>
-
getLong
- Specified by:
getLong in interface it.unimi.dsi.fastutil.objects.Object2LongFunction<CharSequence>
-
iterator
Returns an iterator over the map.
The iterator returned by this method scans the dump stream directly.
Note that the returned iterator uses the same stream as all get methods. Calling such methods while the iterator is being used will produce an IllegalStateException.
- Returns:
- an iterator over the map that just scans the dump stream.
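The interaction between the iterator and the get methods can be pictured with a small guard flag, in the spirit of the iteratorIsUsable field. The sketch below is an assumption about the general pattern, not the code of this class: a list stands in for the shared dump stream, the exception is raised on the next use of the iterator after a get, and the names SharedStreamGuardSketch and get are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not part of the documented API.
final class SharedStreamGuardSketch implements Iterable<String> {
    private final List<String> terms;   // stands in for the shared dump stream
    private boolean iteratorIsUsable;   // true until a get follows the last iterator creation

    SharedStreamGuardSketch(List<String> terms) {
        this.terms = terms;
    }

    /** A get method: it repositions the shared "stream", so it invalidates the iterator. */
    String get(int index) {
        iteratorIsUsable = false;
        return terms.get(index);
    }

    @Override
    public Iterator<String> iterator() {
        iteratorIsUsable = true;
        return new Iterator<String>() {
            private int position = 0;

            private void ensureUsable() {
                if (!iteratorIsUsable) {
                    throw new IllegalStateException("A get method was called while iterating");
                }
            }

            @Override
            public boolean hasNext() {
                ensureUsable();
                return position < terms.size();
            }

            @Override
            public String next() {
                ensureUsable();
                if (position >= terms.size()) throw new NoSuchElementException();
                return terms.get(position++);
            }
        };
    }

    public static void main(String[] args) {
        SharedStreamGuardSketch map = new SharedStreamGuardSketch(Arrays.asList("a", "b", "c"));
        Iterator<String> it = map.iterator();
        System.out.println(it.next());
        map.get(2); // from here on, 'it' throws IllegalStateException when used
    }
}
```

In practice this suggests finishing an iteration before calling any get method, or obtaining a fresh iterator afterwards.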
-
size64
public long size64()
Description copied from interface: StringMap
Returns the intended number of keys in this function, or -1 if no such number exists.
Most function implementations will have some knowledge of the intended number of keys in their domain. In some cases, however, this might not be possible. This default implementation, in particular, returns -1.
- Specified by:
size64 in interface it.unimi.dsi.fastutil.Size64
- Specified by:
size64 in interface StringMap<MutableString>
- Returns:
- the intended number of keys in this function, or -1 if that number is not available.
-
main
public static void main(String[] arg) throws ClassNotFoundException, IOException, com.martiansoftware.jsap.JSAPException, SecurityException, NoSuchMethodException
- Throws:
ClassNotFoundException
IOException
com.martiansoftware.jsap.JSAPException
SecurityException
NoSuchMethodException
-