Class ImmutableGraph
- java.lang.Object
-
- it.unimi.dsi.big.webgraph.ImmutableGraph
-
- All Implemented Interfaces:
it.unimi.dsi.lang.FlyweightPrototype<ImmutableGraph>
- Direct Known Subclasses:
ArcLabelledImmutableGraph,BidirectionalImmutableGraph,BVGraph,EFGraph,ImmutableSequentialGraph,ImmutableSubgraph,UnionImmutableGraph
public abstract class ImmutableGraph extends java.lang.Object implements it.unimi.dsi.lang.FlyweightPrototype<ImmutableGraph>
A simple abstract class representing an immutable graph.Subclasses of this class are used to create and access immutable graphs, that is, graphs that are computed once for all, stored conveniently, and then accessed repeatedly. Moreover, immutable graphs are usually very large—so large that two such graphs may not fit into central memory (the main example being a sizable portion of the web).
A subclass of this class must implement methods to obtain the number of nodes, the outdegree of a node and the successors of a node (either
successors(long)orsuccessorBigArray(long)). Additionally, it may provide methods to obtain the number of arcs, and a basename.This class provides
equals(Object)andhashCode()methods that consider two graph equals if they have the same size and all their successor lists are equal.Iterating on successors
Starting with WebGraph 2.0, the iterator architecture is fully lazy—you have no
hasNext()method. Rather, theLazyLongIteratorreturned bysuccessors(long)will return -1 when no more successors are available. The idiomatic forms for enumerating successors via iterators areLazyLongIterator successors = g.successors(x); int d = g.outdegree(x); while (d-- != 0) doSomething(successors.nextInt());
andLazyLongIterator successors = g.successors(x); int t; while ((t = successors.nextInt()) != -1) doSomething(t);
The alternative method
successorBigArray(long)provides an array containing the successors and possibly more elements. Useoutdegree(long)to know how many elements are valid. The efficiency ofsuccessors(long)andsuccessorBigArray(long)may vary depending on the implementation.Iterating on a graph in parallel
You can scan a graph sequentially using node iterators. Starting with version 3.6.0, implementations of this class may return true on
hasCopiableIterators(), which means that node iterators implement the optionalcopy(long)method. Usingcopy(long), the methodsplitNodeIterators(int)of this class is able to provide separate, thread-safe iterators on different segments of contiguous nodes of the graph. The classBVGraph, for example, uses this interface to provide parallel compression. We suggest that all classes providing parallel iteration read the system variable "it.unimi.dsi.webgraph.threads" to override the number of parallel threads.Building an immutable graph
Due to their large size, immutable graphs have a peculiar serialisation scheme. Every subclass of this class must implement a number of static methods that create an immutable graph, given a string (usually a basename for a set of files) and, optionally, a
ProgressLogger. The signatures that must be implemented areImmutableGraph load(CharSequence, ProgressLogger);ImmutableGraph load(CharSequence);ImmutableGraph loadOffline(CharSequence, ProgressLogger);ImmutableGraph loadOffline(CharSequence).ImmutableGraph loadOnce(InputStream);
Additionally, the following signatures can be implemented:
ImmutableGraph loadMapped(CharSequence, ProgressLogger);ImmutableGraph loadMapped(CharSequence);
The special semantics associated to
loadOffline()is that the immutable graph should be set up, and possibly some metadata could be read from disk, but no actual data is loaded into memory; the class should guarantee that offline sequential access (i.e., by means ofnodeIterator(long)) is still possible. In other words, in most casesnodeIterator(long)will have to be overridden by the subclasses to behave properly even in an offline setting (seenodeIterator()). The special semantics associated withloadOnce()is that the graph can be traversed just once using a call tonodeIterator(). The special semantics associated withloadMapped()is that metadata could be read from disk, but the graph will be accessed by memory mapping; the class should guarantee that random access is possible.Note that a simple class may just implement all special forms of graph loading delegating to the standard load method (see, e.g.,
ASCIIGraph). Specific implementations ofImmutableGraphmay also decide to expose internal load methods to make it easier to write load methods for subclasses (see, e.g.,loadInternal()).Analogously, a subclass of this class may also implement
store(ImmutableGraph, CharSequence, ProgressLogger);store(ImmutableGraph, CharSequence).
storemethods are available, as parameters vary wildly from subclass to subclass. The methodstore(Class, ImmutableGraph, CharSequence, ProgressLogger)invokes by reflection the methods above on the provided class.The standard method to build a new immutable graph is creating a (possibly anonymous) class that extends this class, and save it using a concrete subclass (e.g.,
BVGraph). See the source ofTransformfor several examples.Properties Conventions
To provide a simple way to load an immutable graph without knowing in advance its class, the following convention may be followed: a graph with basename
namemay feature a Java property filename.propertieswith a propertygraphclasscontaining the actual class of the graph. In this case, you can use the implementation of the load/store methods contained in this class, similarly to the standard Java serialisation scheme.BVGraph, for instance, follows this convention, butASCIIGraphdoes not.The reason why this convention is not enforced is that it is sometimes useful to write lightweight classes, mostly for debugging purposes, whose graph representation is entirely contained in a single file (e.g.,
ASCIIGraph), so thatloadOnce(InputStream)can be easily implemented.Facilities for loading an immutable graph
ImmutableGraphprovides ready-made implementations of the load methods that work as follows: they opens a property file with the given basename, and look for thegraphclassproperty; then, they simply delegates the actual load to the specified graph class by reflection.Thread-safety and flyweight copies
Implementations of this class need not be thread-safe. However, they implement the
FlyweightPrototypepattern: thecopy()method is thread-safe and will return a lightweight copy of the graph—usually, all immutable data will be shared between copies. Concurrent access to different copies is safe.Note that by contract
copy()is guaranteed to work only ifrandomAccess()returns true.
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static classImmutableGraph.LoadMethodA list of the methods that can be used to load a graph.
-
Field Summary
Fields Modifier and Type Field Description static java.lang.StringGRAPHCLASS_PROPERTY_KEYstatic java.lang.StringNUMBER_OF_THREADS_PROPERTYThe property used to set the number of parallel compression threads.static java.lang.StringPROPERTIES_EXTENSIONThe standard extension of property files.
-
Constructor Summary
Constructors Constructor Description ImmutableGraph()
-
Method Summary
All Methods Static Methods Instance Methods Abstract Methods Concrete Methods Deprecated Methods Modifier and Type Method Description java.lang.CharSequencebasename()Returns a symbolic basename for this graph (optional operation).abstract ImmutableGraphcopy()Returns a flyweight copy of this immutable graph.booleanequals(java.lang.Object o)Compare this immutable graph to another object.booleanhasCopiableIterators()Whether the node iterators returned by this graph supportNodeIterator.copy(long).inthashCode()Returns a hash code for this immutable graph.intintNumNodes()A method returning the number of nodes as an integer, for easier backward compatibility.protected static ImmutableGraphload(ImmutableGraph.LoadMethod method, java.lang.CharSequence basename, java.io.InputStream is, it.unimi.dsi.logging.ProgressLogger pl)Creates a new immutable graph by loading a graph file from disk to memory, delegating the actual loading to the class specified in thegraphclassproperty within the property file (namedbasename.properties).static ImmutableGraphload(java.lang.CharSequence basename)Creates a newImmutableGraphby loading a graph file from disk to memory, with all offsets, using no progress logger.static ImmutableGraphload(java.lang.CharSequence basename, it.unimi.dsi.logging.ProgressLogger pl)Creates a newImmutableGraphby loading a graph file from disk to memory, with all offsets, using a progress logger.static ImmutableGraphloadMapped(java.lang.CharSequence basename)Creates a newImmutableGraphby memory-mapping a graph file.static ImmutableGraphloadMapped(java.lang.CharSequence basename, it.unimi.dsi.logging.ProgressLogger pl)Creates a newImmutableGraphby memory-mapping a graph file.static ImmutableGraphloadOffline(java.lang.CharSequence basename)Creates a newImmutableGraphby loading offline a graph file.static ImmutableGraphloadOffline(java.lang.CharSequence basename, it.unimi.dsi.logging.ProgressLogger pl)Creates a newImmutableGraphby loading offline a graph file.static ImmutableGraphloadOnce(java.io.InputStream is)Creates a newImmutableGraphby loading a read-once graph from an input stream.static ImmutableGraphloadSequential(java.lang.CharSequence basename)Deprecated.UseloadOffline(CharSequence)orloadMapped(CharSequence)instead.static ImmutableGraphloadSequential(java.lang.CharSequence basename, it.unimi.dsi.logging.ProgressLogger pl)Deprecated.NodeIteratornodeIterator()Returns a node iterator for scanning the graph sequentially, starting from the first node.NodeIteratornodeIterator(long from)Returns a node iterator for scanning the graph sequentially, starting from the given node.longnumArcs()Returns the number of arcs of this graph (optional operation).abstract longnumNodes()Returns the number of nodes of this graph.abstract longoutdegree(long x)Returns the outdegree of a node.it.unimi.dsi.fastutil.longs.LongIteratoroutdegrees()Returns an iterator enumerating the outdegrees of the nodes of this graph.abstract booleanrandomAccess()Checks whether this graph provides random access to successor lists.NodeIterator[]splitNodeIterators(int howMany)Returns an array of node iterators, scanning each a portion of the nodes of a graph.static voidstore(java.lang.Class<?> graphClass, ImmutableGraph graph, java.lang.CharSequence basename)Stores an immutable graph using a specified subclass.static voidstore(java.lang.Class<?> graphClass, ImmutableGraph graph, java.lang.CharSequence basename, it.unimi.dsi.logging.ProgressLogger pl)Stores an immutable graph using a specified subclass and a progress logger.long[][]successorBigArray(long x)Returns a reference to a big array containing the successors of a given node.LazyLongIteratorsuccessors(long x)Returns a lazy iterator over the successors of a given node.java.lang.StringtoString()static it.unimi.dsi.webgraph.ImmutableGraphwrap(ImmutableGraph graph)static ImmutableGraphwrap(it.unimi.dsi.webgraph.ImmutableGraph graph)
-
-
-
Field Detail
-
GRAPHCLASS_PROPERTY_KEY
public static final java.lang.String GRAPHCLASS_PROPERTY_KEY
- See Also:
- Constant Field Values
-
PROPERTIES_EXTENSION
public static final java.lang.String PROPERTIES_EXTENSION
The standard extension of property files.- See Also:
- Constant Field Values
-
NUMBER_OF_THREADS_PROPERTY
public static final java.lang.String NUMBER_OF_THREADS_PROPERTY
The property used to set the number of parallel compression threads.- See Also:
- Constant Field Values
-
-
Method Detail
-
numNodes
public abstract long numNodes()
Returns the number of nodes of this graph.Albeit this method is not optional, it is allowed that this method throws an
UnsupportedOperationExceptionif this graph has never been entirely traversed using anode iterator. This apparently bizarre behaviour is necessary to support implementations asArcListASCIIGraph, which do not know the actual number of nodes until a traversal has been completed.- Returns:
- the number of nodes.
-
intNumNodes
public int intNumNodes()
A method returning the number of nodes as an integer, for easier backward compatibility.- Returns:
numNodes(), if it is smaller thanInteger.MAX_VALUE; otherwise, an exception will be thrown.- Throws:
java.lang.IllegalStateException- ifnumNodes()is larger thanInteger.MAX_VALUE.
-
numArcs
public long numArcs()
Returns the number of arcs of this graph (optional operation).- Returns:
- the number of arcs.
-
randomAccess
public abstract boolean randomAccess()
Checks whether this graph provides random access to successor lists.- Returns:
- true if this graph provides random access to successor lists.
-
hasCopiableIterators
public boolean hasCopiableIterators()
Whether the node iterators returned by this graph supportNodeIterator.copy(long).- Returns:
- true if this graph provides copiable iterators.
- Implementation Specification:
- This implementation just returns
randomAccess().
-
basename
public java.lang.CharSequence basename()
Returns a symbolic basename for this graph (optional operation).Implementors of this class may provide a basename (usually a pathname from which various files storing the graph are stemmed). This method is optional because it is sometimes unmeaningful (e.g., for one-off anonymous classes).
- Returns:
- the basename.
-
successors
public LazyLongIterator successors(long x)
Returns a lazy iterator over the successors of a given node. The iteration terminates when -1 is returned.- Parameters:
x- a node.- Returns:
- a lazy iterator over the successors of the node.
- Implementation Specification:
- This implementation just wraps the array returned by
successorBigArray(long). Subclasses are encouraged to override this implementation. - Implementation Notes:
- The semantics of this method has been significantly modified in WebGraph 2.0 to take advantage of the new, faster lazy architecture.
-
successorBigArray
public long[][] successorBigArray(long x)
Returns a reference to a big array containing the successors of a given node.The returned big array may contain more entries than the outdegree of
x. However, only those with indices from 0 (inclusive) to the outdegree ofx(exclusive) contain valid data.- Parameters:
x- a node.- Returns:
- a big array whose first elements are the successors of the node; the array must not be modified by the caller.
- Implementation Specification:
- This implementation just unwraps the iterator returned by
successors(long). Subclasses are encouraged to override this implementation.
-
outdegree
public abstract long outdegree(long x)
Returns the outdegree of a node.- Parameters:
x- a node.- Returns:
- the outdegree of the given node.
- Throws:
java.lang.IllegalStateException- if called without offsets.
-
nodeIterator
public NodeIterator nodeIterator(long from)
Returns a node iterator for scanning the graph sequentially, starting from the given node.- Parameters:
from- the node from which the iterator will iterate.- Returns:
- a
NodeIteratorfor accessing nodes and successors sequentially. - Implementation Specification:
- This implementation just calls the random-access methods (
successors(long)andoutdegree(long)). More specific implementations may choose to maintain some extra state to make the enumeration more efficient.
-
nodeIterator
public NodeIterator nodeIterator()
Returns a node iterator for scanning the graph sequentially, starting from the first node.- Returns:
- a
NodeIteratorfor accessing nodes and successors sequentially.
-
splitNodeIterators
public NodeIterator[] splitNodeIterators(int howMany)
Returns an array of node iterators, scanning each a portion of the nodes of a graph. Iterators are guaranteed to scan mutually disjoint sets of nodes, and every node is guaranteed to be scanned by one iterator.This is an optional operation. If implemented, though, the returned iterators must properly implement
NodeIterator.copy(long).- Parameters:
howMany- the number of iterators to be returned (at the end of the array, some of them may be empty).- Returns:
- the required iterators.
-
copy
public abstract ImmutableGraph copy()
Returns a flyweight copy of this immutable graph.- Specified by:
copyin interfaceit.unimi.dsi.lang.FlyweightPrototype<ImmutableGraph>- Returns:
- a flyweight copy of this immutable graph.
- Throws:
java.lang.UnsupportedOperationException- if flyweight copies are not supported: support is guaranteed only ifrandomAccess()returns true.- See Also:
FlyweightPrototype
-
outdegrees
public it.unimi.dsi.fastutil.longs.LongIterator outdegrees()
Returns an iterator enumerating the outdegrees of the nodes of this graph.- Returns:
- an iterator enumerating the outdegrees of the nodes of this graph.
-
toString
public java.lang.String toString()
- Overrides:
toStringin classjava.lang.Object
-
loadSequential
@Deprecated public static ImmutableGraph loadSequential(java.lang.CharSequence basename) throws java.io.IOException
Deprecated.UseloadOffline(CharSequence)orloadMapped(CharSequence)instead.Creates a newImmutableGraphby loading a graph file from disk to memory, without offsets.This method uses the properties convention described in the introduction.
- Parameters:
basename- the basename of the graph.- Returns:
- an
ImmutableGraphcontaining the specified graph. - Throws:
java.io.IOException- if an I/O exception occurs while reading the graph.
-
loadSequential
@Deprecated public static ImmutableGraph loadSequential(java.lang.CharSequence basename, it.unimi.dsi.logging.ProgressLogger pl) throws java.io.IOException
Deprecated.Creates a newImmutableGraphby loading a graph file from disk to memory, without offsets.This method uses the properties convention described in the introduction.
- Parameters:
basename- the basename of the graph.pl- a progress logger used while loading the graph, ornull.- Returns:
- an
ImmutableGraphcontaining the specified graph. - Throws:
java.io.IOException- if an I/O exception occurs while reading the graph.
-
loadOffline
public static ImmutableGraph loadOffline(java.lang.CharSequence basename) throws java.io.IOException
Creates a newImmutableGraphby loading offline a graph file.This method uses the properties convention described in the introduction.
- Parameters:
basename- the basename of the graph.- Returns:
- an
ImmutableGraphcontaining the specified graph. - Throws:
java.io.IOException- if an I/O exception occurs while reading the graph.
-
loadOffline
public static ImmutableGraph loadOffline(java.lang.CharSequence basename, it.unimi.dsi.logging.ProgressLogger pl) throws java.io.IOException
Creates a newImmutableGraphby loading offline a graph file.This method uses the properties convention described in the introduction.
- Parameters:
basename- the basename of the graph.pl- a progress logger used while loading the graph, ornull.- Returns:
- an
ImmutableGraphcontaining the specified graph. - Throws:
java.io.IOException- if an I/O exception occurs while reading the graph.
-
loadMapped
public static ImmutableGraph loadMapped(java.lang.CharSequence basename, it.unimi.dsi.logging.ProgressLogger pl) throws java.io.IOException
Creates a newImmutableGraphby memory-mapping a graph file.This method uses the properties convention described in the introduction.
- Parameters:
basename- the basename of the graph.pl- a progress logger used while loading the offsets, ornull.- Returns:
- an
ImmutableGraphcontaining the specified graph. - Throws:
java.io.IOException- if an I/O exception occurs while memory-mapping the graph or reading the offsets.
-
loadMapped
public static ImmutableGraph loadMapped(java.lang.CharSequence basename) throws java.io.IOException
Creates a newImmutableGraphby memory-mapping a graph file.This method uses the properties convention described in the introduction.
- Parameters:
basename- the basename of the graph.- Returns:
- an
ImmutableGraphcontaining the specified graph. - Throws:
java.io.IOException- if an I/O exception occurs while memory-mapping the graph or reading the offsets.
-
loadOnce
public static ImmutableGraph loadOnce(java.io.InputStream is) throws java.io.IOException
Creates a newImmutableGraphby loading a read-once graph from an input stream.- Parameters:
is- an input stream containing the graph.- Returns:
- an
ImmutableGraphcontaining the specified graph. - Throws:
java.io.IOException- if an I/O exception occurs while reading the graph.java.lang.UnsupportedOperationException- if this graph class does not support read-once graphs.- Implementation Specification:
- This implementation just throws a
UnsupportedOperationException. There is no way to write a generic implementation, because there is no way to know in advance the class that should read the graph.
-
load
public static ImmutableGraph load(java.lang.CharSequence basename) throws java.io.IOException
Creates a newImmutableGraphby loading a graph file from disk to memory, with all offsets, using no progress logger.This method uses the properties convention described in the introduction.
- Parameters:
basename- the basename of the graph.- Returns:
- an
ImmutableGraphcontaining the specified graph. - Throws:
java.io.IOException- if an I/O exception occurs while reading the graph.
-
load
public static ImmutableGraph load(java.lang.CharSequence basename, it.unimi.dsi.logging.ProgressLogger pl) throws java.io.IOException
Creates a newImmutableGraphby loading a graph file from disk to memory, with all offsets, using a progress logger.This method uses the properties convention described in the introduction.
- Parameters:
basename- the basename of the graph.pl- a progress logger used while loading the graph, ornull.- Returns:
- an
ImmutableGraphcontaining the specified graph. - Throws:
java.io.IOException- if an I/O exception occurs while reading the graph.
-
load
protected static ImmutableGraph load(ImmutableGraph.LoadMethod method, java.lang.CharSequence basename, java.io.InputStream is, it.unimi.dsi.logging.ProgressLogger pl) throws java.io.IOException
Creates a new immutable graph by loading a graph file from disk to memory, delegating the actual loading to the class specified in thegraphclassproperty within the property file (namedbasename.properties). The exact load method to be used depends on themethodargument.This method uses the properties convention described in the introduction.
- Parameters:
method- the method to be used to load the graph.basename- the basename of the graph, ifmethodis notImmutableGraph.LoadMethod.ONCE.is- an input stream the containing the graph, ifmethodisImmutableGraph.LoadMethod.ONCE.pl- the progress logger; it can benull.- Returns:
- an
ImmutableGraphcontaining the specified graph. - Throws:
java.io.IOException- if an I/O exception occurs while reading the graph.
-
store
public static void store(java.lang.Class<?> graphClass, ImmutableGraph graph, java.lang.CharSequence basename, it.unimi.dsi.logging.ProgressLogger pl) throws java.io.IOExceptionStores an immutable graph using a specified subclass and a progress logger.This method is a useful shorthand that invoke by reflection the store method of a given subclass. Note, however, that usually a subclass will provide more refined store methods with more parameters.
- Parameters:
graphClass- the subclass ofImmutableGraphthat should store the graph.graph- the graph to store.basename- the basename.pl- a progress logger, ornull.- Throws:
java.io.IOException
-
store
public static void store(java.lang.Class<?> graphClass, ImmutableGraph graph, java.lang.CharSequence basename) throws java.io.IOExceptionStores an immutable graph using a specified subclass.- Parameters:
graphClass- the subclass ofImmutableGraphthat should store the graph.graph- the graph to store.basename- the basename.- Throws:
java.io.IOException- See Also:
store(Class, ImmutableGraph, CharSequence, ProgressLogger)
-
equals
public boolean equals(java.lang.Object o)
Compare this immutable graph to another object.- Overrides:
equalsin classjava.lang.Object- Returns:
- true iff the given object is an immutable graph of the same size, and
the successor list of every node of this graph is equal to the successor list of the corresponding node of
o.
-
hashCode
public int hashCode()
Returns a hash code for this immutable graph.- Overrides:
hashCodein classjava.lang.Object- Returns:
- a hash code for this immutable graph.
-
wrap
public static ImmutableGraph wrap(it.unimi.dsi.webgraph.ImmutableGraph graph)
-
wrap
public static it.unimi.dsi.webgraph.ImmutableGraph wrap(ImmutableGraph graph)
-
-