Class ZipfDistribution
java.lang.Object
org.apache.commons.math3.distribution.AbstractIntegerDistribution
org.apache.commons.math3.distribution.ZipfDistribution
- All Implemented Interfaces:
Serializable, IntegerDistribution
Implementation of the Zipf distribution.
Parameters:
For a random variable X whose values are distributed according to this
distribution, the probability mass function is given by
P(X = k) = (1 / k^s) / H(N,s) for k = 1,2,...,N.
H(N,s) is the normalizing constant, the generalized harmonic number of order N of s:
H(N,s) = 1/1^s + 1/2^s + ... + 1/N^s.
- N is the number of elements
- s is the exponent
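The definition above can be checked numerically. The following is a minimal, self-contained sketch that does not use the library: the class `ZipfPmfSketch` and its methods are hypothetical names chosen for illustration, and the code assumes the normalization divides each term `1 / k^s` by the generalized harmonic number so that the probabilities sum to 1.

```java
final class ZipfPmfSketch {
    /** Generalized harmonic number H(n, s) = sum over k = 1..n of 1 / k^s. */
    static double generalizedHarmonic(int n, double s) {
        double sum = 0.0;
        for (int k = 1; k <= n; k++) {
            sum += Math.pow(k, -s);
        }
        return sum;
    }

    /** P(X = k) for k in 1..n, and 0 outside the support. */
    static double pmf(int k, int n, double s) {
        if (k < 1 || k > n) {
            return 0.0;
        }
        return Math.pow(k, -s) / generalizedHarmonic(n, s);
    }

    public static void main(String[] args) {
        int n = 3;       // number of elements (illustrative value)
        double s = 1.0;  // exponent (illustrative value)
        double total = 0.0;
        for (int k = 1; k <= n; k++) {
            double p = pmf(k, n, s);
            total += p;
            System.out.printf("P(X = %d) = %.6f%n", k, p);
        }
        System.out.printf("sum of probabilities = %.6f%n", total);
    }
}
```

With N = 3 and s = 1 the harmonic number is 11/6, so the three probabilities are 6/11, 3/11 and 2/11.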
Field Summary
Fields inherited from class AbstractIntegerDistribution
random, randomData
-
Constructor Summary
Constructors
- ZipfDistribution(int numberOfElements, double exponent): Create a new Zipf distribution with the given number of elements and exponent.
- ZipfDistribution(RandomGenerator rng, int numberOfElements, double exponent): Creates a Zipf distribution.
-
Method Summary
Modifier and Type, Method, Description
- protected double calculateNumericalMean(): Used by getNumericalMean().
- protected double calculateNumericalVariance(): Used by getNumericalVariance().
- double cumulativeProbability(int x): For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x).
- double getExponent(): Get the exponent characterizing the distribution.
- int getNumberOfElements(): Get the number of elements (e.g. corpus size) for the distribution.
- double getNumericalMean(): Use this method to get the numerical value of the mean of this distribution.
- double getNumericalVariance(): Use this method to get the numerical value of the variance of this distribution.
- int getSupportLowerBound(): Access the lower bound of the support.
- int getSupportUpperBound(): Access the upper bound of the support.
- boolean isSupportConnected(): Use this method to get information about whether the support is connected, i.e. whether all integers between the lower and upper bound of the support are included in the support.
- double logProbability(int x): For a random variable X whose values are distributed according to this distribution, this method returns log(P(X = x)), where log is the natural logarithm.
- double probability(int x): For a random variable X whose values are distributed according to this distribution, this method returns P(X = x).
- int sample(): Generate a random value sampled from this distribution.
Methods inherited from class AbstractIntegerDistribution
cumulativeProbability, inverseCumulativeProbability, reseedRandomGenerator, sample, solveInverseCumulativeProbability
-
Constructor Details
-
ZipfDistribution
Create a new Zipf distribution with the given number of elements and exponent.
Note: this constructor will implicitly create an instance of Well19937c as random generator to be used for sampling only (see sample() and AbstractIntegerDistribution.sample(int)). In case no sampling is needed for the created distribution, it is advised to pass null as random generator via the appropriate constructors to avoid the additional initialisation overhead.
- Parameters:
numberOfElements - Number of elements.
exponent - Exponent.
- Throws:
NotStrictlyPositiveException - if numberOfElements <= 0 or exponent <= 0.
-
ZipfDistribution
public ZipfDistribution(RandomGenerator rng, int numberOfElements, double exponent) throws NotStrictlyPositiveException
Creates a Zipf distribution.
- Parameters:
rng - Random number generator.
numberOfElements - Number of elements.
exponent - Exponent.
- Throws:
NotStrictlyPositiveException - if numberOfElements <= 0 or exponent <= 0.
- Since:
- 3.1
-
-
Method Details
-
getNumberOfElements
Get the number of elements (e.g. corpus size) for the distribution.
- Returns:
- the number of elements
-
getExponent
Get the exponent characterizing the distribution.
-
probability
For a random variable X whose values are distributed according to this distribution, this method returns P(X = x). In other words, this method represents the probability mass function (PMF) for the distribution.
- Parameters:
x - the point at which the PMF is evaluated
- Returns:
- the value of the probability mass function at x
-
logProbability
For a random variable X whose values are distributed according to this distribution, this method returns log(P(X = x)), where log is the natural logarithm. In other words, this method represents the logarithm of the probability mass function (PMF) for the distribution. Note that due to floating point precision and under/overflow issues, this method will for some distributions be more precise and faster than computing the logarithm of IntegerDistribution.probability(int).
The default implementation simply computes the logarithm of probability(x).
- Overrides:
logProbability in class AbstractIntegerDistribution
- Parameters:
x - the point at which the PMF is evaluated
- Returns:
- the logarithm of the value of the probability mass function at x
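To illustrate the underflow issue mentioned for logProbability, the sketch below compares two routes to log(P(X = k)) in plain Java. It is an illustration under stated assumptions, not the library's code: `ZipfLogPmfSketch`, `logOfPmf` and `logPmf` are hypothetical names, and the direct route assumes the identity log(P(X = k)) = -s * log(k) - log(H(N,s)).

```java
final class ZipfLogPmfSketch {
    /** Generalized harmonic number H(n, s) = sum over k = 1..n of 1 / k^s. */
    static double generalizedHarmonic(int n, double s) {
        double sum = 0.0;
        for (int k = 1; k <= n; k++) {
            sum += Math.pow(k, -s);
        }
        return sum;
    }

    /** Naive route: evaluate the PMF first, then take its logarithm. */
    static double logOfPmf(int k, int n, double s) {
        if (k < 1 || k > n) {
            return Double.NEGATIVE_INFINITY;
        }
        return Math.log(Math.pow(k, -s) / generalizedHarmonic(n, s));
    }

    /** Direct route: -s * log(k) - log(H(n, s)), never forming k^(-s). */
    static double logPmf(int k, int n, double s) {
        if (k < 1 || k > n) {
            return Double.NEGATIVE_INFINITY;
        }
        return -s * Math.log(k) - Math.log(generalizedHarmonic(n, s));
    }

    public static void main(String[] args) {
        // Illustrative extreme exponent: 10^(-400) underflows to 0 in double precision.
        System.out.println("naive : " + logOfPmf(10, 10, 400.0));
        System.out.println("direct: " + logPmf(10, 10, 400.0));
    }
}
```

For moderate parameters both routes agree; for a large exponent the naive route collapses to negative infinity while the direct route stays finite.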
-
cumulativeProbability
For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x). In other words, this method represents the (cumulative) distribution function (CDF) for this distribution.
- Parameters:
x - the point at which the CDF is evaluated
- Returns:
- the probability that a random variable with this distribution takes a value less than or equal to x
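Because the PMF terms share one normalizing constant, the CDF can be written as a ratio of two generalized harmonic numbers, P(X <= x) = H(x,s) / H(N,s) for 1 <= x <= N. The sketch below shows that relationship in plain Java; `ZipfCdfSketch` and its methods are hypothetical names, and the clamping to 0 below the support and 1 at or above N is an assumption that follows from the support being 1..N.

```java
final class ZipfCdfSketch {
    /** Generalized harmonic number H(n, s) = sum over k = 1..n of 1 / k^s. */
    static double generalizedHarmonic(int n, double s) {
        double sum = 0.0;
        for (int k = 1; k <= n; k++) {
            sum += Math.pow(k, -s);
        }
        return sum;
    }

    /** P(X <= x) for a Zipf distribution with n elements and exponent s. */
    static double cdf(int x, int n, double s) {
        if (x <= 0) {
            return 0.0;   // below the support
        }
        if (x >= n) {
            return 1.0;   // at or beyond the upper bound of the support
        }
        return generalizedHarmonic(x, s) / generalizedHarmonic(n, s);
    }

    public static void main(String[] args) {
        for (int x = 0; x <= 4; x++) {
            System.out.printf("P(X <= %d) = %.6f%n", x, cdf(x, 3, 1.0));
        }
    }
}
```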
-
getNumericalMean
Use this method to get the numerical value of the mean of this distribution. For number of elements N and exponent s, the mean is Hs1 / Hs, where Hs1 = generalizedHarmonic(N, s - 1), Hs = generalizedHarmonic(N, s).
- Returns:
- the mean or Double.NaN if it is not defined
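The ratio of harmonic numbers follows from the definition of the expectation, assuming the PMF is (1 / k^s) / H(N,s). A short derivation, with a worked value for the illustrative case N = 3, s = 1:

```latex
\begin{aligned}
E[X] &= \sum_{k=1}^{N} k \, P(X = k)
      = \frac{1}{H_{N,s}} \sum_{k=1}^{N} \frac{k}{k^{s}}
      = \frac{1}{H_{N,s}} \sum_{k=1}^{N} \frac{1}{k^{s-1}}
      = \frac{H_{N,s-1}}{H_{N,s}}, \\
N = 3,\ s = 1:\quad
E[X] &= \frac{1 + 1 + 1}{1 + \tfrac{1}{2} + \tfrac{1}{3}}
      = \frac{3}{11/6}
      = \frac{18}{11} \approx 1.636.
\end{aligned}
```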
-
calculateNumericalMean
Used by getNumericalMean().
- Returns:
- the mean of this distribution
-
getNumericalVariance
Use this method to get the numerical value of the variance of this distribution. For number of elements N and exponent s, the variance is (Hs2 / Hs) - (Hs1^2 / Hs^2), where Hs2 = generalizedHarmonic(N, s - 2), Hs1 = generalizedHarmonic(N, s - 1), Hs = generalizedHarmonic(N, s).
- Returns:
- the variance (possibly Double.POSITIVE_INFINITY or Double.NaN if it is not defined)
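The moment formulas can be evaluated directly from three generalized harmonic numbers. The following plain-Java sketch does so; `ZipfMomentsSketch` and its methods are hypothetical names for illustration, and the local `generalizedHarmonic` helper is a stand-in written for this example rather than the library's own routine.

```java
final class ZipfMomentsSketch {
    /** Generalized harmonic number H(n, m) = sum over k = 1..n of 1 / k^m. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0.0;
        for (int k = 1; k <= n; k++) {
            sum += Math.pow(k, -m);
        }
        return sum;
    }

    /** Mean = Hs1 / Hs. */
    static double mean(int n, double s) {
        double hs1 = generalizedHarmonic(n, s - 1);
        double hs = generalizedHarmonic(n, s);
        return hs1 / hs;
    }

    /** Variance = (Hs2 / Hs) - (Hs1^2 / Hs^2), i.e. E[X^2] - E[X]^2. */
    static double variance(int n, double s) {
        double hs2 = generalizedHarmonic(n, s - 2);
        double hs1 = generalizedHarmonic(n, s - 1);
        double hs = generalizedHarmonic(n, s);
        return (hs2 / hs) - (hs1 * hs1) / (hs * hs);
    }

    public static void main(String[] args) {
        int n = 3;       // number of elements (illustrative value)
        double s = 1.0;  // exponent (illustrative value)
        System.out.printf("mean     = %.6f%n", mean(n, s));
        System.out.printf("variance = %.6f%n", variance(n, s));
    }
}
```

For N = 3 and s = 1 the exact values are 18/11 for the mean and 72/121 for the variance, which can be confirmed by summing over the three-point PMF (6/11, 3/11, 2/11).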
-
calculateNumericalVariance
Used by getNumericalVariance().
- Returns:
- the variance of this distribution
-
getSupportLowerBound
Access the lower bound of the support. This method must return the same value as inverseCumulativeProbability(0). In other words, this method must return
inf {x in Z | P(X <= x) > 0}.
The lower bound of the support is always 1 no matter the parameters.
- Returns:
- lower bound of the support (always 1)
-
getSupportUpperBound
Access the upper bound of the support. This method must return the same value as inverseCumulativeProbability(1). In other words, this method must return
inf {x in R | P(X <= x) = 1}.
The upper bound of the support is the number of elements.
- Returns:
- upper bound of the support
-
isSupportConnected
Use this method to get information about whether the support is connected, i.e. whether all integers between the lower and upper bound of the support are included in the support. The support of this distribution is connected.
- Returns:
- true
-
sample
Generate a random value sampled from this distribution. The default implementation uses the inversion method.
- Specified by:
sample in interface IntegerDistribution
- Overrides:
sample in class AbstractIntegerDistribution
- Returns:
- a random value
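The inversion method mentioned above draws a uniform value u in [0, 1) and returns the smallest k whose cumulative probability reaches u. The sketch below shows that idea in plain Java with java.util.Random; it is an illustration of generic inversion, not a reproduction of the sampler this class actually uses, and `ZipfInversionSketch`, `inverseCdf` and `sample` are hypothetical names.

```java
final class ZipfInversionSketch {
    /** Generalized harmonic number H(n, s) = sum over k = 1..n of 1 / k^s. */
    static double generalizedHarmonic(int n, double s) {
        double sum = 0.0;
        for (int k = 1; k <= n; k++) {
            sum += Math.pow(k, -s);
        }
        return sum;
    }

    /** Smallest k in 1..n with P(X <= k) >= u, for u in [0, 1]. */
    static int inverseCdf(double u, int n, double s) {
        double h = generalizedHarmonic(n, s);
        double cumulative = 0.0;
        for (int k = 1; k < n; k++) {
            cumulative += Math.pow(k, -s) / h;
            if (cumulative >= u) {
                return k;
            }
        }
        // Falling through means u lies in the last step of the CDF; returning n
        // also guards against rounding leaving the running sum slightly below 1.
        return n;
    }

    /** One draw by inversion: map a uniform variate through the inverse CDF. */
    static int sample(java.util.Random rng, int n, double s) {
        return inverseCdf(rng.nextDouble(), n, s);
    }

    public static void main(String[] args) {
        java.util.Random rng = new java.util.Random(42L); // fixed seed for repeatability
        int[] counts = new int[4];
        for (int i = 0; i < 10_000; i++) {
            counts[sample(rng, 3, 1.0)]++;
        }
        for (int k = 1; k <= 3; k++) {
            System.out.printf("k = %d drawn %d times%n", k, counts[k]);
        }
    }
}
```

This linear scan costs O(N) per draw, which is fine for small N; for large supports a precomputed cumulative table with binary search, or a rejection-based sampler, would be a more practical choice.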
-