Class KolmogorovSmirnovDistribution.One
- java.lang.Object
-
- org.apache.commons.statistics.inference.KolmogorovSmirnovDistribution.One
-
- Enclosing class:
- KolmogorovSmirnovDistribution
static final class KolmogorovSmirnovDistribution.One extends java.lang.ObjectComputes the complementary probabilityP[D_n^+ >= x]for the one-sided one-sample Kolmogorov-Smirnov distribution.D_n^+ = sup_x {CDF_n(x) - F(x)}where
nis the sample size;CDF_n(x)is an empirical cumulative distribution function; andF(x)is the expected distribution. The computation uses Smirnov's stable formula:floor(n(1-x)) (n) ( j ) (j-1) ( j ) (n-j) P[D_n^+ >= x] = x Sum ( ) ( - + x ) ( 1 - x - - ) j=0 (j) ( n ) ( n )Computing using logs is not as accurate as direct multiplication when n is large. However the terms are very large and small. Multiplication uses a scaled representation with a separate exponent term to support the extreme range. Extended precision representation of the numbers reduces the error in the power terms. Details in van Mulbregt (2018).
References:
- van Mulbregt, P. (2018). Computing the Cumulative Distribution Function and Quantiles of the One-sided Kolmogorov-Smirnov Statistic arxiv:1802.06966.
- Magg & Dicaire (1971). On Kolmogorov-Smirnov Type One-Sample Statistics Biometrika 58.3 pp. 653–656.
- Since:
- 1.1
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description (package private) static interfaceKolmogorovSmirnovDistribution.One.ScaledPowerDefines a scaled power function.
-
Field Summary
Fields Modifier and Type Field Description private static KolmogorovSmirnovDistribution.One.ScaledPowerPOWER_DEFAULTProxy for the default choice of the scaled power function.private static intSD_MAX_TERMSMaximum number of term for the Smirnov-Dwass algorithm.private static intSD_MIN_NMinimum sample size for the Smirnov-Dwass algorithm.private static intSD_SUM_PRECISION_BITSNumber of bits of precision in the sum of terms Aj.private static intSUM_PRECISION_BITSNumber of bits of precision in the sum of terms Aj.private static intVERY_LARGE_N"Very large" n to use a asymptotic limiting form.
-
Constructor Summary
Constructors Modifier Constructor Description privateOne()No instances.
-
Method Summary
All Methods Static Methods Concrete Methods Modifier and Type Method Description private static intlog2(int n)Returnsfloor(log2(n)).(package private) static doublesf(double x, int n)Calculates complementary probabilityP[D_n^+ >= x], or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.(package private) static doublesf(double x, int n, KolmogorovSmirnovDistribution.One.ScaledPower power)Calculates complementary probabilityP[D_n^+ >= x], or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.private static doublesfAsymptotic(double x, int n)Calculates complementary probabilityP[D_n^+ >= x], or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.private static doublesfExact(double x, int n)Calculates exact cases for the complementary probabilityP[D_n^+ >= x]the one-sided one-sample Kolmogorov-Smirnov distribution.(package private) static intsplitX(int n, double x, double[] alpha)Compute exactlyx = (k + alpha) / nwithkan integer andalpha in [0, 1).
-
-
-
Field Detail
-
VERY_LARGE_N
private static final int VERY_LARGE_N
"Very large" n to use a asymptotic limiting form. [1] suggests 1e12 but this is reduced to avoid excess computation time.- See Also:
- Constant Field Values
-
SD_MAX_TERMS
private static final int SD_MAX_TERMS
Maximum number of term for the Smirnov-Dwass algorithm.- See Also:
- Constant Field Values
-
SD_MIN_N
private static final int SD_MIN_N
Minimum sample size for the Smirnov-Dwass algorithm.- See Also:
- Constant Field Values
-
SUM_PRECISION_BITS
private static final int SUM_PRECISION_BITS
Number of bits of precision in the sum of terms Aj. This does not have to be the full 106 bits of a double-double as the final result is used as a double. The terms are represented as fractions with an exponent:Aj = 2^b * f f of sum(A) in [0.5, 1) f of Aj in [0.25, 2]
The terms can be added if their exponents overlap. The bits of precision must account for the extra range of the fractional part of Aj by 1 bit. Note that additional bits are added to this dynamically based on the number of terms.
- See Also:
- Constant Field Values
-
SD_SUM_PRECISION_BITS
private static final int SD_SUM_PRECISION_BITS
Number of bits of precision in the sum of terms Aj. For Smirnov-Dwass we use the full 106 bits of a double-double due to the summation of terms that cancel. Account for the extra range of the fractional part of Aj by 1 bit.- See Also:
- Constant Field Values
-
POWER_DEFAULT
private static final KolmogorovSmirnovDistribution.One.ScaledPower POWER_DEFAULT
Proxy for the default choice of the scaled power function. The actual choice is based on the chosen algorithm.
-
-
Method Detail
-
sf
static double sf(double x, int n)Calculates complementary probabilityP[D_n^+ >= x], or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.- Parameters:
x- Statistic.n- Sample size (assumed to be positive).- Returns:
- \(P(D_n^+ ≥ x)\)
-
sfExact
private static double sfExact(double x, int n)Calculates exact cases for the complementary probabilityP[D_n^+ >= x]the one-sided one-sample Kolmogorov-Smirnov distribution.Exact cases handle x not in [0, 1]. It is assumed n is positive.
- Parameters:
x- Statistic.n- Sample size (assumed to be positive).- Returns:
- \(P(D_n^+ ≥ x)\)
-
sfAsymptotic
private static double sfAsymptotic(double x, int n)Calculates complementary probabilityP[D_n^+ >= x], or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.Computes the result using the asymptotic formula Eq 5 in [1].
- Parameters:
x- Statistic.n- Sample size (assumed to be positive).- Returns:
- \(P(D_n^+ ≥ x)\)
-
sf
static double sf(double x, int n, KolmogorovSmirnovDistribution.One.ScaledPower power)Calculates complementary probabilityP[D_n^+ >= x], or survival function (SF), for the one-sided one-sample Kolmogorov-Smirnov distribution.Computes the result using double-double arithmetic. The power function can use a fast approximation or a full power computation.
This function is safe for
x > 1/n. Whenxapproaches sub-normal then division or multiplication by x can under/overflow. The case ofx < 1/ncan be computed insfExact.- Parameters:
x- Statistic (typically in (1/n, 1 - 1/n)).n- Sample size (assumed to be positive).power- Function to compute the scaled power (can be null).- Returns:
- \(P(D_n^+ ≥ x)\)
- See Also:
DD.pow(int, long[]),DDMath.pow(DD, int, long[])
-
splitX
static int splitX(int n, double x, double[] alpha)Compute exactlyx = (k + alpha) / nwithkan integer andalpha in [0, 1). Note thatk ~ floor(nx)but may be rounded up ifalpha -> 1within working precision.This computation is a significant source of increased error if performed in 64-bit arithmetic. Although the value alpha is only used for the PDF computation a value of
alpha == 0indicates the final term of the SF summation can be dropped due to the cancellation of a power term(x + j/n)to zero withx = (n-j)/n. That is ifalpha == 0then x is the fractionk/nand one Aj term is zero.- Parameters:
n- Sample size.x- Statistic.alpha- Output alpha.- Returns:
- k
-
log2
private static int log2(int n)
Returnsfloor(log2(n)).- Parameters:
n- Value.- Returns:
- approximate log2(n)
-
-