Class Quantile

java.lang.Object
org.apache.commons.statistics.descriptive.Quantile

public final class Quantile extends Object
Provides quantile computation.

For values of length n:

  • The result is NaN if n = 0.
  • The result is values[0] if n = 1.
  • Otherwise the result is computed using the Quantile.EstimationMethod.
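The three cases above can be sketched in plain Java. This is a minimal illustration, not the library's implementation. It assumes the input is already sorted, and it hard-codes Hyndman and Fan's method 8 (one-based position h = (n + 1/3)p + 1/3) because the documentation of the default instance names that method as recommended. The class and method names are hypothetical.

```java
/** Hypothetical sketch; not part of Commons Statistics. */
final class QuantileSketch {
    private QuantileSketch() {}

    /**
     * Estimate the p-th quantile of already sorted values.
     * Assumes Hyndman and Fan's method 8 for the interpolation.
     */
    static double quantile(double[] sorted, double p) {
        // The negated form also rejects NaN
        if (!(p >= 0 && p <= 1)) {
            throw new IllegalArgumentException("Invalid probability: " + p);
        }
        final int n = sorted.length;
        if (n == 0) {
            return Double.NaN;
        }
        if (n == 1) {
            return sorted[0];
        }
        // One-based real-valued position, clamped to [1, n]
        final double h = Math.min(n, Math.max(1, (n + 1.0 / 3) * p + 1.0 / 3));
        final int lower = (int) Math.floor(h);
        final double fraction = h - lower;
        if (fraction == 0) {
            // Exactly on an order statistic: no interpolation required
            return sorted[lower - 1];
        }
        // Linear interpolation between the two adjacent order statistics
        return sorted[lower - 1] + fraction * (sorted[lower] - sorted[lower - 1]);
    }
}
```

With this sketch a probability of 0 returns the minimum and a probability of 1 returns the maximum, because the position is clamped to the ends of the data.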

Computation of multiple quantiles will handle duplicate and unordered probabilities. Passing ordered probabilities is recommended if the order is already known, as this can improve efficiency; examples are probabilities with uniform spacing through the array data, or probabilities that identify extreme values from the data such as [0.001, 0.999].
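One way to handle duplicate and unordered probabilities is to process them in ascending order and write each result back at the caller's original position. The sketch below is an assumption about how such handling could work, not a description of the library internals. The class name, the method name and the estimator argument (any function from a probability to a quantile) are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch; not part of Commons Statistics. */
final class ProbabilityOrderSketch {
    private ProbabilityOrderSketch() {}

    /**
     * Evaluate the estimator once per distinct probability, visiting the
     * probabilities in ascending order, and return results in input order.
     */
    static double[] evaluateAll(DoubleUnaryOperator estimator, double... p) {
        // Positions of p ordered by ascending probability
        final Integer[] order = new Integer[p.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble(i -> p[i]));

        final double[] q = new double[p.length];
        for (int j = 0; j < order.length; j++) {
            final int i = order[j];
            if (j > 0 && p[i] == p[order[j - 1]]) {
                // Duplicate probability: reuse the previous result
                q[i] = q[order[j - 1]];
            } else {
                q[i] = estimator.applyAsDouble(p[i]);
            }
        }
        return q;
    }
}
```

If the probabilities are already ordered the sort has nothing to rearrange, which is one reason passing ordered input can be cheaper.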

This implementation respects the ordering imposed by Double.compare(double, double) for NaN values. If a NaN occurs in the selected positions in the fully sorted values then the result is NaN.

The NaNPolicy can be used to change the behaviour on NaN values.

Instances of this class are immutable and thread-safe.

Since:
1.1
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static enum 
    Quantile.EstimationMethod
    Estimation methods for a quantile.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    private final boolean
    copy
    Flag to indicate if the data should be copied.
    private static final Quantile
    DEFAULT
    Default instance.
    private final Quantile.EstimationMethod
    estimationType
    Estimation type used to determine the value from the quantile.
    private static final String
    INVALID_NUMBER_OF_PROBABILITIES
    Message when the number of probabilities in a range is not valid.
    private static final String
    INVALID_PROBABILITY
    Message when the probability is not in the range [0, 1].
    private static final String
    INVALID_SIZE
    Message when the size is not valid.
    private final NaNPolicy
    nanPolicy
    NaN policy for floating point data.
    private final NaNTransformer
    nanTransformer
    Transformer for NaN data.
    private static final String
    NO_PROBABILITIES_SPECIFIED
    Message when no probabilities are provided for the varargs method.
  • Constructor Summary

    Constructors
    Modifier
    Constructor
    Description
    private
    Quantile(boolean copy, NaNPolicy nanPolicy, Quantile.EstimationMethod estimationType)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    private static void
    checkNumberOfProbabilities(int n)
    Check the number of probabilities n is strictly positive.
    private static void
    checkProbabilities(double... p)
    Check the probabilities p are in the range [0, 1].
    private static void
    checkProbability(double p)
    Check the probability p is in the range [0, 1].
    private static void
    checkSize(int n)
    Check the size is positive.
    private double
    compute(double[] values, int from, int to, double p)
    Compute the p-th quantile of the specified range of values.
    private double[]
    compute(double[] values, int from, int to, double... p)
    Compute the p-th quantiles of the specified range of values.
    private double
    compute(int[] values, int from, int to, double p)
    Compute the p-th quantile of the specified range of values.
    private double[]
    compute(int[] values, int from, int to, double... p)
    Compute the p-th quantiles of the specified range of values.
    private int[]
    computeIndices(int n, double[] p, double[] q, int offset)
    Compute the indices required for quantile interpolation.
    double
    evaluate(double[] values, double p)
    Evaluate the p-th quantile of the values.
    double[]
    evaluate(double[] values, double... p)
    Evaluate the p-th quantiles of the values.
    double
    evaluate(int[] values, double p)
    Evaluate the p-th quantile of the values.
    double[]
    evaluate(int[] values, double... p)
    Evaluate the p-th quantiles of the values.
    double
    evaluate(int n, IntToDoubleFunction values, double p)
    Evaluate the p-th quantile of the values.
    double[]
    evaluate(int n, IntToDoubleFunction values, double... p)
    Evaluate the p-th quantiles of the values.
    double
    evaluateRange(double[] values, int from, int to, double p)
    Evaluate the p-th quantile of the specified range of values.
    double[]
    evaluateRange(double[] values, int from, int to, double... p)
    Evaluate the p-th quantiles of the specified range of values.
    double
    evaluateRange(int[] values, int from, int to, double p)
    Evaluate the p-th quantile of the specified range of values.
    double[]
    evaluateRange(int[] values, int from, int to, double... p)
    Evaluate the p-th quantiles of the specified range of values.
    static double[]
    probabilities(int n)
    Generate n evenly spaced probabilities in the range [0, 1].
    static double[]
    probabilities(int n, double p1, double p2)
    Generate n evenly spaced probabilities in the range [p1, p2].
    Quantile
    with(NaNPolicy v)
    Return an instance with the configured NaNPolicy.
    Quantile
    with(Quantile.EstimationMethod v)
    Return an instance with the configured Quantile.EstimationMethod.
    Quantile
    withCopy(boolean v)
    Return an instance with the configured copy behaviour.
    static Quantile
    withDefaults()
    Return a new instance with the default options.

    Methods inherited from class Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • INVALID_PROBABILITY

      private static final String INVALID_PROBABILITY
      Message when the probability is not in the range [0, 1].
    • NO_PROBABILITIES_SPECIFIED

      private static final String NO_PROBABILITIES_SPECIFIED
      Message when no probabilities are provided for the varargs method.
    • INVALID_SIZE

      private static final String INVALID_SIZE
      Message when the size is not valid.
    • INVALID_NUMBER_OF_PROBABILITIES

      private static final String INVALID_NUMBER_OF_PROBABILITIES
      Message when the number of probabilities in a range is not valid.
    • DEFAULT

      private static final Quantile DEFAULT
      Default instance. Method 8 is recommended by Hyndman and Fan.
    • copy

      private final boolean copy
      Flag to indicate if the data should be copied.
    • nanPolicy

      private final NaNPolicy nanPolicy
      NaN policy for floating point data.
    • nanTransformer

      private final NaNTransformer nanTransformer
      Transformer for NaN data.
    • estimationType

      private final Quantile.EstimationMethod estimationType
      Estimation type used to determine the value from the quantile.
  • Constructor Details

    • Quantile

      private Quantile(boolean copy, NaNPolicy nanPolicy, Quantile.EstimationMethod estimationType)
      Parameters:
      copy - Flag to indicate if the data should be copied.
      nanPolicy - NaN policy.
      estimationType - Estimation type used to determine the value from the quantile.
  • Method Details

    • withDefaults

      public static Quantile withDefaults()
      Return a new instance with the default options.

      Note: The default options configure in-place processing and include NaN values in the data. This is the most efficient mode and has the smallest memory consumption.

      Returns:
      the quantile implementation
    • withCopy

      public Quantile withCopy(boolean v)
      Return an instance with the configured copy behaviour. If false then the input array will be modified by the call to evaluate the quantiles; otherwise the computation uses a copy of the data.
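The difference between the two copy settings can be illustrated without the library. In this minimal sketch a full sort stands in for the partial sort the documentation mentions, and the class and method names are hypothetical. It only shows the side effect on the caller's array, not how the library selects values.

```java
import java.util.Arrays;

/** Hypothetical sketch; not part of Commons Statistics. */
final class CopyBehaviourSketch {
    private CopyBehaviourSketch() {}

    /**
     * Return the middle order statistic of a non-empty array.
     *
     * @param values Values (reordered in place when copy is false).
     * @param copy   true to work on a copy and leave the input untouched.
     */
    static double middle(double[] values, boolean copy) {
        final double[] work = copy ? values.clone() : values;
        // Full sort used as a simple stand-in for a partial sort
        Arrays.sort(work);
        return work[work.length / 2];
    }
}
```

Working in place avoids allocating a second array, which is the trade-off behind the default of not copying.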
      Parameters:
      v - Value.
      Returns:
      an instance
    • with

      public Quantile with(NaNPolicy v)
      Return an instance with the configured NaNPolicy.

      Note: This implementation respects the ordering imposed by Double.compare(double, double) for NaN values: NaN is considered greater than all other values, and all NaN values are equal. The NaNPolicy changes the computation of the statistic in the presence of NaN values.

      • NaNPolicy.INCLUDE: NaN values are moved to the end of the data; the size of the data includes the NaN values and the quantile will be NaN if any value used for quantile interpolation is NaN.
      • NaNPolicy.EXCLUDE: NaN values are moved to the end of the data; the size of the data excludes the NaN values and the quantile will never be NaN for non-zero size. If all data are NaN then the size is zero and the result is NaN.
      • NaNPolicy.ERROR: An exception is raised if the data contains NaN values.

      Note that the result is identical for all policies if no NaN values are present.
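The effect of each policy on the size of the data can be sketched as follows. This is an illustration under stated assumptions, not the library's NaNTransformer: the Policy enum is a local stand-in for NaNPolicy, and the class and method names are hypothetical.

```java
/** Hypothetical sketch; not part of Commons Statistics. */
final class NaNPolicySketch {
    /** Local stand-in for the library's NaNPolicy. */
    enum Policy { INCLUDE, EXCLUDE, ERROR }

    private NaNPolicySketch() {}

    /** Move NaN values to the end; return the count of non-NaN values. */
    static int moveNaNToEnd(double[] values) {
        int end = values.length;
        for (int i = values.length - 1; i >= 0; i--) {
            final double v = values[i];
            if (v != v) {
                // v is NaN: swap it with the last value known to be non-NaN
                values[i] = values[--end];
                values[end] = v;
            }
        }
        return end;
    }

    /** Size of the data used for the quantile under the given policy. */
    static int effectiveSize(double[] values, Policy policy) {
        final int nonNaN = moveNaNToEnd(values);
        switch (policy) {
        case EXCLUDE:
            return nonNaN;
        case ERROR:
            if (nonNaN != values.length) {
                throw new IllegalArgumentException("NaN in the data");
            }
            return nonNaN;
        default:
            // INCLUDE: NaN values count towards the size
            return values.length;
        }
    }
}
```

When the data contain no NaN values all three branches return the full length, which matches the note that the policies give identical results in that case.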

      Parameters:
      v - Value.
      Returns:
      an instance
    • with

      public Quantile with(Quantile.EstimationMethod v)
      Return an instance with the configured Quantile.EstimationMethod.
      Parameters:
      v - Value.
      Returns:
      an instance
    • probabilities

      public static double[] probabilities(int n)
      Generate n evenly spaced probabilities in the range [0, 1].
      1/(n + 1), 2/(n + 1), ..., n/(n + 1)
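The sequence above can be reproduced with a few lines of plain Java. This is a minimal sketch of the stated formula only; the class name is hypothetical and the exception message is an assumption.

```java
/** Hypothetical sketch; not part of Commons Statistics. */
final class EvenProbabilitiesSketch {
    private EvenProbabilitiesSketch() {}

    /** Generate k / (n + 1) for k = 1, ..., n. */
    static double[] probabilities(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("Invalid number of probabilities: " + n);
        }
        final double[] p = new double[n];
        final double denominator = n + 1.0;
        for (int k = 1; k <= n; k++) {
            p[k - 1] = k / denominator;
        }
        return p;
    }
}
```

For n = 3 the formula gives 0.25, 0.5 and 0.75, so the end points 0 and 1 are never generated.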
      
      Parameters:
      n - Number of probabilities.
      Returns:
      the probabilities
      Throws:
      IllegalArgumentException - if n < 1
    • probabilities

      public static double[] probabilities(int n, double p1, double p2)
      Generate n evenly spaced probabilities in the range [p1, p2].
      w = p2 - p1
      p1 + w/(n + 1), p1 + 2w/(n + 1), ..., p1 + nw/(n + 1)
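A minimal sketch of the formula above, including the argument checks listed for this method. The class name and exception messages are hypothetical, and the library may validate its arguments differently.

```java
/** Hypothetical sketch; not part of Commons Statistics. */
final class RangeProbabilitiesSketch {
    private RangeProbabilitiesSketch() {}

    /** Generate p1 + k * w / (n + 1) for k = 1, ..., n with w = p2 - p1. */
    static double[] probabilities(int n, double p1, double p2) {
        if (n < 1) {
            throw new IllegalArgumentException("Invalid number of probabilities: " + n);
        }
        // Valid only when 0 <= p1 < p2 <= 1; the negated form also rejects NaN
        if (!(p1 >= 0 && p1 < p2 && p2 <= 1)) {
            throw new IllegalArgumentException("Invalid range: [" + p1 + ", " + p2 + "]");
        }
        final double w = p2 - p1;
        final double[] p = new double[n];
        for (int k = 1; k <= n; k++) {
            p[k - 1] = p1 + k * w / (n + 1.0);
        }
        return p;
    }
}
```

For n = 3, p1 = 0.2 and p2 = 0.6 the values are approximately 0.3, 0.4 and 0.5; the bounds p1 and p2 themselves are excluded.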
      
      Parameters:
      n - Number of probabilities.
      p1 - Lower probability.
      p2 - Upper probability.
      Returns:
      the probabilities
      Throws:
      IllegalArgumentException - if n < 1; if the probabilities are not in the range [0, 1]; or p2 <= p1.
    • evaluate

      public double evaluate(double[] values, double p)
      Evaluate the p-th quantile of the values.

      Note: This method may partially sort the input values if not configured to copy the input data.

      Performance

      This method is not recommended for repeated calls that compute different quantiles of the same values. Use the evaluate(double[], double...) method instead, which provides better performance.

      Parameters:
      values - Values.
      p - Probability for the quantile to compute.
      Returns:
      the quantile
      Throws:
      IllegalArgumentException - if the probability p is not in the range [0, 1]; or if the values contain NaN and the configuration is NaNPolicy.ERROR
    • evaluateRange

      public double evaluateRange(double[] values, int from, int to, double p)
      Evaluate the p-th quantile of the specified range of values.

      Note: This method may partially sort the input values if not configured to copy the input data.

      Performance

      This method is not recommended for repeated calls that compute different quantiles of the same values. Use the evaluateRange(double[], int, int, double...) method instead, which provides better performance.

      Parameters:
      values - Values.
      from - Inclusive start of the range.
      to - Exclusive end of the range.
      p - Probability for the quantile to compute.
      Returns:
      the quantile
      Throws:
      IllegalArgumentException - if the probability p is not in the range [0, 1]; or if the values contain NaN and the configuration is NaNPolicy.ERROR
      IndexOutOfBoundsException - if the sub-range is out of bounds
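The half-open range convention (from inclusive, to exclusive) and the bounds check can be sketched as below. This is an illustration only: the exact conditions the library treats as out of bounds are an assumption, a full sort of a copy stands in for the library's partial sort, and the class and method names are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical sketch; not part of Commons Statistics. */
final class SubRangeSketch {
    private SubRangeSketch() {}

    /** Return a sorted copy of values[from, to); the input is not modified. */
    static double[] sortedRange(double[] values, int from, int to) {
        // Assumed validity conditions: 0 <= from <= to <= values.length
        if (from < 0 || from > to || to > values.length) {
            throw new IndexOutOfBoundsException(
                "Invalid range [" + from + ", " + to + ") for length " + values.length);
        }
        final double[] range = Arrays.copyOfRange(values, from, to);
        Arrays.sort(range);
        return range;
    }
}
```

Only the values inside the range take part in the computation: for the array {9, 3, 1, 2, 8} and the range [1, 4) the data are 3, 1 and 2, so a quantile at p = 0 is 1 and at p = 1 is 3.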
      Since:
      1.2
    • compute

      private double compute(double[] values, int from, int to, double p)
      Compute the p-th quantile of the specified range of values.
      Parameters:
      values - Values.
      from - Inclusive start of the range.
      to - Exclusive end of the range.
      p - Probability for the quantile to compute.
      Returns:
      the quantile
      Throws:
      IllegalArgumentException - if the probability p is not in the range [0, 1]
    • evaluate

      public double[] evaluate(double[] values, double... p)
      Evaluate the p-th quantiles of the values.

      Note: This method may partially sort the input values if not configured to copy the input data.

      Parameters:
      values - Values.
      p - Probabilities for the quantiles to compute.
      Returns:
      the quantiles
      Throws:
      IllegalArgumentException - if any probability p is not in the range [0, 1]; no probabilities are specified; or if the values contain NaN and the configuration is NaNPolicy.ERROR
    • evaluateRange

      public double[] evaluateRange(double[] values, int from, int to, double... p)
      Evaluate the p-th quantiles of the specified range of values.

      Note: This method may partially sort the input values if not configured to copy the input data.

      Parameters:
      values - Values.
      from - Inclusive start of the range.
      to - Exclusive end of the range.
      p - Probabilities for the quantiles to compute.
      Returns:
      the quantiles
      Throws:
      IllegalArgumentException - if any probability p is not in the range [0, 1]; no probabilities are specified; or if the values contain NaN and the configuration is NaNPolicy.ERROR
      IndexOutOfBoundsException - if the sub-range is out of bounds
      Since:
      1.2
    • compute

      private double[] compute(double[] values, int from, int to, double... p)
      Compute the p-th quantiles of the specified range of values.
      Parameters:
      values - Values.
      from - Inclusive start of the range.
      to - Exclusive end of the range.
      p - Probabilities for the quantiles to compute.
      Returns:
      the quantiles
      Throws:
      IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
    • evaluate

      public double evaluate(int[] values, double p)
      Evaluate the p-th quantile of the values.

      Note: This method may partially sort the input values if not configured to copy the input data.

      Performance

      This method is not recommended for repeated calls that compute different quantiles of the same values. Use the evaluate(int[], double...) method instead, which provides better performance.

      Parameters:
      values - Values.
      p - Probability for the quantile to compute.
      Returns:
      the quantile
      Throws:
      IllegalArgumentException - if the probability p is not in the range [0, 1]
    • evaluateRange

      public double evaluateRange(int[] values, int from, int to, double p)
      Evaluate the p-th quantile of the specified range of values.

      Note: This method may partially sort the input values if not configured to copy the input data.

      Performance

      This method is not recommended for repeated calls that compute different quantiles of the same values. Use the evaluateRange(int[], int, int, double...) method instead, which provides better performance.

      Parameters:
      values - Values.
      from - Inclusive start of the range.
      to - Exclusive end of the range.
      p - Probability for the quantile to compute.
      Returns:
      the quantile
      Throws:
      IllegalArgumentException - if the probability p is not in the range [0, 1]
      IndexOutOfBoundsException - if the sub-range is out of bounds
      Since:
      1.2
    • compute

      private double compute(int[] values, int from, int to, double p)
      Compute the p-th quantile of the specified range of values.
      Parameters:
      values - Values.
      from - Inclusive start of the range.
      to - Exclusive end of the range.
      p - Probability for the quantile to compute.
      Returns:
      the quantile
      Throws:
      IllegalArgumentException - if the probability p is not in the range [0, 1]
    • evaluate

      public double[] evaluate(int[] values, double... p)
      Evaluate the p-th quantiles of the values.

      Note: This method may partially sort the input values if not configured to copy the input data.

      Parameters:
      values - Values.
      p - Probabilities for the quantiles to compute.
      Returns:
      the quantiles
      Throws:
      IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
    • evaluateRange

      public double[] evaluateRange(int[] values, int from, int to, double... p)
      Evaluate the p-th quantiles of the specified range of values.

      Note: This method may partially sort the input values if not configured to copy the input data.

      Parameters:
      values - Values.
      from - Inclusive start of the range.
      to - Exclusive end of the range.
      p - Probabilities for the quantiles to compute.
      Returns:
      the quantiles
      Throws:
      IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
      IndexOutOfBoundsException - if the sub-range is out of bounds
      Since:
      1.2
    • compute

      private double[] compute(int[] values, int from, int to, double... p)
      Compute the p-th quantiles of the specified range of values.

      Note: This method may partially sort the input values if not configured to copy the input data.

      Parameters:
      values - Values.
      from - Inclusive start of the range.
      to - Exclusive end of the range.
      p - Probabilities for the quantiles to compute.
      Returns:
      the quantiles
      Throws:
      IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
    • evaluate

      public double evaluate(int n, IntToDoubleFunction values, double p)
      Evaluate the p-th quantile of the values.

      This method can be used when the values of known size are already sorted.

      short[] x = ...
      Arrays.sort(x);
      double q = Quantile.withDefaults().evaluate(x.length, i -> x[i], 0.05);
      
      Parameters:
      n - Size of the values.
      values - Values function.
      p - Probability for the quantile to compute.
      Returns:
      the quantile
      Throws:
      IllegalArgumentException - if size < 0; or if the probability p is not in the range [0, 1].
    • evaluate

      public double[] evaluate(int n, IntToDoubleFunction values, double... p)
      Evaluate the p-th quantiles of the values.

      This method can be used when the values of known size are already sorted.

      short[] x = ...
      Arrays.sort(x);
      double[] q = Quantile.withDefaults().evaluate(x.length, i -> x[i], 0.25, 0.5, 0.75);
      
      Parameters:
      n - Size of the values.
      values - Values function.
      p - Probabilities for the quantiles to compute.
      Returns:
      the quantiles
      Throws:
      IllegalArgumentException - if size < 0; if any probability p is not in the range [0, 1]; or no probabilities are specified.
    • checkProbability

      private static void checkProbability(double p)
      Check the probability p is in the range [0, 1].
      Parameters:
      p - Probability for the quantile to compute.
      Throws:
      IllegalArgumentException - if the probability is not in the range [0, 1]
    • checkProbabilities

      private static void checkProbabilities(double... p)
      Check the probabilities p are in the range [0, 1].
      Parameters:
      p - Probabilities for the quantiles to compute.
      Throws:
      IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
    • checkSize

      private static void checkSize(int n)
      Check the size is non-negative.
      Parameters:
      n - Size of the values.
      Throws:
      IllegalArgumentException - if size < 0
    • checkNumberOfProbabilities

      private static void checkNumberOfProbabilities(int n)
      Check the number of probabilities n is strictly positive.
      Parameters:
      n - Number of probabilities.
      Throws:
      IllegalArgumentException - if n < 1
    • computeIndices

      private int[] computeIndices(int n, double[] p, double[] q, int offset)
      Compute the indices required for quantile interpolation.

      The zero-based interpolation index in [0, n) is saved into the working array q for each p.

      The indices are incremented by the provided offset to allow addressing sub-ranges of a larger array.

      Parameters:
      n - Size of the data.
      p - Probabilities for the quantiles to compute.
      q - Working array for quantiles in [0, n).
      offset - Array offset.
      Returns:
      the indices in [offset, offset + n)