Package uk.ac.starlink.ttools.filter
Class UnivariateStats
java.lang.Object
uk.ac.starlink.ttools.filter.UnivariateStats
Calculates univariate statistics for a variable.
Feed data to an instance of this object by repeatedly calling acceptDatum(java.lang.Object, long), and then call the various accessor methods to get accumulated values.
- Since:
- 27 Apr 2006
- Author:
- Mark Taylor
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static interface | UnivariateStats.ArrayStats | Aggregates statistics acquired from a column whose values are fixed-length numeric arrays.
-
Field Summary
Fields
Modifier and Type | Field | Description
static final int | MAX_CARDINALITY | Maximum value for cardinality counters.
-
Constructor Summary
Constructors
UnivariateStats()
-
Method Summary
Modifier and Type | Method | Description
abstract void | acceptDatum(Object value, long irow) | Submits a single value to the statistics accumulator.
abstract void | addStats(UnivariateStats other) | Adds the accumulated content of a second UnivariateStats object to this one.
static UnivariateStats | createStats(Class<?> clazz, Supplier<Quantiler> qSupplier, boolean doCard) | Factory method to construct an instance of this class for accumulating particular types of values.
abstract UnivariateStats.ArrayStats | getArrayStats() | Returns an object containing statistics applicable to numeric-array-valued columns.
abstract int | getCardinality() | Returns the number of distinct non-null values submitted, if known.
abstract long | getCount() | Returns the number of good (non-null) values accumulated.
abstract Comparable<?> | getMaximum() | Returns the maximum value submitted, if applicable.
abstract long | getMaxPos() | Returns the sequence number of the maximum value submitted.
abstract Comparable<?> | getMinimum() | Returns the numeric minimum value submitted, if applicable.
abstract long | getMinPos() | Returns the sequence number of the minimum value submitted.
abstract Quantiler | getQuantiler() | Returns a quantiler ready to provide quantile values, or null if quantiles were not gathered.
abstract double | getSum() | Returns the numeric sum of values accumulated.
abstract double | getSum2() | Returns the sum of squares of values accumulated.
abstract double | getSum3() | Returns the sum of cubes of values accumulated.
abstract double | getSum4() | Returns the sum of fourth powers of values accumulated.
-
Field Details
-
MAX_CARDINALITY
public static final int MAX_CARDINALITY
Maximum value for cardinality counters.
-
Constructor Details
-
UnivariateStats
public UnivariateStats()
-
-
Method Details
-
acceptDatum
public abstract void acceptDatum(Object value, long irow)
Submits a single value to the statistics accumulator. The submitted value should be of a type compatible with the class type of this Stats object.
- Parameters:
- value - value object
- irow - row index of input value
-
addStats
public abstract void addStats(UnivariateStats other)
Adds the accumulated content of a second UnivariateStats object to this one.
- Parameters:
- other - compatible UnivariateStats object
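A merge operation of this kind lets partial results gathered separately (for instance over different row ranges) be combined into one. The following is a minimal stand-alone sketch of that idea, not the library's implementation: MiniStats, its fields, and its accept/merge/of methods are hypothetical names invented for illustration, it handles only double values, and treating NaN as a blank is an assumption.

```java
/** Hypothetical accumulator for illustration only; not part of ttools. */
final class MiniStats {
    long count;   // number of non-NaN values seen
    double sum;   // running sum of values
    double sum2;  // running sum of squared values

    /** Accumulates one value; NaN is assumed to mean "blank" and is skipped. */
    void accept(double value) {
        if (!Double.isNaN(value)) {
            count++;
            sum += value;
            sum2 += value * value;
        }
    }

    /** Folds the accumulated content of another instance into this one. */
    void merge(MiniStats other) {
        count += other.count;
        sum += other.sum;
        sum2 += other.sum2;
    }

    /** Convenience factory that accumulates a whole array of values. */
    static MiniStats of(double... values) {
        MiniStats stats = new MiniStats();
        for (double v : values) {
            stats.accept(v);
        }
        return stats;
    }
}
```

Because counts and power sums are simply additive, merging two partial accumulators gives the same totals as feeding all the values to a single one.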
-
getCount
public abstract long getCount()
Returns the number of good (non-null) values accumulated.
- Returns:
- good value count
-
getSum
public abstract double getSum()
Returns the numeric sum of values accumulated.
- Returns:
- sum of values
-
getSum2
public abstract double getSum2()
Returns the sum of squares of values accumulated.
- Returns:
- sum of squared values
-
getSum3
public abstract double getSum3()
Returns the sum of cubes of values accumulated.
- Returns:
- sum of cubed values
-
getSum4
public abstract double getSum4()
Returns the sum of fourth powers of values accumulated.
- Returns:
- sum of fourth powers
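The count and the four power sums are enough to derive the usual moment-based summaries. The sketch below shows one standard way to do that using population (divide-by-n) formulas. The class and method names are hypothetical, and this is not necessarily how ttools itself turns these sums into reported statistics.

```java
/** Hypothetical helper for illustration only; not part of ttools. */
final class PowerSumMoments {

    /** Mean from the count n and the sum of values s1. */
    static double mean(long n, double s1) {
        return s1 / n;
    }

    /** Population variance: E[x^2] - mean^2. */
    static double variance(long n, double s1, double s2) {
        double m = s1 / n;
        return s2 / n - m * m;
    }

    /** Skewness: third central moment divided by variance^(3/2). */
    static double skewness(long n, double s1, double s2, double s3) {
        double m = s1 / n;
        // E[(x-m)^3] = E[x^3] - 3 m E[x^2] + 2 m^3
        double mu3 = s3 / n - 3 * m * s2 / n + 2 * m * m * m;
        return mu3 / Math.pow(variance(n, s1, s2), 1.5);
    }

    /** Excess kurtosis: fourth central moment divided by variance^2, minus 3. */
    static double excessKurtosis(long n, double s1, double s2,
                                 double s3, double s4) {
        double m = s1 / n;
        // E[(x-m)^4] = E[x^4] - 4 m E[x^3] + 6 m^2 E[x^2] - 3 m^4
        double mu4 = s4 / n - 4 * m * s3 / n
                   + 6 * m * m * s2 / n - 3 * m * m * m * m;
        double var = variance(n, s1, s2);
        return mu4 / (var * var) - 3;
    }
}
```

With such a helper, the arguments would come from getCount(), getSum(), getSum2(), getSum3() and getSum4() respectively. Formulas of this form can lose precision when the mean is large compared with the spread of the data.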
-
getMinimum
public abstract Comparable<?> getMinimum()
Returns the numeric minimum value submitted, if applicable.
- Returns:
- minimum
-
getMaximum
public abstract Comparable<?> getMaximum()
Returns the maximum value submitted, if applicable.
- Returns:
- maximum
-
getMinPos
public abstract long getMinPos()
Returns the sequence number of the minimum value submitted. Returns -1 if there is no minimum or if the sequence number is not known.
- Returns:
- row index of minimum, or -1
-
getMaxPos
public abstract long getMaxPos()
Returns the sequence number of the maximum value submitted. Returns -1 if there is no maximum or if the sequence number is not known.
- Returns:
- row index of maximum, or -1
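The position accessors pair each extremum with the row index that was passed in alongside the value. A minimal sketch of that bookkeeping follows, under stated assumptions: the MaxTracker class is hypothetical, ties keep the first row seen, and -1 signals that no value has been submitted. The tie-breaking rule actually used by the library is not stated on this page.

```java
/** Hypothetical tracker for illustration only; not part of ttools. */
final class MaxTracker<T extends Comparable<T>> {
    private T max;             // largest value seen so far, or null if none
    private long maxPos = -1;  // row index of that value, or -1 if none

    /** Records a value and the row it came from; null values are ignored. */
    void accept(T value, long irow) {
        // Strict comparison: on a tie the earlier row is retained (assumption).
        if (value != null && (max == null || value.compareTo(max) > 0)) {
            max = value;
            maxPos = irow;
        }
    }

    /** Returns the largest value seen, or null if none was submitted. */
    T getMaximum() {
        return max;
    }

    /** Returns the row index of the largest value, or -1 if there is none. */
    long getMaxPos() {
        return maxPos;
    }
}
```

A minimum tracker would be the mirror image, with the comparison reversed.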
-
getCardinality
public abstract int getCardinality()
Returns the number of distinct non-null values submitted, if known. If the count was not collected, or if there were too many different values to count, -1 is returned.
- Returns:
- number of distinct non-null values, or -1
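The -1 result for "too many different values" suggests a counter that stops tracking once some limit is passed. The sketch below illustrates one way such a capped counter could work. CappedCardinality and its limit parameter are hypothetical; the actual value of MAX_CARDINALITY and the library's internal approach are not given on this page.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical capped distinct-value counter; not part of ttools. */
final class CappedCardinality {
    private final int limit;                     // assumed maximum countable cardinality
    private Set<Object> seen = new HashSet<>();  // set to null once the limit is exceeded

    CappedCardinality(int limit) {
        this.limit = limit;
    }

    /** Records a value; nulls are ignored, as is everything after overflow. */
    void accept(Object value) {
        if (value != null && seen != null) {
            seen.add(value);
            if (seen.size() > limit) {
                seen = null;  // give up counting and release the memory
            }
        }
    }

    /** Returns the number of distinct values, or -1 if there were too many. */
    int getCardinality() {
        return seen == null ? -1 : seen.size();
    }
}
```

Capping the set keeps memory bounded for columns with mostly unique values, at the cost of reporting no count for them.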
-
getQuantiler
public abstract Quantiler getQuantiler()
Returns a quantiler ready to provide quantile values, or null if quantiles were not gathered. If a non-null quantiler is returned, the Quantiler.ready() method will have been called on it.
- Returns:
- ready quantiler, or null
-
getArrayStats
public abstract UnivariateStats.ArrayStats getArrayStats()
Returns an object containing statistics applicable to numeric-array-valued columns.
- Returns:
- array stats object, or null
-
createStats
public static UnivariateStats createStats(Class<?> clazz, Supplier<Quantiler> qSupplier, boolean doCard)
Factory method to construct an instance of this class for accumulating particular types of values.
- Parameters:
- clazz - class of which all submitted values will be instances (if they're not null)
- qSupplier - supplier for an object that can calculate quantiles, or null if quantiles are not required
- doCard - true if an attempt is to be made to count distinct values
- Returns:
- stats accumulator
-