
ND Summary Statistics

The Normal Database (ND) feature in Sift includes a library of summary statistics that the user can choose from.

Here we list all available summary statistics that can be computed against the ND library.

If there are statistics you wish to calculate that are not included here, email info@has-motion.ca and let us know!

Measures of Central Tendency

Mean

Definition: The sum of all values divided by the total number of values.

Equation:

$$
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i
$$

Xi = each score in the dataset
n = sample size

Note: This measure is not a weighted mean or a mean of means. When the mean is computed at the workspace or library summary level in the Normal Database, it is taken across all traces from all trials (and, for a library-level summary, all workspaces) in the dataset.
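The difference between pooling all values and averaging per-trial means can be seen in a small sketch. The data and the function names `pooled_mean` and `mean_of_means` are hypothetical and are not part of Sift; the sketch only illustrates the distinction the note draws.

```python
# Hypothetical data: two trials with different numbers of values.
trials = [
    [1.0, 2.0, 3.0],  # trial A: 3 values
    [10.0],           # trial B: 1 value
]

def pooled_mean(groups):
    """Mean across every value from every group (the behaviour the note describes)."""
    values = [v for group in groups for v in group]
    return sum(values) / len(values)

def mean_of_means(groups):
    """Average of the per-group means (NOT the behaviour the note describes)."""
    group_means = [sum(group) / len(group) for group in groups]
    return sum(group_means) / len(group_means)

pooled = pooled_mean(trials)      # (1 + 2 + 3 + 10) / 4
unpooled = mean_of_means(trials)  # (2.0 + 10.0) / 2
print(pooled, unpooled)
```

The two results differ whenever the groups contain different numbers of values, because the pooled mean gives every value equal weight while the mean of means gives every group equal weight.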


Median

Definition: The value lying at the midpoint of the sorted dataset.

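Equation:

$$
\text{Median} =
\begin{cases}
X_{(n+1)/2} & \text{if } n \text{ is odd} \\
\dfrac{X_{n/2} + X_{n/2+1}}{2} & \text{if } n \text{ is even}
\end{cases}
$$

  • n is the data set/sample size
  • X is the ordered list of values in the data set (indexed from 1)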
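A minimal sketch of the odd and even cases, assuming the usual convention that an even-sized data set takes the average of the two middle values. The function name `median_of` is illustrative only.

```python
def median_of(values):
    """Return the midpoint of the sorted data set."""
    x = sorted(values)
    n = len(x)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the single middle value (0-based index n // 2).
        return x[mid]
    # Even n: average of the two middle values.
    return (x[mid - 1] + x[mid]) / 2

odd_case = median_of([3, 1, 2])       # sorted: 1, 2, 3
even_case = median_of([4, 1, 3, 2])   # sorted: 1, 2, 3, 4
print(odd_case, even_case)
```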

Mode

Definition: The value that appears most often in a data set.


Trimean

Definition: A weighted average of the median and quartiles of the dataset.

Equation: (quartile1 + median*2 + quartile3)/4
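The formula above can be sketched as follows. How the quartiles themselves are computed is an assumption here: the sketch uses Python's `statistics.quantiles` with `method="inclusive"`, which may not match the quartile convention used inside Sift.

```python
import statistics

def trimean(q1, median, q3):
    """Weighted average of the quartiles, with the median counted twice."""
    return (q1 + 2 * median + q3) / 4

# Hypothetical data set.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Assumption: inclusive quartiles; other quartile conventions give other cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
result = trimean(q1, q2, q3)
print(q1, q2, q3, result)
```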


Geometric Mean

Definition: An average of a data set using the product of the values rather than the sum. The nth root of the product of n positive numbers.

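Equation:

$$
\text{GM} = \left( \prod_{i=1}^{n} X_i \right)^{1/n}
$$

Xi = each (positive) value in the data set
n = sample size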
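As a rough illustration, the geometric mean can be computed through logarithms, which avoids overflow when many values are multiplied together. This is a sketch, not Sift's implementation, and the function name `geometric_mean_of` is hypothetical.

```python
import math

def geometric_mean_of(values):
    """nth root of the product of n positive numbers, computed via logs."""
    if not values:
        raise ValueError("geometric mean of an empty data set is undefined")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    # exp(mean(log(x))) equals (x1 * x2 * ... * xn) ** (1 / n).
    return math.exp(sum(math.log(v) for v in values) / len(values))

gm = geometric_mean_of([2.0, 8.0])  # square root of 16
print(gm)
```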


Trimmed Mean

Definition: The mean calculated after removing a percentage of the smallest and largest values in the dataset. In this implementation, a trimmed mean of 10% is used.

Equation:

$$
\bar{X}_{\text{trim}} = \frac{1}{n} \sum_{i=1}^{n} X_i
$$

Xi = each score in the trimmed data set in ascending order
n = sample size of the trimmed data set
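A minimal sketch of a trimmed mean. It assumes that the 10% is removed from each end of the sorted data; the page does not say whether the percentage applies per end or in total, so treat that, and the function name `trimmed_mean`, as assumptions.

```python
def trimmed_mean(values, percent=10):
    """Mean after dropping `percent`% of the values from EACH end (assumed convention)."""
    x = sorted(values)
    n = len(x)
    k = n * percent // 100          # number of values dropped per end
    trimmed = x[k:n - k]
    if not trimmed:
        raise ValueError("nothing left after trimming")
    return sum(trimmed) / len(trimmed)

# Hypothetical data with one large outlier.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
plain = sum(data) / len(data)       # pulled upward by the outlier
robust = trimmed_mean(data)         # drops 1 and 100, averages 2..9
print(plain, robust)
```

Because the extreme values are discarded before averaging, the trimmed mean is much less sensitive to a single outlier than the plain mean.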


Measures of Dispersion

Standard Deviation

Definition: The square root of the variance of the dataset relative to its mean.

Equation:

$$
\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2}
$$

Xi = each score in the data set
μ = mean of the data set
n = sample size of the data set


Variance

Definition: The mean of the squared differences between each data point and the mean of the data set.

Equation:

$$
\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2
$$

Xi = each score in the data set
μ = mean of the data set
n = sample size of the data set
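The sketch below follows the definitions literally: the variance is the mean of the squared differences from the mean (dividing by n), and the standard deviation is its square root. Whether Sift divides by n or by n - 1 is not stated on this page, so the population form used here is an assumption.

```python
import math

def variance_of(values):
    """Mean of squared differences from the mean (divides by n; assumed convention)."""
    n = len(values)
    if n == 0:
        raise ValueError("variance of an empty data set is undefined")
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

def standard_deviation_of(values):
    """Square root of the variance."""
    return math.sqrt(variance_of(values))

# Hypothetical data set with mean 5.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
var = variance_of(data)            # squared differences sum to 32; 32 / 8
sd = standard_deviation_of(data)
print(var, sd)
```

Dividing by n - 1 instead (the sample form) would give a slightly larger value for the same data.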


Range

Definition: The size of the narrowest interval that contains all data points in the dataset. Found as the difference between the largest and smallest values in the data set.

Equation:

$$
\text{Range} = \max(X) - \min(X)
$$

X = the data set


Quartiles

Definition: The division of a data set (in ascending order) into four equal portions, each containing 25% of the data points.

Equation:

N = the number of points in the data set


Interquartile Range

Definition: The range of the middle 50% of the data set. Found by finding the difference between the third quartile (75th percentile) and first quartile (25th percentile).

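Equation:

$$
\text{IQR} = Q_3 - Q_1
$$

Q1 = the first quartile (25th percentile)
Q3 = the third quartile (75th percentile)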


Percentiles

Definition: The value of a sorted data set where x% of the data is below that value.

Equation:

x = percentile of interest
Pi = index of xth percentile in the data set
n = total number of points in the data set
X = sorted data set
Px = the data value at index Pi

Note: in this implementation, we provide the 5th and 95th percentiles.
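One common way to turn the variables above into code is the nearest-rank method, where the index is the percentile fraction of n rounded up. This is an assumed convention for illustration only; the exact index formula used by Sift is not given on this page, and interpolating methods will return different values.

```python
import math

def nearest_rank_percentile(values, x):
    """Value at the xth percentile of the sorted data (nearest-rank; assumed convention)."""
    if not values:
        raise ValueError("percentile of an empty data set is undefined")
    if not 0 < x <= 100:
        raise ValueError("x must be in the interval (0, 100]")
    ordered = sorted(values)                  # X: the sorted data set
    n = len(ordered)                          # n: total number of points
    rank = max(1, math.ceil(x * n / 100))     # Pi: 1-based index of the percentile
    return ordered[rank - 1]                  # Px: the data value at index Pi

# Hypothetical data: the integers 1 through 20.
data = list(range(1, 21))
p5 = nearest_rank_percentile(data, 5)
p95 = nearest_rank_percentile(data, 95)
print(p5, p95)
```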


Mean Absolute Deviation

Definition: A measure of variability representing the average distance between each data point and the mean of the data set.

Equation:

$$
\text{MAD}_{\text{mean}} = \frac{1}{N} \sum_{i=1}^{N} \left| X_i - \mu \right|
$$

N = the number of points in the data set
μ = the mean of the data set
Xi = each data point in the data set


Median Absolute Deviation

Definition: A measure of variability representing the average distance between each data point and the median of the data set.

Equation:

$$
\text{MAD}_{\text{median}} = \frac{1}{N} \sum_{i=1}^{N} \left| X_i - \mu \right|
$$

N = the number of points in the data set
μ = the median of the data set
Xi = each data point in the data set
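Both absolute-deviation measures share the same shape and differ only in the centre they measure from, which the sketch below makes explicit. It follows the definitions as written on this page (an average distance in both cases); the helper name `average_absolute_deviation` is hypothetical and this is not Sift's implementation.

```python
import statistics

def average_absolute_deviation(values, center):
    """Average distance between each data point and the given centre."""
    if not values:
        raise ValueError("deviation of an empty data set is undefined")
    return sum(abs(x - center) for x in values) / len(values)

# Hypothetical data set with one large value.
data = [1, 2, 3, 4, 10]

about_mean = average_absolute_deviation(data, statistics.mean(data))      # centre = 4
about_median = average_absolute_deviation(data, statistics.median(data))  # centre = 3
print(about_mean, about_median)
```

Note that some references use the name "median absolute deviation" for the median (rather than the average) of the absolute deviations from the median; that variant would give a different value for the same data.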

sift/build_nds/nd_summary_statistics.txt · Last modified: 2024/12/18 14:47 by wikisysop