trimmean

Mean excluding outliers

Syntax

m = trimmean(X,percent)
m = trimmean(X,percent,dim)
m = trimmean(X,percent,flag)
m = trimmean(X,percent,flag,dim)

Description

m = trimmean(X,percent) calculates the trimmed mean of the values in X. For a vector input, m is the mean of X, excluding the highest and lowest k data values, where k=n*(percent/100)/2 and where n is the number of values in X. For a matrix input, m is a row vector containing the trimmed mean of each column of X. For n-D arrays, trimmean operates along the first non-singleton dimension. percent is a scalar between 0 and 100.
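As a rough illustration of the trimming rule for a vector, the following sketch sorts a small made-up data set, drops the k lowest and k highest values by hand, and compares the result with trimmean. The data values and the variable names xs and manual are assumptions chosen for illustration, and percent is picked so that k is an integer.

```matlab
x = [1 2 3 4 5 6 7 8 9 100];   % made-up data with one large value
percent = 20;
n = numel(x);
k = n*(percent/100)/2;         % k = 1, an integer for this choice of percent

xs = sort(x);                  % order the values
manual = mean(xs(k+1:n-k));    % drop the k lowest and k highest: mean of 2..9 = 5.5

m = trimmean(x,percent);       % expected to match manual (5.5)
```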

m = trimmean(X,percent,dim) takes the trimmed mean along dimension dim of X.
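A minimal sketch of the dim argument, using a made-up 2-by-5 matrix (the values and the names mcol and mrow are illustrative assumptions):

```matlab
X = [1 2 3 4 100;
     5 6 7 8 -50];          % made-up 2-by-5 matrix, one extreme value per row

mcol = trimmean(X,40);      % default: along columns (dimension 1), 1-by-5 result
mrow = trimmean(X,40,2);    % along rows (dimension 2), 2-by-1 result

% With n = 5 values per row and percent = 40, k = 1, so each row loses its
% lowest and highest value: mrow is expected to be [3; 6].
```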

m = trimmean(X,percent,flag) controls how to trim when k is not an integer. flag can be chosen from the following:

'round'   Round k to the nearest integer. If k is a half integer, round down to the smaller integer. This is the default.
'floor'   Round k down to the next smaller integer.
'weight'  If k = i + f, where i is the integer part and f is the fraction, compute a weighted mean with weight (1-f) for the (i+1)th and (n-i)th values, and full weight for the values between them.
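The three rules can be sketched by hand for a case where k is not an integer. The data, the choice percent = 35, and all variable names below are assumptions for illustration. The manual calculations follow the descriptions above, and the trimmean calls at the end should, per those descriptions, return the same values.

```matlab
x = [1 2 3 4 5 6 7 8 20 100];  % made-up data, already sorted
percent = 35;
n = numel(x);
k = n*(percent/100)/2;         % k = 1.75, not an integer

% 'round': k becomes 2, so the two lowest and two highest values are dropped.
% (MATLAB's round differs from the rule above only when k is a half integer.)
kr = round(k);
mRound = mean(x(kr+1:n-kr));   % (3+4+5+6+7+8)/6 = 5.5

% 'floor': k becomes 1
kf = floor(k);
mFloor = mean(x(kf+1:n-kf));   % (2+3+4+5+6+7+8+20)/8 = 6.875

% 'weight': integer part 1, fraction 0.75; the 2nd and 9th values get
% weight 1-f = 0.25 and the 3rd through 8th values get full weight
iPart = floor(k);
f = k - iPart;
w = 1 - f;
mWeight = (w*x(iPart+1) + sum(x(iPart+2:n-iPart-1)) + w*x(n-iPart)) ...
          / (2*w + (n - 2*iPart - 2));   % (0.5 + 33 + 5)/6.5, about 5.9231

% Corresponding trimmean calls
tRound  = trimmean(x,percent,'round');
tFloor  = trimmean(x,percent,'floor');
tWeight = trimmean(x,percent,'weight');
```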

m = trimmean(X,percent,flag,dim) takes the trimmed mean along dimension dim of X, using the trimming rule specified by flag.

Examples

Efficiency of the Trimmed Mean

Generate a 100-by-100 matrix of random numbers from the standard normal distribution. This represents 100 samples, each containing 100 data points.

rng default;  % For reproducibility
x = normrnd(0,1,100,100);

Compute the sample mean and the 10% trimmed mean for each column of the data matrix.

m = mean(x);
trim = trimmean(x,10);

Compute the efficiency of the 10% trimmed mean relative to the sample mean for the data.

sm = std(m);
strim = std(trim);
efficiency = (sm/strim).^2
efficiency =

    0.9663
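The efficiency computed here is the ratio of the variance of the sample means to the variance of the trimmed means across the 100 samples. The sketch below writes that ratio with var and checks that it agrees with the squared ratio of standard deviations. It uses randn as a base-MATLAB stand-in for normrnd(0,1,100,100), which is an assumption, so the numeric result is not claimed to reproduce the value shown above.

```matlab
rng default;                     % For reproducibility
x = randn(100,100);              % stand-in for normrnd(0,1,100,100)

m = mean(x);                     % sample mean of each column
trim = trimmean(x,10);           % 10% trimmed mean of each column

effVar = var(m)/var(trim);       % ratio of estimator variances
effStd = (std(m)/std(trim)).^2;  % same quantity, as written in the example
```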

Trimmed Mean for Distributions with Outliers

Generate random data from the Student's t distribution with 1 degree of freedom, which tends to produce outliers.

rng default;  % For reproducibility
x = trnd(1,40,1);

Visualize the distribution using a normal probability plot.

probplot(x)

Although the distribution is symmetric around zero, the sample contains several outliers that affect the mean. The trimmed mean is closer to zero and is more representative of the data.

smean = mean(x)   % named smean so that the mean function is not shadowed
tmean = trimmean(x,25)
smean =

    2.7991


tmean =

    0.8797

More About

Tips

The trimmed mean is a robust estimate of the location of a sample. If there are outliers in the data, the trimmed mean is a more representative estimate of the center of the body of the data than the mean. However, if the data all come from the same probability distribution and contain no outliers, as with the normally distributed data in the first example, the trimmed mean is less efficient than the sample mean as an estimator of the location of the data.
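A small made-up data set shows the robustness in practice. The values and variable names are illustrative assumptions: nine readings near 10 and one outlier of 500.

```matlab
x = [9.8 10.1 10.0 9.9 10.2 10.0 9.7 10.3 10.0 500];  % made-up data; 500 is an outlier

mPlain = mean(x);          % 59, pulled far from the bulk of the data by the outlier
mTrim  = trimmean(x,20);   % k = 1: drops 9.7 and 500, giving 80.3/8 = 10.0375
```

The trimmed value stays close to the nine typical readings, while the plain mean lies nowhere near any of them.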
