Remove Outliers

Remove Outliers Interactively

To remove outliers in the Curve Fitter app, follow these steps:

  1. In the plot axes toolbar, click the Exclude outliers button.

    When you move the mouse cursor to the plot, it changes to a cross-hair to show that you are in outlier selection mode.

  2. Click a point that you want to exclude in the fit plot or residuals plot. Alternatively, click and drag to define a rectangle and remove all enclosed points.

    A removed point appears as a red cross in the plots. If Auto fitting is selected in the Fit section of the Curve Fitter tab, the Curve Fitter app refits the curve or surface without the point. If Manual fitting is selected, click Fit to refit.

  3. Repeat the process for all points you want to exclude.

When removing outliers from surface fits, it can be helpful to display a 2-D residuals plot for examining and removing outliers. With your plot cursor in rotation mode, right-click the plot to select Go to X-Y view, Go to X-Z view, or Go to Y-Z view.

To replace individual excluded points in the fit, click an excluded point again in outlier selection mode (that is, with the Exclude outliers button toggled on in the axes toolbar). To replace all excluded points in the fit, right-click and select Clear all exclusions.

In surface plots, to return to rotation mode, click the Exclude outliers button again to turn off outlier selection mode.

Exclude Data Ranges

To exclude sections of data by range in the Curve Fitter app, follow these steps:

  1. On the Curve Fitter tab, in the Data section, click Exclusion Rules.

  2. In the Exclusion Rules dialog box, specify data to exclude. Enter numbers in any of the boxes to define beginning or ending intervals to exclude in the X, Y, or Z data.

    The Curve Fitter app displays shaded pink areas on the plots to show excluded ranges. Excluded points become red.
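
A similar range-based exclusion can be sketched at the command line with the excludedata function, which accepts 'domain' and 'range' methods. The sample data, interval limits, and variable names below are hypothetical and chosen only for illustration.

```matlab
% Hypothetical sample data (not from the example on this page):
% y = 2*x everywhere except for one deliberately bad point at x = 5
x = (1:10)';
y = [2 4 6 8 50 12 14 16 18 20]';

% 'domain' marks points whose x value lies outside [xmin xmax]
outsideX = excludedata(x,y,'domain',[1.5 9.5]);

% 'range' marks points whose y value lies outside [ymin ymax]
outsideY = excludedata(x,y,'range',[0 25]);

% Combine both rules and pass the logical vector to fit
excluded = outsideX | outsideY;
f = fit(x,y,'poly1','Exclude',excluded);
```

Combining the logical vectors with | excludes a point when either rule applies, which mirrors entering limits for more than one variable in the dialog box.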

Remove Outliers Programmatically

This example shows how to remove outliers when curve fitting programmatically, using the 'Exclude' name/value pair argument with the fit or fitoptions functions. You can plot excluded data by supplying an Exclude or outliers argument with the plot function.

Exclude Data Using a Simple Rule

For a simple example, load data and fit a two-term Gaussian model (gauss2), using a logical expression to exclude the points where x is less than 800. Then plot the fit, the data, and the excluded points.

[x, y] = titanium;
f1 = fit(x',y','gauss2','Exclude',x<800);
plot(f1,x,y,x<800)

[Figure: plot with x and y axes showing the data, the excluded data, and the fitted curve.]
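
The same exclusion can also be stored in a fit options object created with fitoptions and then passed to fit. The following is a minimal sketch; the variable names opts and f1b are introduced here for illustration only.

```matlab
[x, y] = titanium;

% Store the exclusion rule in a fit options object for the gauss2 model
opts = fitoptions('gauss2','Exclude',x<800);

% Pass the options object to fit in place of the name-value argument
f1b = fit(x',y','gauss2',opts);
```

An options object can be convenient when the same exclusion rule is reused across several calls to fit.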

Exclude Data by Distance from the Model

It can be useful to exclude outliers by their distance from the model, measured in standard deviations. The following example identifies outliers as points that lie more than 1.5 standard deviations from the model, and compares the result with a robust fit, which gives lower weight to outliers.

Create a baseline sinusoidal signal:

xdata = (0:0.1:2*pi)'; 
y0 = sin(xdata);

Add noise to the signal with non-constant variance:

% Response-dependent Gaussian noise
gnoise = y0.*randn(size(y0));

% Salt-and-pepper noise
spnoise = zeros(size(y0)); 
p = randperm(length(y0));
sppoints = p(1:round(length(p)/5));
spnoise(sppoints) = 5*sign(y0(sppoints));

ydata = y0 + gnoise + spnoise;

Fit the noisy data with a baseline sinusoidal model:

f = fittype('a*sin(b*x)'); 
fit1 = fit(xdata,ydata,f,'StartPoint',[1 1]);

Identify outliers as points whose distance from the baseline model exceeds 1.5 times the standard deviation of ydata, and refit the data with the outliers excluded:

fdata = feval(fit1,xdata); 
I = abs(fdata - ydata) > 1.5*std(ydata); 
outliers = excludedata(xdata,ydata,'indices',I);

fit2 = fit(xdata,ydata,f,'StartPoint',[1 1],...
           'Exclude',outliers);
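
The threshold above is based on the standard deviation of ydata. As an alternative, the threshold can be based on the standard deviation of the residuals of the first fit. The following self-contained sketch illustrates that variant; the fixed random seed, the simplified noise, and the three artificial outliers are assumptions made for this illustration and are not part of the example above.

```matlab
rng(0)                                      % fixed seed (assumption, for repeatability)
xdata = (0:0.1:2*pi)';
ydata = sin(xdata) + 0.1*randn(size(xdata));
ydata([10 30 50]) = ydata([10 30 50]) + 5;  % three artificial outliers

f = fittype('a*sin(b*x)');
fit1 = fit(xdata,ydata,f,'StartPoint',[1 1]);

% Threshold on the spread of the residuals rather than of ydata
res = ydata - feval(fit1,xdata);
I = abs(res) > 1.5*std(res);
outliers = excludedata(xdata,ydata,'indices',I);

% Refit with the flagged points excluded
fit2 = fit(xdata,ydata,f,'StartPoint',[1 1],'Exclude',outliers);
```

Which threshold is more suitable depends on the data; the residual-based version measures spread around the fitted curve rather than around the mean of the response.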

Compare the effect of excluding the outliers with the effect of giving them lower bisquare weight in a robust fit:

fit3 = fit(xdata,ydata,f,'StartPoint',[1 1],'Robust','on');

Plot the data, the outliers, and the results of the fits:

plot(fit1,'r-',xdata,ydata,'k.',outliers,'m*') 
hold on
plot(fit2,'c--')
plot(fit3,'b:')
xlim([0 2*pi])

[Figure: plot with x and y axes showing the data, the excluded data, and the three fitted curves.]
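
In addition to the visual comparison, the fitted coefficients can be compared numerically with coeffvalues. The following self-contained sketch uses simplified synthetic data with a fixed random seed and three artificial outliers, all of which are assumptions for illustration; the names fitPlain, fitRobust, and coeffs are hypothetical.

```matlab
rng(0)                                      % fixed seed (assumption, for repeatability)
xdata = (0:0.1:2*pi)';
ydata = sin(xdata) + 0.1*randn(size(xdata));
ydata([10 30 50]) = ydata([10 30 50]) + 5;  % three artificial outliers

f = fittype('a*sin(b*x)');
fitPlain  = fit(xdata,ydata,f,'StartPoint',[1 1]);
fitRobust = fit(xdata,ydata,f,'StartPoint',[1 1],'Robust','on');

% Rows: plain fit, robust fit. Columns: coefficients a and b.
coeffs = [coeffvalues(fitPlain); coeffvalues(fitRobust)];
disp(coeffs)
```

Because the underlying signal in this sketch is sin(x), coefficient values closer to 1 indicate a fit that is less affected by the outliers.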
