Preprocessing Data

Data cleaning, smoothing, grouping

Data sets can require preprocessing to ensure accurate, efficient, or meaningful analysis. Data cleaning refers to methods for finding, removing, and replacing bad or missing data. Detecting local extrema and abrupt changes can help to identify significant data trends. Smoothing removes noise from data, detrending removes polynomial trends (linear by default), and scaling changes the bounds of the data. Grouping and binning methods identify relationships among the data variables.



ismissing - Find missing values
rmmissing - Remove missing entries
fillmissing - Fill missing values
missing - Create missing values
standardizeMissing - Insert standard missing values
isoutlier - Find outliers in data
filloutliers - Detect and replace outliers in data
rmoutliers - Detect and remove outliers in data
movmad - Moving median absolute deviation
ischange - Find abrupt changes in data
islocalmin - Find local minima
islocalmax - Find local maxima
smoothdata - Smooth noisy data
movmean - Moving mean
movmedian - Moving median
detrend - Remove polynomial trend
normalize - Normalize data
rescale - Scale range of array elements
discretize - Group data into bins or categories
groupcounts - Number of group elements
groupsummary - Group summary computations
grouptransform - Transform by group
histcounts - Histogram bin counts
histcounts2 - Bivariate histogram bin counts
findgroups - Find groups and return group numbers
splitapply - Split data into groups and apply function
rowfun - Apply function to table or timetable rows
varfun - Apply function to table or timetable variables
accumarray - Construct array with accumulation


Missing Data in MATLAB

Handle missing values in data sets.
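
As a minimal sketch of what handling missing values can look like, the snippet below applies ismissing, rmmissing, and fillmissing to a small vector. The sample data and variable names are made up for illustration, and linear interpolation is only one of several available fill methods.

```matlab
% Hypothetical sample vector with two missing (NaN) entries
A = [1 NaN 3 4 NaN 6];

tf = ismissing(A);              % logical mask marking the missing entries
B  = rmmissing(A);              % copy of A with the missing entries removed
C  = fillmissing(A, 'linear');  % fill gaps by linear interpolation of neighbors
```

Whether to remove or fill depends on the analysis: removing changes the length of the data, while filling keeps the length but introduces estimated values.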

Clean Messy and Missing Data in Tables

This example shows how to find, clean, and delete table rows with missing data.
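
The following sketch illustrates the same idea on a small table. The table contents and variable names are hypothetical; NaN marks a missing number and an empty character vector marks missing text in a cell array.

```matlab
% Hypothetical table with one missing number and one missing text entry
Value = [1; NaN; 3];
Label = {'a'; 'b'; ''};
T = table(Value, Label);

rowHasMissing = any(ismissing(T), 2);  % true for rows with any missing entry
Tclean = rmmissing(T);                 % keep only the complete rows
```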

Data Smoothing and Outlier Detection

Eliminate unwanted noise or behavior in data, and find, fill, and remove outliers.
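
A short sketch of these steps, using made-up data with a single spike, might look like the following. It relies on the default outlier detection (more than three scaled median absolute deviations from the median); the window length of 3 is an arbitrary choice for illustration.

```matlab
% Hypothetical data: a ramp with one spike at index 5
y = [1 2 3 4 50 6 7 8 9 10];

tf      = isoutlier(y);                   % flag outliers with the default method
yFilled = filloutliers(y, 'linear');      % replace outliers by linear interpolation
yClean  = rmoutliers(y);                  % or drop the outliers entirely
ySmooth = smoothdata(y, 'movmedian', 3);  % moving median over a 3-point window
```

A moving median is less sensitive to isolated spikes than a moving mean, which is why it is used here.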

Detrending Data

Remove linear trends from data.
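
As an illustration, the sketch below adds a hypothetical linear trend to a small fluctuating signal and then removes the best-fit line with detrend. The data are invented for the example.

```matlab
% Hypothetical signal: a small oscillation riding on a linear trend
t      = 0:9;
trend  = 2*t + 1;
signal = [0 1 0 -1 0 1 0 -1 0 1];
y      = signal + trend;

yDetrended = detrend(y);       % subtract the best-fit straight line
residual   = detrend(trend);   % a purely linear input leaves roughly zero
```

Because detrend subtracts a least-squares fit to the whole input, the result is close to, but not necessarily identical to, the original fluctuation.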

Grouping Variables To Split Data

You can use grouping variables to categorize data variables.

Split Data into Groups and Calculate Statistics

This example shows how to group data and apply statistics functions to each group.
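
One possible sketch of this workflow uses findgroups to assign group numbers and splitapply to compute a statistic per group. The group labels and measurements below are hypothetical.

```matlab
% Hypothetical measurements with a grouping variable
group  = ["a"; "b"; "a"; "b"; "a"];
height = [10; 20; 30; 40; 50];

[G, groupNames] = findgroups(group);        % G holds a group number for each row
meanHeight = splitapply(@mean, height, G);  % one mean per group
```

The rows of meanHeight follow the order of groupNames, so the two outputs can be read side by side.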

Split Table Data Variables and Apply Functions

This example shows how to group data variables and apply functions to each group.
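
For table data, a comparable sketch uses varfun with a grouping variable. The table and its variable names are assumptions made for illustration; varfun is one of several functions that could be used here.

```matlab
% Hypothetical table with one grouping variable and two data variables
Group = ["a"; "b"; "a"; "b"];
X = [1; 2; 3; 4];
Y = [10; 20; 30; 40];
T = table(Group, X, Y);

% Apply mean to each data variable within each group
S = varfun(@mean, T, 'GroupingVariables', 'Group');
```

The output table contains the grouping variable, a GroupCount variable, and one result variable per input variable, named with the function name as a prefix (mean_X and mean_Y here).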