Choose a Regression Function
Regression is the process of fitting models to data. The models must have numerical responses. For models with categorical responses, see Parametric Classification or Supervised Learning Workflow and Algorithms. The regression process depends on the model. If a model is parametric, regression estimates the parameters from the data. If a model is linear in the parameters, estimation is based on methods from linear algebra that minimize the norm of a residual vector. If a model is nonlinear in the parameters, estimation is based on search methods from optimization that minimize the norm of a residual vector.
This table describes which function to use depending on the type of regression problem.
| Model Components | Result of Regression | Function to Use |
|---|---|---|
| Continuous or categorical predictors, continuous response, linear model | Fitted model coefficients | fitlm. See Linear Regression. |
| Continuous or categorical predictors, continuous response, linear model of unknown complexity | Fitted model and fitted coefficients | stepwiselm. See Stepwise Regression. |
| Continuous or categorical predictors, response possibly with restrictions such as nonnegative or integer-valued, generalized linear model | Fitted generalized linear model coefficients | fitglm or stepwiseglm. See Generalized Linear Models. |
| Continuous predictors with a continuous nonlinear response, parametrized nonlinear model | Fitted nonlinear model coefficients | fitnlm. See Nonlinear Regression. |
| Continuous predictors, continuous response, linear model | Set of models from ridge, lasso, or elastic net regression | lasso or ridge. See Lasso and Elastic Net or Ridge Regression. |
| Correlated continuous predictors, continuous response, linear model | Fitted model and fitted coefficients | plsregress. See Partial Least Squares. |
| Continuous or categorical predictors, continuous response, unknown model | Nonparametric model | fitrtree or fitrensemble. |
| Categorical predictors only | ANOVA | anova, anova1, anova2, anovan. |
| Continuous predictors, multivariable response, linear model | Fitted multivariate regression model coefficients | mvregress |
| Continuous predictors, continuous response, mixed-effects model | Fitted mixed-effects model coefficients | nlmefit or nlmefitsa. See Mixed-Effects Models. |
Update Legacy Code with New Fitting Methods
Statistics and Machine Learning Toolbox™ provides several functions for performing regression. The following sections describe how to replace calls to older functions with calls to their newer counterparts:
regress into fitlm
Previous Syntax:
[b,bint,r,rint,stats] = regress(y,X)
where X contains a column of ones.
Current Syntax:
mdl = fitlm(X,y)
where you do not add a column of ones to X.
Equivalent values of the previous outputs:
- b: mdl.Coefficients.Estimate
- bint: coefCI(mdl)
- r: mdl.Residuals.Raw
- rint: There is no exact equivalent. Try examining mdl.Residuals.Studentized to find outliers.
- stats: mdl contains various properties that replace components of stats.
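The correspondence between the regress outputs and the LinearModel properties can be checked numerically. The following is a minimal sketch on synthetic data; the data, the seed, and the variable names coefDiff, ciDiff, and resDiff are illustrative assumptions, not part of either function's interface.

```matlab
% Synthetic data (illustrative only): y depends linearly on two predictors.
rng(0);                                   % fix the seed so the run is repeatable
X = randn(50,2);
y = 1 + 2*X(:,1) - 3*X(:,2) + 0.1*randn(50,1);

% Previous syntax: regress needs an explicit column of ones for the intercept.
[b,bint,r] = regress(y,[ones(50,1) X]);

% Current syntax: fitlm adds the intercept term itself.
mdl = fitlm(X,y);

% Largest absolute difference between each legacy output and its replacement.
coefDiff = max(abs(b - mdl.Coefficients.Estimate));
ciDiff   = max(abs(bint(:) - reshape(coefCI(mdl),[],1)));
resDiff  = max(abs(r - mdl.Residuals.Raw));
fprintf('max differences: %g (b), %g (bint), %g (r)\n',coefDiff,ciDiff,resDiff);
```

Both calls use 95% confidence intervals by default, so the differences should be at the level of floating-point round-off.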
regstats into fitlm
Previous Syntax:
stats = regstats(y,X,model,whichstats)
Current Syntax:
mdl = fitlm(X,y,model)
Obtain statistics from the properties and methods of the LinearModel object (mdl). For example, see the mdl.Diagnostics and mdl.Residuals properties.
robustfit into fitlm
Previous Syntax:
[b,stats] = robustfit(X,y,wfun,tune,const)
Current Syntax:
mdl = fitlm(X,y,'robust','on') % bisquare
Or to use the wfun weight and the tune tuning parameter:
opt.RobustWgtFun = 'wfun';
opt.Tune = tune; % optional
mdl = fitlm(X,y,'robust',opt)
Obtain statistics from the properties and methods of the LinearModel object (mdl). For example, see the mdl.Diagnostics and mdl.Residuals properties.
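As a rough illustration of the migration, the sketch below fits the same contaminated data with robustfit and with fitlm. The data and the injected outliers are made up for the example. The option is written out as 'RobustOpts', the documented full name, on the assumption that 'robust' in the syntax above abbreviates it.

```matlab
% Synthetic data (illustrative only) with a few gross outliers.
rng(1);
X = randn(60,1);
y = 2 + 3*X + 0.2*randn(60,1);
y(1:3) = y(1:3) + 10;                     % contaminate the first three responses

% Previous syntax: bisquare weights by default, intercept returned first.
bLegacy = robustfit(X,y);

% Current syntax: robust fit with the default (bisquare) weight function.
mdlRobust = fitlm(X,y,'RobustOpts','on');

% Ordinary least squares for comparison; the outliers pull its intercept upward.
mdlOls = fitlm(X,y);

disp([bLegacy mdlRobust.Coefficients.Estimate mdlOls.Coefficients.Estimate]);
```

The first two columns should agree closely with each other and with the generating values 2 and 3, while the least-squares column shows the effect of the outliers.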
stepwisefit into stepwiselm
Previous Syntax:
[b,se,pval,inmodel,stats,nextstep,history] = stepwisefit(X,y,Name,Value)
Current Syntax:
mdl = stepwiselm(ds,modelspec,Name,Value)
or
mdl = stepwiselm(X,y,modelspec,Name,Value)
Obtain statistics from the properties and methods of the LinearModel object (mdl). For example, see the mdl.Diagnostics and mdl.Residuals properties.
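A small sketch of the matrix form of the call follows. It assumes synthetic data in which only the first and third predictors matter, starts from a constant model, and allows terms up to a linear model. The data and the variable name inModel are illustrative.

```matlab
% Synthetic data (illustrative only): only x1 and x3 influence the response.
rng(4);
X = randn(100,4);
y = 1 + 2*X(:,1) - 1.5*X(:,3) + 0.3*randn(100,1);

% Start from an intercept-only model and let the search add linear terms.
% 'Verbose',0 suppresses the step-by-step printout.
mdl = stepwiselm(X,y,'constant','Upper','linear','Verbose',0);

% The selected terms are recorded in the fitted model itself.
disp(mdl.Formula);
inModel = mdl.VariablesInfo.InModel;      % one logical flag per variable, response last
```

The formula and the VariablesInfo table take the place of the inmodel output of stepwisefit. Predictors with no real effect can still enter by chance, so the selected set may include more than x1 and x3.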
glmfit into fitglm
Previous Syntax:
[b,dev,stats] = glmfit(X,y,distr,param1,val1,...)
Current Syntax:
mdl = fitglm(X,y,'Distribution',distr,...)
Obtain statistics from the properties and methods of the GeneralizedLinearModel object (mdl). For example, the deviance is mdl.Deviance, and to compare mdl against a constant model, use devianceTest(mdl).
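The sketch below compares the two calls on synthetic Poisson data. The data and the names coefDiff and devDiff are assumptions for the example. The distribution is passed to fitglm through the 'Distribution' name-value argument.

```matlab
% Synthetic count data (illustrative only) with a log-linear mean.
rng(2);
x  = randn(100,1);
mu = exp(0.5 + 0.8*x);
y  = poissrnd(mu);

% Previous syntax: the distribution is the third positional input.
[b,dev] = glmfit(x,y,'poisson');

% Current syntax: the distribution is a name-value argument.
mdl = fitglm(x,y,'Distribution','poisson');

% The coefficients and the deviance carry over directly.
coefDiff = max(abs(b - mdl.Coefficients.Estimate));
devDiff  = abs(dev - mdl.Deviance);

% Compare the fitted model against a constant (intercept-only) model.
devTable = devianceTest(mdl);
disp(devTable);
```

Both functions default to the log link for the Poisson distribution, so the fits should agree to within the tolerance of the iterative solver.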
nlinfit into fitnlm
Previous Syntax:
[beta,r,J,COVB,mse] = nlinfit(X,y,fun,beta0,options)
Current Syntax:
mdl = fitnlm(X,y,fun,beta0,'Options',options)
Equivalent values of the previous outputs:
- beta: mdl.Coefficients.Estimate
- r: mdl.Residuals.Raw
- COVB: mdl.CoefficientCovariance
- mse: mdl.MSE
mdl does not provide the Jacobian (J) output. The primary purpose of J was to pass it into nlparci or nlpredci to obtain confidence intervals for the estimated coefficients (parameters) or predictions. Obtain those confidence intervals as:
parci = coefCI(mdl)
[pred,predci] = predict(mdl)
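To see that the Jacobian is no longer needed, the following sketch fits a hypothetical exponential-decay model both ways and compares the confidence intervals. The model function, the data, and the starting values are assumptions made for the example.

```matlab
% Synthetic data (illustrative only) from a two-parameter exponential decay.
rng(3);
x     = linspace(0,5,40)';
fun   = @(b,x) b(1)*exp(-b(2)*x);         % hypothetical model function
y     = fun([4;0.7],x) + 0.05*randn(40,1);
beta0 = [1;1];                            % starting values for the search

% Previous syntax: keep the residuals and Jacobian for the interval functions.
[beta,r,J,COVB,mse] = nlinfit(x,y,fun,beta0);
ciLegacy = nlparci(beta,r,'Jacobian',J);

% Current syntax: the model object carries what the interval methods need.
mdl   = fitnlm(x,y,fun,beta0);
ciNew = coefCI(mdl);
[pred,predci] = predict(mdl,x);           % predictions and intervals at the fitted points

% Differences between the legacy outputs and their replacements.
betaDiff = max(abs(beta - mdl.Coefficients.Estimate));
ciDiff   = max(abs(ciLegacy(:) - ciNew(:)));
mseDiff  = abs(mse - mdl.MSE);
```

Here predict is called with the predictor values passed explicitly, which also makes it clear where the intervals are evaluated.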