You can use Regression Learner to train regression models including linear regression models, regression trees, Gaussian process regression models, support vector machines, and ensembles of regression trees. In addition to training models, you can explore your data, select features, specify validation schemes, and evaluate results. You can export a model to the workspace to use the model with new data or generate MATLAB® code to learn about programmatic regression.
Training a model in Regression Learner consists of two parts:
Validated Model: Training a model with a validation scheme. By default, the app protects against overfitting by applying cross-validation. Alternatively, you can choose holdout validation. The validated model is visible in the app.
Full Model: Training a model on full data without validation. The app trains this model simultaneously with the validated model. However, the model trained on full data is not visible in the app. When you choose a regression model to export to the workspace, Regression Learner exports the full model.
The app displays the results of the validated model. Diagnostic measures, such as model accuracy, and plots, such as the response plot or residuals plot, reflect the validated model results. You can automatically train one or more regression models, compare validation results, and choose the model that works best for your regression problem. When you choose a model to export to the workspace, Regression Learner exports the full model. Because Regression Learner creates a model object of the full model during training, you experience no lag time when you export the model. You can use the exported model to make predictions on new data.
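The split between the validated model and the full model can be illustrated outside the app. The following is a minimal sketch in plain Python, not MATLAB and not the app's implementation: the data is synthetic, and the helper names `fit_line` and `cross_validated_rmse` are hypothetical. It fits a straight line, estimates its error with 5-fold cross-validation, and separately fits the same model on all of the data.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cross_validated_rmse(xs, ys, k=5):
    """RMSE pooled over k folds; every point is predicted by a model
    that did not see that point during fitting."""
    n = len(xs)
    squared_error = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        a, b = fit_line(train_x, train_y)
        squared_error += sum((a + b * xs[i] - ys[i]) ** 2 for i in held_out)
    return math.sqrt(squared_error / n)


# Synthetic data (an assumption for illustration): y = 2 + 3x plus noise.
rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [2.0 + 3.0 * x + rng.gauss(0.0, 0.5) for x in xs]

# Analogous to the validated model: this score is what gets reported.
validation_rmse = cross_validated_rmse(xs, ys, k=5)

# Analogous to the full model: trained on all data, used for new predictions.
full_model = fit_line(xs, ys)
```

In this sketch the cross-validated score describes how well the modeling approach generalizes, while `full_model` is the object you would keep for predictions, which mirrors the roles described above.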
You can use Regression Learner to automatically train a selection of different regression models on your data.
Get started by automatically training multiple models simultaneously. You can quickly try a selection of models, and then explore promising models interactively.
If you already know what model type you want, then you can train individual models instead. See Manual Regression Model Training.
On the Apps tab, in the Machine Learning group, click Regression Learner.
Click New Session and select data from the workspace or from a file. Specify a response variable and variables to use as predictors. See Select Data and Validation for Regression Problem.
On the Regression Learner tab, in the Model Type section, click the arrow to expand the list of regression models. Select All Quick-To-Train. This option trains all the model presets that are fast to fit.
Click Train.
If you have Parallel Computing Toolbox™, the app trains models in parallel. See Parallel Regression Model Training.
A selection of model types appears in the History list. When the models finish training, the best (lowest) RMSE score is highlighted in a box.
Click models in the History list to explore results in the plots.
To try all the model presets available, click All, and then click Train.
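The idea behind the steps above, training several candidate models and highlighting the one with the lowest validation RMSE, can be sketched in plain Python. This is an illustration under stated assumptions rather than what the app does internally: the three toy model types, the synthetic data, and the holdout split are all hypothetical stand-ins.

```python
import math


def fit_constant(xs, ys):
    """Predict the mean response regardless of the input."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest_neighbor(xs, ys):
    """Predict the response of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def holdout_rmse(fit, xs, ys, holdout_every=4):
    """Fit on the retained points and score on every holdout_every-th point."""
    data = list(zip(xs, ys))
    train = [p for i, p in enumerate(data) if i % holdout_every != 0]
    held = [p for i, p in enumerate(data) if i % holdout_every == 0]
    predict = fit([x for x, _ in train], [y for _, y in train])
    return math.sqrt(sum((predict(x) - y) ** 2 for x, y in held) / len(held))


# Synthetic data (assumption): linear trend plus a small deterministic wiggle.
xs = [float(i) for i in range(40)]
ys = [1.0 + 0.5 * x + 0.2 * math.sin(1.7 * x) for x in xs]

candidates = {
    "constant": fit_constant,
    "linear": fit_linear,
    "nearest neighbor": fit_nearest_neighbor,
}
results = {name: holdout_rmse(fit, xs, ys) for name, fit in candidates.items()}
best_name = min(results, key=results.get)  # lowest validation RMSE wins
```

Here `results` plays the role of the scores listed next to each model, and `best_name` corresponds to the highlighted entry.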
To explore individual model types, you can train models one at a time or train a group of models of the same type.
Choose a model type. On the Regression Learner tab, in the Model Type section, click a model type. To see all available model options, click the arrow in the Model Type section to expand the list of regression models. The options in the gallery are preset starting points with different settings, suitable for a range of different regression problems.
To read descriptions of the models, switch to the details view or hover the mouse over a button to display its tooltip.
For more information on each option, see Choose Regression Model Options.
After selecting a model, click Train.
Repeat to explore different models.
Select regression trees first. If your trained models do not predict the response accurately enough, then try other models with higher flexibility. To avoid overfitting, look for a less flexible model that provides sufficient accuracy.
If you want to try all model types or train a group of the same type, then select one of the All options in the gallery.
For next steps, see Compare and Improve Regression Models.
You can train models in parallel using Regression Learner if you have Parallel Computing Toolbox. When you train models, the app automatically starts a parallel pool of workers, unless you turn off the default parallel preference Automatically create a parallel pool. If a pool is already open, the app uses it for training. Parallel training allows you to train multiple models simultaneously and continue working.
The first time you click Train, you see a dialog box while the app opens a parallel pool of workers. After the pool opens, you can train multiple models at once.
When models are training in parallel, you see progress indicators on each training and queued model in the History list. If you want, you can cancel individual models. During training, you can examine results and plots from models, and initiate training of more models.
To control parallel training, toggle the Use Parallel button on the app toolstrip. (The Use Parallel button is only available if you have Parallel Computing Toolbox.)
If you have Parallel Computing Toolbox, then parallel training is available in Regression Learner, and you do not need to set the UseParallel option of the statset function. If you turn off the parallel preference Automatically create a parallel pool, then the app does not start a pool for you without asking first.
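As a loose analogy for training several models at once, the following plain-Python sketch submits independent training jobs to a worker pool and collects each result as it finishes. It does not use or emulate Parallel Computing Toolbox. The preset names, the k-nearest-neighbor stand-in model, and the synthetic data are assumptions for illustration, and a thread pool is used only to keep the sketch short.

```python
import math
from concurrent.futures import ThreadPoolExecutor, as_completed


def knn_holdout_rmse(k, train, held):
    """Holdout RMSE of a k-nearest-neighbor regressor (toy stand-in model)."""
    squared_error = 0.0
    for x, y in held:
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        prediction = sum(py for _, py in nearest) / k
        squared_error += (prediction - y) ** 2
    return math.sqrt(squared_error / len(held))


# Synthetic data (assumption): a smooth curve sampled at 60 points.
data = [(float(i), math.sin(i / 5.0)) for i in range(60)]
held = data[::5]
train = [p for i, p in enumerate(data) if i % 5 != 0]

# Hypothetical presets, each an independent training job.
presets = {"fine (k=1)": 1, "medium (k=3)": 3, "coarse (k=15)": 15}

results = {}
with ThreadPoolExecutor(max_workers=3) as pool:
    # Queue every job up front; each future tracks one model.
    futures = {
        pool.submit(knn_holdout_rmse, k, train, held): name
        for name, k in presets.items()
    }
    # Results arrive in completion order, not submission order.
    for future in as_completed(futures):
        results[futures[future]] = future.result()
```

Each future can also be cancelled before it starts with `future.cancel()`, which is loosely comparable to cancelling an individual queued model.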
Click models in the History list to explore the results in the plots. Compare model performance by inspecting results in the plots. Examine the RMSE score reported in the History list for each model. See Assess Model Performance in Regression Learner.
Select the best model in the History list and then try including and excluding different features in the model. Click Feature Selection.
Try the response plot to help you identify features to remove. See if you can improve the model by removing features with low predictive power. Specify predictors to include in the model, and train new models using the new options. Compare results among the models in the History list.
You also can try transforming features with PCA to reduce dimensionality.
Improve the model further by changing model parameter settings in the Advanced dialog box. Then, train using the new options. To learn how to control model flexibility, see Choose Regression Model Options.
If feature selection, PCA, or new parameter settings improve your model, try training All model types with the new settings. See if another model type does better with the new settings.
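The feature selection step above, retraining with different predictor subsets and comparing validation results, can be sketched in plain Python. This is a minimal illustration with assumptions throughout: the data is synthetic, the two feature names are hypothetical, and a 1-nearest-neighbor regressor with leave-one-out validation stands in for whichever model you selected in the app.

```python
import math


def loo_rmse(rows, responses, feature_idx):
    """Leave-one-out RMSE of 1-nearest-neighbor regression that uses
    only the feature columns listed in feature_idx."""
    squared_error = 0.0
    for i, row in enumerate(rows):
        def distance(j):
            return sum((rows[j][f] - row[f]) ** 2 for f in feature_idx)

        # Closest other observation, measured on the selected features only.
        nearest = min((j for j in range(len(rows)) if j != i), key=distance)
        squared_error += (responses[nearest] - responses[i]) ** 2
    return math.sqrt(squared_error / len(rows))


# Synthetic data (assumption): column 0 drives the response,
# column 1 is unrelated to it and has a larger scale.
rows = [(float(i), float(3 * ((7 * i) % 11))) for i in range(40)]
responses = [2.0 * informative for informative, _ in rows]

subsets = {
    "informative only": [0],
    "irrelevant only": [1],
    "both": [0, 1],
}
scores = {name: loo_rmse(rows, responses, cols) for name, cols in subsets.items()}
best_subset = min(scores, key=scores.get)
```

In this constructed case, dropping the unrelated column lowers the validation RMSE because that column no longer distorts the distance calculation. Real data is rarely this clear-cut, so compare the scores for each subset rather than assuming that fewer features is always better.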
To avoid overfitting, look for a less flexible model that provides sufficient accuracy. For example, look for simple models, such as regression trees, that are fast and easy to interpret. If your models are not accurate enough, then try other models with higher flexibility, such as ensembles. To learn about model flexibility, see Choose Regression Model Options.
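The following plain-Python sketch shows why validation results, rather than training error, are the basis for judging flexibility. It is a toy comparison under stated assumptions: the data is synthetic, a straight line stands in for a less flexible model, and a 1-nearest-neighbor regressor stands in for a highly flexible one. Neither is meant to represent a specific model type in the app.

```python
import math


def fit_linear(xs, ys):
    """Less flexible model: least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest_neighbor(xs, ys):
    """Highly flexible model: memorizes every training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def rmse(predict, xs, ys):
    return math.sqrt(sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Synthetic data (assumption): weak trend plus a wiggle that acts as noise.
xs = [float(i) for i in range(40)]
ys = [3.0 + 0.1 * x + 0.5 * math.sin(1.7 * x) for x in xs]

# Holdout split: every fourth point is reserved for validation.
val_idx = set(range(0, 40, 4))
train_x = [x for i, x in enumerate(xs) if i not in val_idx]
train_y = [y for i, y in enumerate(ys) if i not in val_idx]
val_x = [x for i, x in enumerate(xs) if i in val_idx]
val_y = [y for i, y in enumerate(ys) if i in val_idx]

report = {}  # name -> (training RMSE, validation RMSE)
for name, fit in (("simple", fit_linear), ("flexible", fit_nearest_neighbor)):
    predict = fit(train_x, train_y)
    report[name] = (rmse(predict, train_x, train_y), rmse(predict, val_x, val_y))
```

The flexible model reproduces its training data exactly, so its training RMSE is zero, yet on the held-out points it does worse than the straight line. That gap between training and validation error is the signature of overfitting.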
This figure shows the app with a History list containing various regression model types.
For a step-by-step example comparing different regression models, see Train Regression Trees Using Regression Learner App.
Next, you can generate code to train the model with different data or export trained models to the workspace to make predictions using new data. See Export Regression Model to Predict New Data.