Automated Machine Learning (AutoML) with MATLAB
Get an overview of AutoML and how it simplifies the machine learning workflow. Learn how to build optimized predictive models in three steps:
- Apply wavelet scattering to obtain features from signal or image data without requiring signal processing expertise.
- Use automated feature selection to identify a small subset of features, which helps prevent overfitting, reduces model size, and speeds up training.
- Choose automated model selection to deliver an optimized model in one step with built-in hyperparameter optimization.
The video demonstrates how to apply AutoML to build a classifier of human activity and use sensor data from mobile devices as input.
AutoML in MATLAB delivers optimized models in just three steps. First, you convert raw sensor data to the features machine learning needs as input using wavelength scattering. Next, automated feature selection allows you to reduce large feature sets and thus ultimate model size. And finally, automated model selection picks the best model for you and optimizes its hyperparameters in the same step. You can apply these steps without machine learning expertise. However, this video uses some technical terms to explain what's happening behind the scenes.
We demonstrate autoML building a model to classify activities, such as standing or sitting, using accelerometer meta data from a mobile phone. The remainder of the video walks you through the steps of autoML in MATLAB. First, to obtain relevant features from signal or image data, you can decompose signals with wavelengths using predefined wavelet and scaling filters that feature a Matrix function, applies wavelet scattering to buffers of signal.
Next, we could proceed to train a model on the over 400 feature we obtained, but that may result in a large model that does not fit on an embedded device. So a second step in auoML, we apply feature selection to reduce the number of features. The table here helps you choose the appropriate method based on the characteristics of your data.
Here we apply the minimum redundancy maximum relevance algorithm, which works really well on continuous and categorical features for classification. The feature ranking charts suggest that as few as a dozen features capture the majority of the variability in the signal.
After selecting a small performance set of features, we can proceed to the third step-- identifying the best performing model. Use fitcauto for classification and fit [? R ?] auto for regression. We train on just the dozen features we selected in the previous step. The algorithm evaluates many combinations of models and hyperparameters seeking to minimize the error. In practice, this may take hundreds of iterations to fully converge. Though for data of moderate size, we see good results in much less time, around 100 iterations. You can speed up execution using parallelization on multiple cores, on your local computer, or using cloud instances.
Applying autoML to this data set, we obtain a model with 96.6% accuracy on held-out test data. In summary, autoML obtained a highly accurate model in a few steps, effectively allowing engineers to build the models themselves without having to rely on data scientists. If you are knowledgeable in machine learning, autoML saves you time on routine steps, allowing you to focus on advanced optimization techniques such as stacking models and engineering even better features. For more information, visit our autoML discovery page or the links below.
You can also select a web site from the following list:
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.