How can I reduce the size of a machine learning model from the Classification Learner app so that my code can use it?

I have developed a module, part of which calls the predict function on an ML model that was generated and saved from the Classification Learner app. The problem is the large size of the model, given the memory constraints of the hardware. I have two questions:
  1. Is there a way to code the prediction model myself using the fit functions (for example, by taking the trained model's specifications and coding them directly)?
  2. If not, how can I optimize and reduce the size of the trained model?
  2 Comments
Asvin Kumar on 24 Jun 2021
Is there an example you can share of the model or the workflow (MATLAB example) that you are following?
Sharing more details might help get more responses from the community.
Oindri Mazumdar on 24 Jun 2021
I am training the ML models in the Classification Learner app, using a dataset with tens of thousands of entries and 32 features.
The best-performing model is then exported from the app into the workspace as a compact model.
The model is then saved in the algorithm's repository, where it is called to predict the class of new test data.
I regret that I cannot upload the exact model, and uploading a sample model won't do justice to the question about model size.
For example, when the training data had around 11-12k entries (for 2 classes), the Ensemble Bagged Tree model was ~400 kB. It grew to around 13 MB when the number of entries became ~27k (for 5 classes). How can I optimize the ML model?
Is there any way to optimize the derived model function handle?


Answers (1)

Aditya Patil on 12 Jul 2021
The size of a model depends on the number of parameters required to define it. By their nature, ensembles in general, and forests in particular, require a lot of parameters.
There are two workarounds:
  1. Use another type of model that is defined by far fewer parameters, such as an SVM.
  2. If you want to keep using ensembles and forests, reduce the number of trees and the number of leaves per tree. This will, however, come at the cost of accuracy.
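As a rough illustration of the two workarounds, here is a minimal sketch. It assumes synthetic data in place of the real training set, and the tree count (15) and split limit (20) are arbitrary illustrative values, not recommendations. `MaxNumSplits` caps the number of splits per tree, which in turn bounds the number of leaves. The variable names are hypothetical.

```matlab
% Sketch only: synthetic data stands in for the real training set (assumption).
rng(0);                                   % make the example reproducible
n = 2000;  p = 32;                        % 32 features, as in the question
X = randn(n, p);
score = X(:,1) + 0.5*X(:,2) - X(:,3) + 0.3*randn(n, 1);
Y = discretize(score, [-Inf -1 1 Inf]);   % three classes labelled 1, 2, 3

cv  = cvpartition(Y, 'HoldOut', 0.3);     % hold out 30% for testing
Xtr = X(training(cv), :);   Ytr = Y(training(cv));
Xte = X(test(cv), :);       Yte = Y(test(cv));

% Baseline: 100 bagged trees grown to the default depth.
bigMdl = compact(fitcensemble(Xtr, Ytr, 'Method', 'Bag', ...
    'NumLearningCycles', 100));

% Workaround 2: fewer trees, each limited to 20 splits (at most 21 leaves).
shallowTree = templateTree('MaxNumSplits', 20);
smallMdl = compact(fitcensemble(Xtr, Ytr, 'Method', 'Bag', ...
    'NumLearningCycles', 15, 'Learners', shallowTree));

% Workaround 1: a multiclass model built from linear SVM binary learners.
svmMdl = compact(fitcecoc(Xtr, Ytr, ...
    'Learners', templateSVM('KernelFunction', 'linear')));

% Report in-memory size and hold-out misclassification rate for each model.
models = {bigMdl, smallMdl, svmMdl};
names  = {'bigMdl', 'smallMdl', 'svmMdl'};
for k = 1:numel(models)
    info = whos(names{k});
    fprintf('%-9s %10d bytes, test error %.3f\n', ...
        names{k}, info.bytes, loss(models{k}, Xte, Yte));
end
```

On real data, the size and error figures printed here are what you would compare to decide how far the trees can be cut back before the accuracy loss becomes unacceptable.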
You should continue to use compact models regardless of which workaround you choose. I would not recommend implementing the model's prediction code yourself, as that is unlikely to give any significant improvement over the model.
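To see what the compact form saves, the sketch below trains a small bagged ensemble on synthetic data (an assumption standing in for the real data set), compares the in-memory size of the full and compact objects, and then saves, reloads, and predicts with the compact one. The file and variable names are hypothetical.

```matlab
% Sketch only: synthetic two-class data stands in for the real data (assumption).
rng(1);                                   % make the example reproducible
X = randn(1000, 32);                      % 32 features, as in the question
Y = double(X(:,1) + X(:,2) > 0);          % class labels 0 and 1

% The full model keeps the training data; compact() returns a copy without it.
fullMdl = fitcensemble(X, Y, 'Method', 'Bag', 'NumLearningCycles', 30);
cMdl    = compact(fullMdl);

fullInfo = whos('fullMdl');
cInfo    = whos('cMdl');
fprintf('full: %d bytes, compact: %d bytes\n', fullInfo.bytes, cInfo.bytes);

% Save only the compact model, then reload it and predict on new rows.
modelFile = [tempname '.mat'];            % temporary file for this example
save(modelFile, 'cMdl');
S = load(modelFile);
labels = predict(S.cMdl, X(1:5, :));      % the first 5 rows act as "new" data
delete(modelFile);                        % clean up the temporary file
```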

Release

R2019b
