Reducing the number of features

Hello everyone.
I have 300 samples and each sample has 5000 features, so my matrix has 300 rows and 5000 columns.
I need to reduce the number of columns by keeping only the best features.
How can I determine the best features? What options can I try in order to get a good classification accuracy?
Thanks in advance.

Answers (2)

Cris LaPierre on 20 Mar 2021
Edited: Cris LaPierre on 20 Mar 2021
You may find this video from our Data Processing and Feature Engineering course on Coursera (and perhaps the entire course) helpful.
Image Analyst on 20 Mar 2021
The simplest way might be to use PCA. A demo is attached (for a completely different situation though).

5 Comments

Thanks for your answer, but I have a question about applying PCA.
I am trying to perform principal component analysis using the pca() function.
When I run pca(), it returns a matrix coeff of size 5000-by-299.
As far as I know, coeff should be a square matrix.
How can I solve this problem and reduce the number of columns?
You need more observations than variables (features) to get a full square coefficient matrix. With 300 observations and 5000 variables, the centered data has rank at most 299, so pca() can only determine 299 components.
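As a side note, here is a minimal sketch of what pca() returns for a matrix of this shape and how the scores could serve as a reduced feature set. It assumes the Statistics and Machine Learning Toolbox; the random matrix X and the value k = 20 are placeholders, not values from this thread.

```matlab
% Minimal sketch with placeholder data: 300 observations, 5000 features.
rng(0);                          % reproducible random numbers
X = randn(300, 5000);            % stand-in for the real feature matrix

% pca() centers X by default. The centered data has rank at most 299,
% so at most 299 components are returned.
[coeff, score, ~, ~, explained] = pca(X);

disp(size(coeff))                % 5000-by-299: one column per component
disp(size(score))                % 300-by-299: the data in component space

% Keep the first k components as the reduced feature matrix.
k = 20;                          % hypothetical choice; tune it, e.g. from cumsum(explained)
Xreduced = score(:, 1:k);        % 300-by-k matrix that could be passed to a classifier
```

The columns of score are ordered by decreasing explained variance, so taking the first k columns keeps the directions with the most variance. Note that PCA ignores the class labels, so the leading components are not guaranteed to be the most discriminative ones.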
Serra Aksoy on 28 Mar 2021
Edited: Serra Aksoy on 28 Mar 2021
In this case, it seems I cannot use pca().
Are there any other methods you can suggest for dimension reduction when the number of samples is less than the number of features?
Maybe stepwise regression?
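A rough sketch of what that could look like with stepwisefit from the Statistics and Machine Learning Toolbox. The synthetic data, the 50 candidate features, and the numeric response y are assumptions for illustration only. stepwisefit treats y as a continuous response, so class labels would have to be coded numerically, and running it on all 5000 columns with only 300 samples may be slow and prone to overfitting.

```matlab
% Minimal sketch on synthetic data (placeholder, not the data from this thread).
rng(0);                              % reproducible random numbers
n = 300; p = 50;                     % few candidate features so the search stays fast
X = randn(n, p);
y = 2*X(:,3) - 3*X(:,7) + 0.5*randn(n, 1);   % response depends on features 3 and 7 only

% Stepwise regression: terms are added or removed based on their p-values.
[~, ~, ~, inmodel] = stepwisefit(X, y, 'Display', 'off');

selected = find(inmodel);            % indices of the features kept in the final model
Xreduced = X(:, selected);           % reduced matrix with only the selected columns
disp(selected)
```

For a classification problem, a wrapper such as sequentialfs with a classifier-based criterion might be a closer fit, since it scores feature subsets by the classification error directly.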
Thank you, I'll try to apply it.

Asked: 20 Mar 2021
Commented: 29 Mar 2021