Why is my coeff variable blank when I run PCA?

Kelly Thompson on 11 May 2019
Answered: Paras Gupta on 19 Jul 2024 at 9:06
I would like to run PCA on a dataset with 23 samples and 813 features (n x p = 23 x 813). However, when I call the pca function, the coeff variable it returns is empty. Why is this the case? I am aware that my ratio of sample size to feature size is very low, but the function returned an 813 x 22 coefficient matrix when I ran it on the data before preprocessing (transforming non-normally distributed features and feature scaling), so I know the problem is not the size of the data.
I have attached a copy of the data I am trying to use if anyone is able to help; thanks!
[coeff, ~, latent] = pca(deltaEBC);
UPDATE: I was feature scaling by taking each feature X, subtracting the mean of X, and dividing by the standard deviation of X. I changed this to subtracting the mean only, without dividing by the standard deviation, and now the pca function returns the 813 x 22 coeff I was looking for. I just don't know why.

Answers (1)

Paras Gupta on 19 Jul 2024 at 9:06
Hi Kelly,
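I understand that the "pca" function in MATLAB returned an empty "coeff" variable for your data.
This happens because your dataset contains NaN columns, that is, columns in which every row is NaN. Given the scaling described in your update, a likely source is a feature that is constant across all samples: its standard deviation is 0, so dividing the centered values by it evaluates 0/0, which is NaN. That would also explain why subtracting the mean without dividing by the standard deviation works.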
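As a rough illustration of how such columns can arise and how to locate them, here is a minimal sketch. The matrix X is hypothetical toy data, not the attached dataset, and it assumes the scaling was done as (x - mean) / std with implicit expansion (R2016b or later).

```matlab
% Hypothetical toy data standing in for the raw features; column 3 is constant.
X = [1 4 7;
     2 5 7;
     3 6 7;
     5 9 7];

% Feature scaling as described in the question: subtract the mean, divide by the std.
% For the constant column this evaluates 0/0, which is NaN in every row.
Z = (X - mean(X)) ./ std(X);

% Indices of the columns in which every entry is NaN.
allNanCols = find(all(isnan(Z), 1));
disp(allNanCols)
```

Running the same "all(isnan(...), 1)" check on the scaled version of "deltaEBC" should show which features are affected.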
By default, the "pca" function uses the 'Rows','complete' name-value pair argument, which removes every row that contains a NaN value before the calculation. Because an all-NaN column puts a NaN in every row, all 23 rows are removed and nothing is left to compute "coeff" from. Please refer to the "pca" documentation for more information.
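To see why a single all-NaN column is enough to discard the whole dataset, the row filter can be reproduced by hand. This is only a sketch of the idea on a hypothetical matrix Z; it does not call "pca" itself, it just counts the rows that contain no NaN.

```matlab
% Hypothetical 4x3 matrix; column 2 is entirely NaN.
Z = [ 0.5 NaN  1.2;
     -1.0 NaN  0.2;
      0.5 NaN -1.1;
      0.0 NaN -0.3];

% Rows without any NaN, i.e. the rows a "complete rows" filter would keep.
completeRows = ~any(isnan(Z), 2);
numUsable = nnz(completeRows);

fprintf('%d of %d rows are complete\n', numUsable, size(Z, 1));
```

Every row has a NaN in column 2, so no row survives the filter.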
You may need to identify the NaN columns and remove them. You can refer to the documentation on the "rmmissing" function - https://www.mathworks.com/help/matlab/ref/rmmissing.html
deltaEBC_new = rmmissing(deltaEBC, 2);
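Note that "rmmissing" with dimension 2 drops every column that has at least one NaN. If some features have only a few missing entries and should be kept, a variant that removes only the entirely-NaN columns may be preferable. The sketch below compares the two on a hypothetical matrix Z (the variable names are illustrative):

```matlab
% Hypothetical data: column 2 is all NaN, column 3 has a single NaN.
Z = [ 0.5 NaN NaN;
     -1.0 NaN 0.2;
      0.5 NaN 1.1];

% Removes every column containing at least one NaN (columns 2 and 3).
dropAny = rmmissing(Z, 2);

% Removes only the columns in which every entry is NaN (column 2).
dropAllNan = Z(:, ~all(isnan(Z), 1));

disp(size(dropAny))
disp(size(dropAllNan))
```

With the second variant, the remaining NaN in column 3 would still cause its row to be dropped under the default 'Rows','complete' behavior, so the choice depends on how much data you can afford to lose in each direction.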
However, if you do not want to remove missing entries from your data, you can use the alternating least squares (ALS) algorithm for PCA in MATLAB, which handles missing values better. You can refer to the following link to select this algorithm - https://www.mathworks.com/help/stats/pca.html#bth9ibe-Algorithm
[coeff, ~, latent] = pca(deltaEBC, Algorithm="als");
Hope this solves the issue.
