help needed with fraction of unexplained variance
Hello,
I have a data set that has 100000 observations and 200 variables. I have used PCA and luckily only 5 dimensions capture ~95% of the variance. I got this quantity from the eigenvalues of the covariance matrix.
Just to check that everything makes sense I have also calculated the fraction of unexplained variance as defined here: http://en.wikipedia.org/wiki/Fraction_of_variance_unexplained
To my surprise, I got results that are inconsistent with what I obtained from the eigenvalues of the covariance matrix. My question is: how do the two quantities relate to each other? I also want to test some other dimension-reduction techniques on the data and would like to make an apples-to-apples comparison among them in terms of the percentage of the original variance retained.
Any help would be very much appreciated!
Best regards, Blaise
Answers (1)
Aditya
on 4 Feb 2025
Hi Blaise,
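When you perform Principal Component Analysis (PCA), you typically assess how much of the total variance in your data is captured by the principal components. This is done by examining the eigenvalues of the covariance matrix, where each eigenvalue is the variance captured by the corresponding principal component. The fraction of variance unexplained should equal one minus that captured fraction, provided both calculations use the same total variance, namely the sum of the per-variable variances of the mean-centered data.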
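To make that relation explicit, here is a short derivation sketch. The notation is my own and not from the question: X is the n-by-p data matrix, μ the vector of column means, λ_i the eigenvalues of the sample covariance matrix in decreasing order, k the number of components kept, and X̂_k the rank-k PCA reconstruction with the mean added back.

```latex
\mathrm{FVU}_k
  = \frac{\lVert X - \hat{X}_k \rVert_F^2}{\lVert X - \mathbf{1}\mu^{\top} \rVert_F^2}
  = \frac{(n-1)\sum_{i=k+1}^{p} \lambda_i}{(n-1)\sum_{i=1}^{p} \lambda_i}
  = 1 - \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{p} \lambda_i}
```

The middle step holds because the residual of a rank-k PCA reconstruction lies entirely in the discarded components, whose score variances are the remaining eigenvalues. If the mean is left out of the reconstruction, or the total variance is computed over all matrix entries pooled together, the identity no longer holds, and that may be where the inconsistency comes from.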
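Here are the steps to verify the calculations:
% Assuming 'data' is your dataset (100000 x 200)
% pca centers the data internally; mu returns the column means it removed
[coeff, score, latent, ~, ~, mu] = pca(data);
% Calculate the cumulative explained variance
explainedVariance = cumsum(latent) / sum(latent);
% Check how much variance is captured by the first 5 components
varianceCaptured = explainedVariance(5);
% Reconstruct data using the first 5 principal components,
% adding the mean back (implicit expansion, R2016b or later)
reconstructedData = score(:, 1:5) * coeff(:, 1:5)' + mu;
% Calculate residuals
residuals = data - reconstructedData;
% Calculate FVU from per-variable variances summed over all variables.
% var(data(:)) would pool every entry and also count differences
% between column means, which the eigenvalues do not include.
varianceOriginal = sum(var(data));
varianceResiduals = sum(var(residuals));
fvu = varianceResiduals / varianceOriginal;
% Display results; the two percentages should add up to 100
fprintf('Variance captured by first 5 components: %.2f%%\n', varianceCaptured * 100);
fprintf('Fraction of Variance Unexplained (FVU): %.2f%%\n', fvu * 100);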
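For the apples-to-apples comparison mentioned in the question, one option is to compute the FVU from the reconstruction alone, so that any method able to map its low-dimensional output back to the original space can be scored the same way. The snippet below is only a sketch under assumptions: fvuOf is a hypothetical helper name (not a built-in), the data is a small synthetic stand-in, and pca requires the Statistics and Machine Learning Toolbox.

```matlab
% Method-agnostic FVU: squared reconstruction error divided by the total
% squared deviation from the column means.
% fvuOf is a hypothetical helper name; needs R2016b+ for implicit expansion.
fvuOf = @(X, Xhat) norm(X - Xhat, 'fro')^2 / norm(X - mean(X, 1), 'fro')^2;

% Small synthetic stand-in for the real data (assumption, for illustration)
rng(0);
X = randn(500, 10) * randn(10, 10) + 5;

% PCA reconstruction with k components, with the mean added back
k = 3;
[coeff, score, latent, ~, ~, mu] = pca(X);
Xhat = score(:, 1:k) * coeff(:, 1:k)' + mu;

% The two routes to the FVU should agree up to rounding error
fvuFromReconstruction = fvuOf(X, Xhat);
fvuFromEigenvalues = 1 - sum(latent(1:k)) / sum(latent);
fprintf('FVU from reconstruction: %.6f\n', fvuFromReconstruction);
fprintf('FVU from eigenvalues:    %.6f\n', fvuFromEigenvalues);
```

For another technique, pass its own reconstruction in the original 200-dimensional space as Xhat. Methods that provide no reconstruction cannot be scored this way.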