computing SVD of very large matrix

20 views (last 30 days)
Saurabh
Saurabh on 25 Mar 2012
Commented: Fernando Zigunov on 26 May 2021
Hi, I have a very large matrix 1.1 million rows and 1100 columns. I want to compute SVD of this matrix. It is not possible to load the entire matrix into memory. So how do I do this ?
  3 Comments
mohammad master
mohammad master on 21 Jul 2017
Hi, I have your problem too. I want to reduce dimension of my data which is a 350000 * 800000 matrix. Each row is a sample. The matrix itself does not fit to memory and I don't know how to do PCA or SVD on my data. Can any body help please?
John D'Errico
John D'Errico on 21 Jul 2017
Sorry. It matters not how big and fast computers get. There will always be somebody who wants to compute the SVD of a matrix orders of magnitude larger than anything they can compute. (Other computations too are common here.) It is trivial to create immensely huge problems to solve. That does not make them solvable.
In Mohamed's case, the matrix requires something like
350000 * 800000*8/1024/1024/1024/1024
ans =
2.03726813197136
So 2 terabytes of memory, just to store the matrix. Computing the SVD will take more.
Just wanting to solve a huge problem does not make it solvable. Your computer does not have infinite capability. Yes, the problem can be solved. You just need to find a seriously large computer. It would help if you work for the NSA, or some similar organization. Find someone who won't care if you use their hardware for your personal research tasks. Or you can always pay for time, to Amazon or Microsoft.

Sign in to comment.

Answers (2)

Christine Tobler
Christine Tobler on 21 Jul 2017
The simplest way may be to use a computer with more memory - a matrix of size 1100000 x 1100 takes about 9GB of memory, which is doable nowadays.
If this is not an option, since the matrix has many more rows than columns, you may be able to read the matrix into memory one block at a time, compute the QR decomposition of each individual block, and then do the SVD of the combination of these. Do you need the singular vectors too, or only the singular values?
Here's an example: Let's say you have matrices A1 and A2 (which both have many more rows than columns), and you want to compute the SVD of [A1; A2].
% Load A1
[Q1, R1] = qr(A1, 0);
% Discard Q1 and A1
% Load A2
[Q2, R2] = qr(A2, 0);
% Discard Q2 and A2
% Just singular values
s = svd([R1; R2]);
% Right singular vectors too
[~, S, V] = svd([R1; R2], 'econ');
% Left singular vectors of A would have same size as A, so they do not fit in memory
I'm afraid this trick would not work for mohammad master's case, since that matrix has both many rows and many columns. Since that matrix would take about 2 terabytes of memory, it is also too large to fit into memory of any reasonably available machine, right now.
  1 Comment
Fernando Zigunov
Fernando Zigunov on 26 May 2021
Christine, I'm not sure this is correct. I tried implementing your code and it does not match against the ground truth. Try the code below:
clear; clc; close all;
A=rand(300,20); %"Large" matrix to perform SVD on
A1=A(:,1:10); %Splits large matrix into two matrices
A2=A(:,11:20);
[Q1, R1]=qr(A1,0); %Performs QR on each matrix
[Q2, R2]=qr(A2,0);
R=[R1; R2]; %Joins the R matrices
[~, s, v] = svd(R, 'econ'); %SVD of the R matrix only
[U, S, V]=svd(A, 'econ'); %full SVD of the "large matrix"
%Compare eigenvalues
figure;
plot(diag(s),'k-'); hold; plot(diag(S),'b-')
%Compare eigenvectors
figure;
subplot(1,2,1); imagesc(v);
subplot(1,2,2); imagesc(V(1:10,1:10));
Neither the eigenvalues, nor the right eigenvectors are matching against the ground truth.

Sign in to comment.


Tom Wenseleers
Tom Wenseleers on 25 Jun 2018
In R I think it's possible to calculate the truncated SVD from a very large matrix stored out of memory (ie stored as a disk backed file) using the irlba and the bigmemory and bigalgebra packages, see https://stackoverflow.com/questions/17352771/svd-of-very-large-matrix-in-r-program Not sure it will run in reasonable time for such huge problems though...

Categories

Find more on Linear Algebra in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!