Why doesn't the sparse command accept uintx as inputs for the location vectors in the long form of the command?

When I use the sparse command in the S = sparse(i,j,s,m,n) form with vectors of type uintx for i and j, I get an error saying that the type isn't supported. But since those are indexes, wouldn't the sparse command have to convert them to an integer format to use them anyway? To be clear, I am not trying to specify the contents, i.e. the type of the sparse matrix, as an integer; the documentation clearly says only double and logical are supported.
A note on the application: I have found that my function that generates a large sparse matrix runs 25-50% faster if I change a single line of code to make the index arrays integers, which is understandable since integers are smaller and faster to work with. I would have thought this would be supported. I could cast the arrays back to double, but that seems like a redundant step.
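As a minimal sketch of the workaround mentioned above, the index arithmetic can stay in an integer type and be converted once, at the call to sparse. The sizes and values here are made up for illustration, and the behaviour described is that of the MATLAB version discussed in this thread. Note that double() converts values, whereas typecast() reinterprets the underlying bytes and is not what is wanted here.

```matlab
% Hypothetical example data: a 5x5 cyclic-shift pattern.
n = 5;
i = uint32(1:n);        % row subscripts, kept as integers
j = uint32([2:n, 1]);   % column subscripts, also integers
s = ones(1, n);         % values must be double (or logical)

% Convert the subscripts by value only at the point of the call.
S = sparse(double(i), double(j), s, n, n);

disp(full(S));          % show the resulting matrix densely
```

The conversion adds one extra pass over i and j, which is the redundant step the question refers to.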

 Accepted Answer

SPARSE uses doubles to store index data. Why it doesn't pre-cast uintx input to doubles for you is a fair enough question, but the index data will eventually need to be converted to doubles one way or another. It is also a fair question why after 10 years only double and logical matrix entries are supported. Probably because SPARSE originated as 3rd party code, and now it's hard to change.

7 Comments

Sparse index data is not stored as doubles ... it is stored in an integer format. 32-bit integers on my 32-bit MATLAB, and I am guessing 64-bit on 64-bit MATLAB.
As it is effectively a memory offset, it would have to be an integer of some type. James, do you know whether it is stored as int or uint? To me, just as it makes no sense for a double to be used as an index, it makes little sense for a signed int to be used.
Again, I understand that double is the default MATLAB data type, so the index arguments should be allowed to be double, but sparse should also accept integer data types.
"it is stored in an integer format ... and I am guessing 64-bit on 64-bit MATLAB."
Doesn't look like it. From the following experiment on my 64-bit MATLAB version, it looks like a sparse matrix consumes 2/3 the memory of its [i,j,s] data when the latter are represented as doubles. That is consistent with the i,j data being stored within the sparse matrix as 32-bit something or other.
>> A=sprand(10000,10000,.03); [i,j,s]=find(A); ijs=[i,j,s]; whos A ijs
  Name          Size               Bytes  Class     Attributes

  A         10000x10000         47367800  double    sparse
  ijs     2955487x3             70931688  double
Formula for storage (not including overhead) of a sparse double matrix S assuming k bytes per integer for the index data is:
(8+k)*nnz(S) + k*(size(S,2)+1)
That is one 8-byte double and one k-byte row integer for each non-zero element, and then (number_of_columns+1) k-byte integers for the column data. So you should be able to deduce from this formula when compared to an actual sparse matrix what size integers are used for the sparse index data. After checking on Win32 and Win64 systems here are my conclusions:
Win32: 4-byte integer for index storage, 2^31-1 is largest index allowed
Win64: 8-byte integer for index storage, 2^48-1 is largest index allowed
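To make the deduction concrete, here is a short sketch that plugs the numbers from the whos output earlier in this thread into the storage formula for both candidate index sizes. The variable names are mine and purely illustrative; the formula ignores overhead, as stated above.

```matlab
% Figures taken from the whos output posted earlier in the thread.
nnzS     = 2955487;    % number of non-zeros (rows of ijs)
ncols    = 10000;      % size(A,2)
observed = 47367800;   % bytes reported for the sparse matrix A

for k = [4 8]          % candidate bytes per index integer
    predicted = (8 + k)*nnzS + k*(ncols + 1);
    fprintf('k = %d: predicted %d bytes, observed %d bytes\n', ...
            k, predicted, observed);
end
```

With k = 8 the prediction is 47367800 bytes, which matches the reported size exactly, while k = 4 gives 35505848 bytes. So the 2/3 ratio seen in that experiment is what 8-byte indices produce (16 bytes per non-zero versus 24 for the three doubles in ijs), not 4-byte ones.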
As for the signed vs unsigned question, the mex header files show it as either an int (signed) or size_t (unsigned) depending on some settings. I don't know what all influences the settings. This only really matters if MATLAB allows indexes near the upper range into the "sign" bit (doesn't appear to based on tests) and in mex programming if one is subtracting indexes for some reason and possibly generating a negative result (which would get wrapped around to a large positive number if size_t is the underlying type).
@Nicholas: I agree that it seems reasonable that the command should accept integer type variables as index inputs.
Guys, thank you for your responses. I do think that it is important for sparse to accept int and uint types as input. For a large sparse matrix, it was surprising to me how much faster I could make the generation code just by using ints instead of doubles for the indexing arithmetic (to see the benefit you need to give the JIT an opportunity to optimize). And because of the strength of the type system in MATLAB, it took very few lines of code to make things ints.
"For a large sparse matrix it was surprising to me how much faster I could make the generation code just by using ints instead of doubles for the indexing arithmetic (to see the benefit you need to give the JIT an opportunity to optimize)"
You mentioned this in your original posting as well, but to me at least it's not clear what you mean by the "generation code". If the matrix is generated by calling sparse(), and if sparse() responds to integer inputs with an error, then how can it be "faster"?
