How to load a .mat file containing one big matrix

Hello, I am working with a 23.8 GB .mat file, so I can't load it with the load function. I've tried to load it with the matfile function, but I found out that it contains only one variable, which is a 1x1 struct and is again too big to be loaded. I've tried to read one element of the matrix (it's 4-dimensional) and it didn't work; I still get the same error ("Requested 994x961x3346x1 (23.8GB) array exceeds maximum array size preference (15.8GB). This might cause MATLAB to become unresponsive."). The .mat file was generated by an external program, and I think the problem might be that it was not saved with the -v7.3 flag, so it doesn't support partial loading. Is there a way to resave this file with the -v7.3 flag without having to load it all first? Or do you have any other suggestions how to access this data?

6 Comments

"Or do you have any other suggestions how to access this data?"
Resave: you could temporarily increase the virtual memory used by your OS, LOAD, SAVE (without the scalar structure!), revert the virtual memory, and then use MATFILE as normal to access the new array without loading all into memory.
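The LOAD / SAVE / MATFILE sequence suggested above could look roughly like the following sketch. It uses a tiny stand-in file so that it runs anywhere; the field name `data` and the temporary file names are assumptions for illustration only, since the thread does not show which fields the real `ImageLabHSI` struct contains. On the real file the LOAD step would still need enough (virtual) memory.

```matlab
% Stand-in for the real file: a single variable that is a 1x1 struct.
% The field name 'data' is hypothetical.
ImageLabHSI = struct('data', rand(4, 5, 6));
src = [tempname '.mat'];
save(src, 'ImageLabHSI');              % default format (v7 unless the preference was changed)

% Step 1: load everything once (this is the step that needs the memory).
S = load(src);
hsi = S.ImageLabHSI;                   % the scalar struct

% Step 2: resave the struct's fields as top-level variables in v7.3 format.
dst = [tempname '.mat'];
save(dst, '-struct', 'hsi', '-v7.3');

% Step 3: partial loading now reads only the requested block from disk.
m = matfile(dst);
block = m.data(1:2, 1:2, 3);
```

Saving with '-struct' drops the scalar structure, so the large array becomes a top-level variable that MATFILE can index directly.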
Thank you for your suggestion. I had thought about this and wanted to avoid it, because I thought there might be a more elegant solution that I just don't know about. But I will give it a try. Just to be clear, here is what I typed into the command window and the response.
>> a=matfile('Mar1A_reduced_epina.mat')
a =
matlab.io.MatFile
Properties:
Properties.Source: 'C:\Users\strib\Documents\škola\stáž\work\Mineral_separation\Mar1A\Data\Reduced_data\Mar1A_reduced_epina.mat'
Properties.Writable: false
Properties.ProtectedLoading: false
ImageLabHSI: [1x1 struct]
Methods
>> a.ImageLabHSI(1,1,1,1)
Warning: The file 'C:\Users\strib\Documents\škola\stáž\work\Mineral_separation\Mar1A\Data\Reduced_data\Mar1A_reduced_epina.mat'
was saved in a format that does not support partial loading. Temporarily loading variable 'ImageLabHSI' into memory. To use
partial loading efficiently, save MAT-files with the -v7.3 flag.
> In matlab.io/MatFile/inefficientPartialLoad (line 145)
In matlab.io/MatFile/subsref (line 471)
Requested 994x961x3346x1 (23.8GB) array exceeds maximum array size preference (15.8GB). This might cause MATLAB to become
unresponsive.
"...array size preference..."
Does anyone know what that is? And how to set "array size preference"?
It sounds like a setting parameter.
"And how to set "array size preference"?"
By default MATLAB limits the maximum array size to 100% of RAM. But recent MATLAB versions also provide the option to turn off this limit so that MATLAB will use virtual memory... slow, but I have used this a few times to get out of a tight spot.
"If you turn off the array size limit in MATLAB Workspace Preferences, attempting to create an unreasonably large array might cause MATLAB to run out of memory, or it might make MATLAB or even your computer unresponsive due to excessive memory paging (that is, moving memory pages between RAM and disk)."
Thanks. Is there any difference (performance)
  • The array size limit is unchecked?
  • Checked with slider at 100% (default)?
"Is there any difference (performance)"
As far as I can recall, if the size limit is unchecked then performance is unaffected (at least, nothing humanly noticeable) until an array larger than the installed memory is requested, then the virtual memory is used and it is slower.
But still quite useable if using an SSD.


Answers (1)

Jan on 9 Mar 2023
Edited: Jan on 10 Mar 2023
If a variable in the MAT file is larger than 2 GB, the file must be in v7.3 format.
You are trying to extract a 994x961x3346 subarray, which needs 25.57 GB for type double (the difference from 23.8 is ([EDITED], "most likely" removed) a mix-up of GB and GiB). But you say that the complete file is also 23.8 GB?
If the wanted submatrix does not fit into your RAM, this is not a problem of the MAT file, but of an insufficient amount of installed RAM. So install more RAM or import only parts of the submatrix.
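Importing only parts of the array could be sketched as below, assuming the file is already in v7.3 format. The variable name `cube`, the slab size, and the temporary file are hypothetical stand-ins for the real data; the loop reads a few slices along the third dimension at a time and keeps only a running result in memory.

```matlab
% Small stand-in for a large v7.3 file; the variable name 'cube' is hypothetical.
demo = [tempname '.mat'];
cube = rand(6, 5, 8);
save(demo, 'cube', '-v7.3');

m  = matfile(demo);
sz = size(m, 'cube');                  % query dimensions without loading the data
slab  = 3;                             % number of slices read per iteration
total = 0;
for k = 1:slab:sz(3)
    idx  = k:min(k + slab - 1, sz(3)); % indices of the current slab
    part = m.cube(:, :, idx);          % only this slab is read from disk
    total = total + sum(part(:));      % process the slab, then let it go
end
```

The slab size is the knob that trades memory use against the number of disk reads.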

11 Comments

My intention is to import parts of the matrix only, but I can't make that work. I've posted exactly what I've typed in the command window in the comments on my question. I hope that makes my problem more clear.
This is really strange. The message tells you that the MAT file is not in v7.3 format, but the data size of the file exceeds 2 GB, which is possible only in v7.3-formatted MAT files.
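One way to check which format a file really has, without loading any data, is to read the descriptive text at the start of the file. This is only a sketch: it relies on the text header beginning with "MATLAB 7.3 MAT-file" for v7.3 files and "MATLAB 5.0 MAT-file" for the older formats, and it uses a small temporary demo file rather than the file from the question.

```matlab
% Create a small v7.3 demo file (a stand-in for the file under investigation).
demo = [tempname '.mat'];
x = 1:10;
save(demo, 'x', '-v7.3');

% Read the 116-byte descriptive text at the start of the file.
fid = fopen(demo, 'r');
header = fread(fid, 116, '*char')';    % row vector of characters
fclose(fid);

isV73 = startsWith(header, 'MATLAB 7.3');
disp(strtrim(header));
```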
If you read the link Stephen posted, the limit is 2 GB per variable, so hypothetically you could have a large file with multiple variables.
I wonder if the internal limits are such that the real limit is 2 GB per numeric array, with cells and structs being OK if each individual element is under the limit?
The message posted by the OP explains that a "994x961x3346x1" submatrix already occupies 23.8 GB. Then the complete array must be larger.
@Jan 'You try to extract a 994x961x3346 subarray, which needs 25.57 GB for the type double (The difference to 23.8 is most likely a mixing of GB and GiB)'
I don't understand: the size in double in GB is 23.8135 GB. How did you come up with 25.57?
(994*961*3346*8)/1024^3
ans = 23.8136
For the size in GB, divide by 1000^3. G is the metric prefix in this context.
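To make the two conventions concrete, the same byte count can be expressed both ways. This is just the arithmetic from the thread, assuming 8 bytes per element for type double:

```matlab
nBytes  = 994 * 961 * 3346 * 8;        % number of elements times 8 bytes per double
sizeGB  = nBytes / 1000^3;             % gigabytes (metric prefix)
sizeGiB = nBytes / 1024^3;             % gibibytes (binary prefix)
fprintf('%.2f GB, %.2f GiB\n', sizeGB, sizeGiB);
% prints: 25.57 GB, 23.81 GiB
```

So the 23.8 in the error message and the 25.57 in the answer describe the same array.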
Jan on 10 Mar 2023
Edited: Jan on 10 Mar 2023
@Bruno Luong: As I've mentioned in the cited sentence, a GB is defined as 1000^3 bytes and a GiB as 1024^3 bytes. See: https://en.wikipedia.org/wiki/Gigabyte
I've removed the "most likely" in my answer.
But, Bruno, you are a computer crack and Matlab hero. You do know the definitions of GB and GiB.
Oh I see, I always thought 1 GB was 1024^3 bytes. It turns out that is wrong according to the Wikipedia definition. I should have checked before posting the comment.
Sorry about the trouble Jan.
Fine. Thanks for the explanation.
If you are talking about RAM, for historical reasons it is very common to use 1024 as the multiplier. For example, 8 GB RAM is typically 8 * 1024^3 -- at least when you are talking in terms of how much RAM is installed or what the limits are for linking or program size. But if you are looking at how many kilobytes a program is using in the process monitor, there is a tendency to slip into 1000 as the multiplier.
But when you are talking about hard disks especially, 1000 has been used as the multiplier for quite a number of years now; possibly not originally, but the marketing department won the battle with hard disks a long time ago.


Release

R2021b

Asked:

on 9 Mar 2023

Commented:

on 10 Mar 2023
