Saving large Tiff in MATLAB leads to corrupt file

I am saving an image matrix to a TIFF file and had no trouble until recently. Lately, when I save large files, they are unreadable both in MATLAB and in other image software.
In my case, I'm saving a color image of approximately 25,000 x 25,000 pixels, with a 4th layer for alpha information. The saved file is 880 MB with compression, but attempts to read it give the following error:
% Error using imread (line 440)
% Unable to read TIFF file "test1.tif". File is corrupt or
% image does not contain any readable strips.
Note I can still call imfinfo on the saved file and that works fine.
The problem goes away for smaller images.
I have made a minimal example to demonstrate the problem. It generates a file that's ~2.8 GB, so keep that in mind before running it.
I'm running MATLAB R2020b on Windows 10.
Is there a way to save the file that works? Should I use strips or tiles? I thought TIFF files could be up to 4 GB.
%% Save large Tiff (will be ~2.8 GB)
% Create image "data"
L = 25000;
mat = randi([0 255], [L L 3], 'uint8'); % generate as uint8 directly to avoid a huge double temporary
mat(:, :, 4) = randi([0 1], [L L], 'uint8'); % Add alpha layer
% Create tags
tags = struct;
tags.Photometric = Tiff.Photometric.RGB;
tags.BitsPerSample = 8;
tags.ImageLength = size(mat, 1);
tags.ImageWidth = size(mat, 2);
tags.SamplesPerPixel = size(mat, 3);
tags.PlanarConfiguration = Tiff.PlanarConfiguration.Chunky;
tags.Compression = Tiff.Compression.LZW;
tags.ExtraSamples = Tiff.ExtraSamples.Unspecified;
% Save Tiff
Tobj = Tiff('test1.tif', 'w');
setTag(Tobj, tags) % add tags
write(Tobj, mat) % add data
close(Tobj) % close
im1 = imread('test1.tif'); % errors :(
% Error:
% Error using imread (line 440)
% Unable to read TIFF file "test1.tif". File is corrupt or
% image does not contain any readable strips.
% Clean up
clear mat im1
%% Repeat with smaller file (will be ~400 MB)
L = 10000;
mat = randi([0 255], [L L 3], 'uint8'); % generate as uint8 directly to avoid a huge double temporary
mat(:, :, 4) = randi([0 1], [L L], 'uint8'); % Add alpha layer
% Create tags
tags = struct;
tags.Photometric = Tiff.Photometric.RGB;
tags.BitsPerSample = 8;
tags.ImageLength = size(mat, 1);
tags.ImageWidth = size(mat, 2);
tags.SamplesPerPixel = size(mat, 3);
tags.PlanarConfiguration = Tiff.PlanarConfiguration.Chunky;
tags.Compression = Tiff.Compression.LZW;
tags.ExtraSamples = Tiff.ExtraSamples.Unspecified;
Tobj = Tiff('test2.tif', 'w');
setTag(Tobj, tags) % add tags
write(Tobj, mat) % add data
close(Tobj) % close
im2 = imread('test2.tif'); % works fine!
% Clean up
clear mat im2

6 Comments

DGM on 22 May 2021
Edited: DGM on 22 May 2021
I'd love to tackle another issue with imread/imwrite, but I barely have the memory to handle the smaller example's temporary loading. Someone whose computer didn't come out of a dumpster might have to step in. If nothing else, I might try it out when I get my laptop back.
EDIT: I can't seem to replicate the error. I ran this on a machine running R2019b (Linux), and it certainly appears to have read the image from disk correctly (the 2.8 GB one). I don't know if the version or environment makes any difference. There do appear to be some changes to imread(), though I don't know if they're relevant. Do you have another way to test the file on disk? It might not be a problem writing the file, but a problem with imread().
Thanks for taking a look.
I have not been able to read the file in any of my image viewers either (Paint, Paint.net, Photos, Windows Photo Viewer).
Also in case it's helpful, I looked to see where imread is having the problem and it's line 48 of readtif.m:
[X, map, details] = rtifc(args);
It's an MEX file so I don't know what's happening inside.
Also, I just tried saving a constant matrix instead so the compressed file would be much smaller (150 MB):
mat = 100 * ones([L L 3], 'uint8');
And I have the same problem with imread and other image viewers.
Just for verification, let's see that our output images are the same. I used this setup to create the constant-valued image. This compresses much smaller, since alpha is also constant-valued. I don't know if that influences this problem, but it should at least make the output repeatable.
L = 25000;
%mat = uint8(randi(256, [L L 3]) - 1);
%mat(:, :, 4) = uint8(randi(2, [L L]) - 1); % Add alpha layer
mat = 100 * ones([L L 4], 'uint8'); % all 4 channels identical
and this is the MD5 sum I get
% 989343f844fae7f00686e32aae472132 test1.tif [100 100 100 100]
I know some of imread's other support files have changed since R2019b, so I wouldn't be surprised if readtif.m did as well. If there were some change in the numeric type used for offsets, that might be a problem (25E3^2 * 4 > 2^31). I'm just guessing though. I'm not sure what else could be causing problems.
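For reference, the arithmetic behind that guess can be checked directly. This is only a sketch of the size comparison for an uncompressed uint8 image with 4 samples per pixel; whether a signed 32-bit offset is actually what breaks inside MATLAB or its TIFF library remains an assumption. The variable names are illustrative.

```matlab
L = 25000;
nBytes = L^2 * 4;                 % raw size in bytes: uint8, 4 samples per pixel

exceedsSigned = nBytes > 2^31;    % too large for a signed 32-bit offset?
fitsUnsigned  = nBytes < 2^32;    % still within an unsigned 32-bit offset?

% Largest square 4-channel uint8 image whose raw data stays below 2^31 bytes
Lmax = floor(sqrt(2^31 / 4));

fprintf('Raw size: %.2f GB, exceeds 2^31: %d, fits in 2^32: %d, Lmax: %d\n', ...
    nBytes / 1e9, exceedsSigned, fitsUnsigned, Lmax);
```

If the signed-offset guess is right, images with a side length up to roughly Lmax should still read correctly, while anything much larger stored as a single block would not.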
Interesting, I get the same hash: 989343f844fae7f00686e32aae472132
So it must be the reading that's the problem. Yes, some of what I've read about TIFF says that it should support up to 4 GB, but that some readers may be limited to 2 GB if they use signed integers instead of unsigned (2^31 vs 2^32 limit). So you may be right, especially because imread works for me just below the 2 GB limit.
I just tried to read that very same file on a different (Windows) computer with R2019a and got the exact same error.
So it might be a Windows issue (TIFF library?), in which case I don't know if MathWorks can even fix it.
Well, sad to get no resolution, but thank you for helping to diagnose that it is a read problem. Alas.
That may be part of it. I don't have access to a Windows machine that's even remotely close (R2012a on Win7), so I can't really verify.
I really never use TIFF objects like this, but is it possible to read the image without imread?
Tobj2 = Tiff('test1.tif', 'r');
mat2 = read(Tobj2);
close(Tobj2) % close
I haven't gone to test this on the other computer with the big file, but maybe it's worth a shot. Then again, for all I know, it might be using the same underlying helper files/library.
Tried it, but it errors when I call read. I think to read it otherwise I'd have to code up my own reader, so probably not worth it.
Maybe I will look for alternative file formats to use (need ones where I can save some complex metadata too).
Thank you

 Accepted Answer

For anyone having this problem, I found another answer that was relevant:
This is not an intrinsic problem with the TIFF format, but rather some internal bug in either MATLAB or the underlying library. It can be solved by writing the image in tiles or strips.
For example, adding:
tags.RowsPerStrip = 32;
or
tags.TileLength = 256;
tags.TileWidth = 256;
Keep in mind that choices here will affect save speed, though I found that it could actually speed up the process for the right strip or tile sizes.
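Put together with the tags from the question, a minimal sketch of the strip version looks like the following. The image is shrunk to L = 512 so it runs quickly, which means it only demonstrates the tag usage and the round trip, not the large-file failure itself. The file name test_strips.tif and the size are illustrative choices, and the image is read back with the Tiff object rather than imread.

```matlab
% Small stand-in for the large image (illustrative size, not the failing case)
L = 512;
mat = randi([0 255], [L L 4], 'uint8');

% Same tags as in the question
tags = struct;
tags.Photometric = Tiff.Photometric.RGB;
tags.BitsPerSample = 8;
tags.ImageLength = size(mat, 1);
tags.ImageWidth = size(mat, 2);
tags.SamplesPerPixel = size(mat, 3);
tags.PlanarConfiguration = Tiff.PlanarConfiguration.Chunky;
tags.Compression = Tiff.Compression.LZW;
tags.ExtraSamples = Tiff.ExtraSamples.Unspecified;

% The fix: split the image into strips of 32 rows each
tags.RowsPerStrip = 32;
% Tiled alternative (use instead of RowsPerStrip; tile sizes must be multiples of 16):
% tags.TileLength = 256;
% tags.TileWidth = 256;

% Write the file to the temp folder (hypothetical file name)
fname = fullfile(tempdir, 'test_strips.tif');
Tobj = Tiff(fname, 'w');
setTag(Tobj, tags)
write(Tobj, mat)
close(Tobj)

% Read back with the Tiff object and count the strips that were written
Tobj2 = Tiff(fname, 'r');
mat2 = read(Tobj2);
nStrips = numberOfStrips(Tobj2);
close(Tobj2)

fprintf('Round trip identical: %d, strips written: %d\n', isequal(mat, mat2), nStrips);
```

For the full 25,000 x 25,000 case, only L changes. The strip or tile size is the knob to experiment with for save speed.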

More Answers (1)

I can reproduce the problem with MATLAB R2018b. imwrite(mat, 'test.tif') also fails.

Release

R2020b