CNN for regression with imageDatastore

Hi, I am facing a CNN regression problem. I have a datastore with 41000 images, each of size 5x16000x1. The task is similar to the MATLAB example "Train Convolutional Neural Network for Regression" but, instead of an angle of rotation, each image has a specific distance associated with it (for example, 7000 images with a distance of 5 mm, 7000 images with a distance of 3 mm, 5000 images with a distance of 7 mm, 5000 images with a distance of 9 mm, etc.). I followed the example, so I created a 4-D array of size 5x16000x1x41000 and a 41000x1 vector. However, these arrays are very large, so this approach is not feasible. I would like to know how to use imageDatastore correctly for my problem, because when I try to use it I get this message: "imageDatastore is not supported for regression tasks".
Thank you for your help in advance!

1 Comment

Ola Ola on 26 Dec 2022
Edited: Ola Ola on 26 Dec 2022
@Daniele Minotti I have a similar problem, although with a smaller amount of data. How did you create the 4-D array for your images?


 Accepted Answer

You can create a combined datastore that holds the images and the scalar distances, and pass it to the training routine for the CNN regression task. As a concrete example, first run the documentation example "Train Convolutional Neural Network for Regression": https://uk.mathworks.com/help/deeplearning/ug/train-a-convolutional-neural-network-for-regression.html.
Then save each digit image and corresponding angle of rotation to a "TrainingData" and "ResponseData" directory respectively. In your case, it appears as though you already have the images saved to disk in some format and so these steps can be modified for your use case. This snippet saves the digit images and angles of rotation as MAT files:
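% Assumes the TrainingData and ResponseData folders already exist in the
% current folder; save does not create them.
for observation = 1:size(XTrain,4)
    % Image and response are both saved under the variable name X so that
    % a single read function can load either file.
    X = XTrain(:,:,:,observation);
    save(['TrainingData/X_' num2str(observation) '.mat'],'X');
    X = YTrain(observation);
    save(['ResponseData/Y_' num2str(observation) '.mat'],'X');
end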
Next create imageDatastores, loading in the images and responses, and then combine these datastores.
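imds = imageDatastore('TrainingData','FileExtensions','.mat','ReadFcn',@matImgRead);
rds = imageDatastore('ResponseData','FileExtensions','.mat','ReadFcn',@matImgRead);
ds = combine(imds,rds);

% Custom read function. In a script, local functions must go at the end of the file.
function data = matImgRead(filename)
    img = load(filename);   % struct with one field per saved variable
    data = img.X;           % both images and responses were saved as X
end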
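Before training, it can be worth checking that each image is paired with the right response, since the two datastores are matched purely by file order. The following is only a sketch on a tiny synthetic dataset: the temporary folder, the 5x16 stand-in images, and the "response = 10 times the pixel value" rule are all made up for the check, and an anonymous function stands in for matImgRead so that the snippet runs as a plain script.

```matlab
% Write three stand-in observations to a temporary folder (hypothetical data).
root = tempname;
imgDir = fullfile(root,'TrainingData');
rspDir = fullfile(root,'ResponseData');
mkdir(imgDir);
mkdir(rspDir);
for k = 1:3
    X = k*ones(5,16);                 % stand-in image, every pixel equals k
    save(fullfile(imgDir,sprintf('X_%d.mat',k)),'X');
    X = 10*k;                         % stand-in response, tied to the image
    save(fullfile(rspDir,sprintf('Y_%d.mat',k)),'X');
end

% Same idea as matImgRead: load the MAT file and return its variable X.
readX = @(filename) getfield(load(filename),'X');

imds = imageDatastore(imgDir,'FileExtensions','.mat','ReadFcn',readX);
rds  = imageDatastore(rspDir,'FileExtensions','.mat','ReadFcn',readX);
ds   = combine(imds,rds);

% Each read of the combined datastore returns a 1-by-2 cell: {image, response}.
numChecked = 0;
while hasdata(ds)
    pair = read(ds);
    assert(pair{2} == 10*pair{1}(1),'Image and response are misaligned.');
    numChecked = numChecked + 1;
end
fprintf('Checked %d image/response pairs.\n',numChecked);
```

With real data there is usually no such built-in rule, so a practical variant is to read a few pairs and compare them by eye against the original arrays.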
You then can train the regression CNN using this combined datastore.
net = trainNetwork(ds,layers,options);
The documentation for imageDatastore gives more detail for cases where the data is saved in a different format, for example if your images are ".jpg" files: https://uk.mathworks.com/help/matlab/ref/matlab.io.datastore.imagedatastore.html
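If the images are already on disk as ordinary image files, one possible variation is to skip the MAT files entirely and keep the responses in memory with arrayDatastore (available from R2020b). The sketch below is not taken from the documentation example. It assumes a hypothetical folder layout with one subfolder per distance, named "3mm", "5mm" and so on, and it writes a few random PNG files to a temporary folder only so that it can run on its own.

```matlab
% Build a small stand-in dataset: one subfolder per distance (assumed layout).
root = tempname;
for d = [3 5]
    folder = fullfile(root,sprintf('%dmm',d));
    mkdir(folder);
    for k = 1:2
        img = uint8(randi(255,5,16));           % random stand-in image
        imwrite(img,fullfile(folder,sprintf('img_%d.png',k)));
    end
end

% Let imageDatastore record the subfolder name of every file as its label.
imds = imageDatastore(root,'IncludeSubfolders',true,'LabelSource','foldernames');

% Convert labels such as "5mm" into numeric responses. Y is in the same
% order as imds.Files because it is derived from imds.Labels.
Y = str2double(erase(string(imds.Labels),"mm"));

% Serve one response per read and pair it with the matching image.
rds = arrayDatastore(Y);
ds = combine(imds,rds);

pair = read(ds);                                % 1-by-2 cell: {image, distance}
fprintf('First image is %d-by-%d with distance %g mm.\n', ...
    size(pair{1},1),size(pair{1},2),pair{2});
```

The point of this design is that only the scalar distances (41000 numbers in the original question) stay in memory, while the images are still read from disk one mini-batch at a time.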

5 Comments

Thank you for the answer, but I have a question. What about validation and test data? From this code, it seems that the training and response data alone are sufficient for a regression CNN. Maybe what I am asking is trivial, but I am quite new to this topic and would like to learn it correctly. Thank you in advance.
Validation and test data can be turned into datastores in the same way. If instead you want to split your original data into training and validation sets, for example 80% training and 20% validation, you could create training and validation datastores in the following way (assuming you have run the "Train Convolutional Neural Network for Regression" example and the previous code snippet):
numObservations = numel(imds.Files);
perm = randperm(numObservations);
trainValidateSplit = 0.8;
% Round down so that the split point is always a valid integer index.
numTrain = floor(trainValidateSplit*numObservations);
idxTrain = perm(1:numTrain);
idxValidate = perm(numTrain+1:end);
imdsTrain = imageDatastore(imds.Files(idxTrain),'FileExtensions','.mat','ReadFcn',@matImgRead);
imdsValidate = imageDatastore(imds.Files(idxValidate),'FileExtensions','.mat','ReadFcn',@matImgRead);
rdsTrain = imageDatastore(rds.Files(idxTrain),'FileExtensions','.mat','ReadFcn',@matImgRead);
rdsValidate = imageDatastore(rds.Files(idxValidate),'FileExtensions','.mat','ReadFcn',@matImgRead);
dsTrain = combine(imdsTrain,rdsTrain);
dsValidate = combine(imdsValidate,rdsValidate);
This creates a random permutation of the data and allocates 80% to training (and response) datastores, the remaining 20% to validation (and response) datastores. You can split three ways to add test data by extending this idea.
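As a rough sketch of the three-way split, the index bookkeeping could look like the following. The 70/15/15 fractions, the fixed seed, and the observation count of 100 are illustrative choices only; with real data the count would be numel(imds.Files).

```matlab
numObservations = 100;            % stand-in; in practice numel(imds.Files)
rng(0);                           % fixed seed so the split is reproducible
perm = randperm(numObservations);

fractions = [0.7 0.15 0.15];      % train / validation / test (illustrative)
numTrain = round(fractions(1)*numObservations);
numValidate = round(fractions(2)*numObservations);

% Consecutive, non-overlapping slices of the same permutation.
idxTrain = perm(1:numTrain);
idxValidate = perm(numTrain+1:numTrain+numValidate);
idxTest = perm(numTrain+numValidate+1:end);

fprintf('%d training, %d validation, %d test observations.\n', ...
    numel(idxTrain),numel(idxValidate),numel(idxTest));
```

Each index vector can then be applied to both imds.Files and rds.Files, exactly as in the two-way split, so that images and responses stay aligned.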
You can update the training options to reference this validation datastore, dsValidate, and then train on the dsTrain datastore.
options = trainingOptions('sgdm', ...
    'MiniBatchSize',miniBatchSize, ...
    'MaxEpochs',30, ...
    'InitialLearnRate',1e-3, ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropFactor',0.1, ...
    'LearnRateDropPeriod',20, ...
    'Shuffle','every-epoch', ...
    'ValidationData',dsValidate, ...
    'ValidationFrequency',validationFrequency, ...
    'Plots','training-progress', ...
    'Verbose',false);
net = trainNetwork(dsTrain,layers,options);
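Once training has finished, one common way to judge a regression network is the root-mean-square error (RMSE) on the validation set. The sketch below uses made-up stand-in values so that it runs without a trained network. The commented lines show how the two vectors might be obtained from the datastores above, which is an assumption about your setup rather than part of the documentation example.

```matlab
% With a trained network, the two vectors would typically come from:
%   YPred = predict(net,imdsValidate);
%   YValidation = cell2mat(readall(rdsValidate));
% Stand-in values (made up) are used here instead.
YValidation = [5; 3; 7; 9];        % true distances in mm
YPred = [5.2; 2.9; 7.4; 8.5];      % predicted distances in mm

% RMSE is in the same units as the response, here mm.
predictionError = YPred - YValidation;
rmse = sqrt(mean(predictionError.^2));
fprintf('Validation RMSE: %.3f mm\n',rmse);
```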
Thank you very much!
@Antoni Woss I am new to MATLAB. I tried to follow the snippet you gave, but I get the error "Cannot create 'X_1.mat' because 'TrainingData' does not exist.". My images are in a folder (F:\Regression\Images). How do I create XTrain and the TrainingData folder used in this snippet?
for observation = 1:size(XTrain,4)
    X = XTrain(:,:,:,observation);
    save(['TrainingData/X_' num2str(observation) '.mat'],'X');
    X = YTrain(observation);
    save(['ResponseData/Y_' num2str(observation) '.mat'],'X');
end
In order to use this workflow with your existing images, you would need to modify the datastore and read function accordingly:
imds = imageDatastore('TrainingData','FileExtensions','.mat','ReadFcn',@matImgRead);
rds = imageDatastore('ResponseData','FileExtensions','.mat','ReadFcn',@matImgRead);
ds = combine(imds,rds);

function data = matImgRead(filename)
    img = load(filename);
    data = img.X;
end
For example, if your training data is in "F:\Regression\Images", then you need to point to that location. See for example: https://www.mathworks.com/help/matlab/ref/matlab.io.datastore.imagedatastore.html.
You would also need to adjust the matImgRead function to read the correct filenames and point to the data within the file. In the example above, the files were named "X_1.mat" for example (iterating the number for each image). Within the MAT file, the data was saved in the "X" variable.
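Two small adjustments usually cover this situation, sketched below under stated assumptions. First, the "does not exist" error goes away if the output folder is created before calling save. Second, the read function does not have to hard-code the variable name X: it can return whatever single variable the MAT file contains. The temporary folder, the file name sample_1.mat, and the variable name myImage are placeholders chosen for this example.

```matlab
% Create the output folder first; save will not create it for you.
outDir = fullfile(tempname,'TrainingData');    % placeholder location
if ~isfolder(outDir)
    mkdir(outDir);
end

% Save a stand-in image under a variable name other than X.
myImage = magic(4);
save(fullfile(outDir,'sample_1.mat'),'myImage');

% Read function that returns the first variable stored in the MAT file,
% whatever its name (assumes exactly one variable per file).
firstOf = @(c) c{1};
readFirstVar = @(filename) firstOf(struct2cell(load(filename)));

imds = imageDatastore(outDir,'FileExtensions','.mat','ReadFcn',readFirstVar);
data = read(imds);
disp(isequal(data,myImage));
```

If your images are ordinary image files rather than MAT files, no custom read function is needed at all, because imageDatastore can read common image formats directly from the folder.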

Release

R2022b
