Issues: Training CNN on LFW database.
Working on a personal project, I am trying to learn about CNNs. I have been using transfer learning to train a few CNNs on a combination of the "Labeled Faces in the Wild" (LFW) and AT&T databases, and I want to discuss the results.
I took 100 individuals from LFW and all 40 from the AT&T database, and used 75% of the images for training and the rest for validation.
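For reference, a minimal sketch of such a split using splitEachLabel, assuming one subfolder per identity (the 'faces/' path is a placeholder; imda and imda2 are the datastore names used in the code further down):
% Minimal sketch of a 75/25 per-identity split. 'faces/' is a placeholder
% path; each subfolder holds one person's images and provides the label.
imds = imageDatastore('faces/', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
[imda, imda2] = splitEachLabel(imds, 0.75, 'randomized');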
I also lack a proper understanding of the relationship between a CNN's parameter count and its number of layers, so can someone please clarify it? I think you will see where I am getting confused after I explain the data I have.
I first trained AlexNet on it and got this plot.
So AlexNet has very few layers and is a small, light net (even though it has a lot of parameters), which is why I think it underfit the data?
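On the layers-vs-parameters point, a quick way to check the actual numbers is analyzeNetwork, which reports each pretrained net's depth and total learnables (this assumes the corresponding support packages are installed):
% Compare depth vs. parameter count of two of the nets discussed here.
% AlexNet is shallow (8 learnable layers) yet has roughly 61M parameters;
% ResNet-50 is far deeper (50 learnable layers) with only about 25M.
analyzeNetwork(alexnet);
analyzeNetwork(resnet50);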
I trained ResNet-50 on it and got a similar result, so I believe it also underfit the data? But this one fluctuates and sometimes reaches 100% training accuracy, so maybe it didn't underfit?
I also trained Inception-ResNet-v2 on the data and got this result. I am not sure what is going on here.
I wanted to take a closer look, so I trained it again with a lower learning rate just to make sure that wasn't the cause. Could this be attributed to the mini-batch size?
I also trained EfficientNet with this data; it reached and then pretty much stayed at 100% training accuracy with a constant 70% validation accuracy. Maybe that was overfitting, or maybe it is just alright?
The last ones, which gave the best results, were the Xception and DenseNet CNNs, both reaching 100% training accuracy and 80% validation accuracy. I think DenseNet overfit, but I am not sure. Perhaps Xception did too?
Can someone explain these results and suggest improvements, please?
Edit #1
I forgot to mention that the LFW database sometimes has two faces in a picture (very few pictures, though) and a good number of people who look similar. The validation accuracy is most likely stuck around 80% because of that. During my testing, I found that for a few images the network gave an output based on the face in the background, and sometimes it couldn't distinguish between two different people who looked similar.
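One possible mitigation, sketched here but not something I have verified, is to crop each image to the largest detected face before training, e.g. with the Computer Vision Toolbox cascade detector (the function name cropLargestFace is made up):
% Hypothetical preprocessing: keep only the largest detected face so a
% background face cannot drive the prediction. Needs Computer Vision Toolbox.
function out = cropLargestFace(in)
detector = vision.CascadeObjectDetector();    % default frontal-face model
bboxes = step(detector, in);                  % M-by-4 [x y w h] boxes
if isempty(bboxes)
    out = in;                                 % no face found: keep original
else
    [~, k] = max(bboxes(:,3) .* bboxes(:,4)); % index of the largest box
    out = imcrop(in, bboxes(k,:));
end
end
This could be hooked in as the 'ReadFcn' of the imageDatastore so every image is cropped before augmentation.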
CODE EXPLANATION
% Augmentation: random X-reflection, translation, scaling, and rotation
pixelRange = [-10 10];
scaleRange = [0.5 1.5];
imageAugmenter = imageDataAugmenter( ...
    'RandXReflection', true, ...
    'RandXTranslation', pixelRange, 'RandYTranslation', pixelRange, ...
    'RandXScale', scaleRange, 'RandYScale', scaleRange, ...
    'RandRotation', [-45 45]);
%==========================================================================
inputSize = g.Layers(1).InputSize; % g is the imported pretrained network
% Resize images in both training & validation sets (different folders)
augimdsTrain = augmentedImageDatastore(inputSize(1:2),imda, ...
'DataAugmentation',imageAugmenter);
augimdsValidation = augmentedImageDatastore(inputSize(1:2),imda2);
%==========================================================================
miniBatchSize = 20;
valFrequency = 80;
opts = trainingOptions('sgdm', 'InitialLearnRate', 0.0004, ...
    'LearnRateSchedule', 'piecewise', 'LearnRateDropPeriod', 7, ...
    'LearnRateDropFactor', 0.4, 'ExecutionEnvironment', 'gpu', ...
    'WorkerLoad', 1, 'Shuffle', 'every-epoch', ...
    'ValidationData', augimdsValidation, 'ValidationFrequency', valFrequency, ...
    'MaxEpochs', 200, 'MiniBatchSize', miniBatchSize, ...
    'Plots', 'training-progress', 'CheckpointPath', './DCHK');
myNet1 = trainNetwork(augimdsTrain, lg, opts); % lg is the modified layer graph
This was the code I used to train all the networks.
I only adjusted the learning rates, and also the batch sizes, but the latter only so training would fit on my GPU.
The learning rates above were for AlexNet.
I increased the learning rate and drop period a bit for the deeper nets; for example, for Xception the InitialLearnRate was 0.001 and the drop period was 10.
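So the Xception run used something like this (a sketch; everything except the two values mentioned above is unchanged from the options shown earlier):
% Xception-specific options: only InitialLearnRate and LearnRateDropPeriod
% differ from the AlexNet settings above.
optsX = trainingOptions('sgdm', 'InitialLearnRate', 0.001, ...
    'LearnRateSchedule', 'piecewise', 'LearnRateDropPeriod', 10, ...
    'LearnRateDropFactor', 0.4, 'Shuffle', 'every-epoch', ...
    'ValidationData', augimdsValidation, 'ValidationFrequency', valFrequency, ...
    'MaxEpochs', 200, 'MiniBatchSize', miniBatchSize, ...
    'Plots', 'training-progress');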
Answers (1)
Jack Xiao
on 22 Feb 2021
Reduce the learning rate to a smaller value such as 0.0001, try to add more data, add a dropout layer, or switch to a somewhat weaker (smaller) network.
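To illustrate the dropout suggestion: when replacing the final layers for transfer learning, a dropout layer can be inserted ahead of the new fully connected layer. A minimal sketch for ResNet-50 follows; the layer names ('avg_pool', 'fc1000', etc.) are specific to that network and differ for the other nets:
% Sketch: re-head ResNet-50 with dropout for 140 classes
% (100 LFW + 40 AT&T identities). Layer names vary per network.
net = resnet50;
lg = layerGraph(net);
numClasses = 140;
newHead = [
    dropoutLayer(0.5, 'Name', 'drop_new')
    fullyConnectedLayer(numClasses, 'Name', 'fc_new')
    softmaxLayer('Name', 'softmax_new')
    classificationLayer('Name', 'out_new')];
lg = removeLayers(lg, {'fc1000', 'fc1000_softmax', 'ClassificationLayer_fc1000'});
lg = addLayers(lg, newHead);
lg = connectLayers(lg, 'avg_pool', 'drop_new');
% lg can then be passed to trainNetwork as in the code above.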