The difference between the best validation point and the final point in the training progress plot

I trained a network with the trainNetwork function as below.
[net, info] = trainNetwork(Xtrain, Ytrain, lgraph, options);
The options were as follows:
options = trainingOptions('adam', ...
    'InitialLearnRate', 5e-06, ...
    'MaxEpochs', 30, ...
    'MiniBatchSize', 128, ...
    'ExecutionEnvironment', 'multi-gpu', ...
    'ValidationData', {Xvalid, Yvalid}, ...
    'ValidationFrequency', 10, ...
    'ValidationPatience', Inf, ...
    'Shuffle', 'every-epoch', ...
    'OutputNetwork', 'best-validation', ...
    'Plots', 'training-progress');
In this case, a point marked 'final' is displayed on the plot after training completes. However, this final point differs greatly from the validation accuracy and validation loss curves. See the figure below.
I don't understand this part. The validation curve near the point marked 'final' clearly sits between 85% and 90% accuracy, yet the final point itself is below 80%. In the upper-right corner of the figure, the validation accuracy is reported as 76.8%.
Is this happening because of some option setting?
I would appreciate help from experts in finding out why this happens.
Please help.
  2 Comments
James
James on 4 Sep 2023
It is common for test accuracy to be slightly lower than validation accuracy, since the model is chosen from among multiple validation checkpoints for best performance; this does not guarantee the same performance on a test dataset. That said, even allowing for that, the results above seem a bit out of the norm.
There can be a few explanations for this:
1. The model is overfitting.
2. BatchNormalization is used: after training, batch normalization statistics are finalized over the training data, which can shift the reported accuracy relative to the values computed during training.
It would also help if you could share how the train/validation/test datasets are prepared and how the model is designed ('lgraph'). For example, if K-fold is used for validation, the disparity in test accuracy can be quite large.
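One way to check the second point is to re-score the returned network on the validation set yourself. A minimal sketch, assuming the net, Xvalid, and Yvalid variables from the original post (with Yvalid a categorical label vector):

```matlab
% Sketch: re-evaluate the returned network on the held-out validation set.
% Assumes net, Xvalid, Yvalid exist as in the original post and that
% Yvalid is a categorical vector of labels.
YPred = classify(net, Xvalid, 'MiniBatchSize', 128);
valAcc = 100 * mean(YPred == Yvalid);
fprintf('Re-evaluated validation accuracy: %.1f%%\n', valAcc);
```

If this matches the 76.8% shown in the plot, the reported number really does reflect the returned network; if it differs, finalized batch normalization statistics are a likely cause.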
Yongwon Jang
Yongwon Jang on 8 Sep 2023
Thank you very much for your reply.
I am doing 6-fold validation: five folds are used for training and one fold for validation.
My question concerns the result of assigning data groups 1-5 to training and group 6 to validation. (To complete the full 6-fold procedure, I have to repeat this five more times, changing which group serves as the validation set.)
The data divided into six groups will naturally differ, as they are experimental data from different environments. However, I don't understand why the validation accuracy is so different from the final accuracy.
There is something I don't quite understand, so I would like to ask the following questions.
First, here are the facts as I understand them:
1. Training accuracy is computed on each mini-batch (the data is divided into batches of the given size at each iteration).
2. Every 10 iterations, the entire validation set is passed through the network to measure validation accuracy.
3. After the designated epochs complete, the 'final' point is the result of a forward pass of the validation data through the network selected by 'best-validation'.
If so, shouldn't the point marked 'final' land in the same place as the last validation accuracy? If the program uses the same validation data as the black dots taken every 10 iterations, it should land in the same place, right? Or am I misunderstanding this?
I am curious what data is used to compute the accuracy of the point marked 'final' at the end of training.
I also tested the following: I confirmed that as the batch size increases, the difference between the position of the 'final' point and the validation accuracy curve decreases. Is it related to batch size?
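For what it's worth, the recorded validation curve can be compared directly against a fresh forward pass. A sketch, assuming the net, info, Xvalid, and Yvalid variables from the original post:

```matlab
% Sketch: compare the validation accuracy recorded during training with
% a fresh evaluation of the returned network. info is the second output
% of trainNetwork; ValidationAccuracy is NaN except at validation steps.
recorded = info.ValidationAccuracy;
lastRecorded = recorded(find(~isnan(recorded), 1, 'last'));
YPred = classify(net, Xvalid, 'MiniBatchSize', 128);
freshAcc = 100 * mean(YPred == Yvalid);
fprintf('Last recorded validation accuracy: %.1f%%\n', lastRecorded);
fprintf('Accuracy of the returned network:  %.1f%%\n', freshAcc);
```

If the two numbers disagree, the gap between the curve and the 'final' point comes from the network itself (e.g. finalized statistics), not from the plotting.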
If necessary, I will share the process of dividing the data into 6 groups.
thank you.


Answers (1)

Gagan Agarwal
Gagan Agarwal on 5 Sep 2023
Hi Yongwon Jang
The plot depicts a decline in validation accuracy on the final iteration of training, and because the OutputNetwork training option effectively defaulted to 'last-iteration', the Validation Accuracy field is being recorded as 76.8%.
The OutputNetwork training option is not correctly assigned in the 'options' variable.
To obtain the network with the best validation loss (and have its validation accuracy reported), set the OutputNetwork option to 'best-validation-loss' rather than 'best-validation'.
For a more comprehensive description of the available options, you can refer to the documentation: https://www.mathworks.com/help/deeplearning/ref/trainingoptions.html
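For reference, here is the options call from the question with only the OutputNetwork value changed; everything else is left as in the original post:

```matlab
% Same options as in the question, with OutputNetwork corrected.
options = trainingOptions('adam', ...
    'InitialLearnRate', 5e-06, ...
    'MaxEpochs', 30, ...
    'MiniBatchSize', 128, ...
    'ExecutionEnvironment', 'multi-gpu', ...
    'ValidationData', {Xvalid, Yvalid}, ...
    'ValidationFrequency', 10, ...
    'ValidationPatience', Inf, ...
    'Shuffle', 'every-epoch', ...
    'OutputNetwork', 'best-validation-loss', ...  % was 'best-validation'
    'Plots', 'training-progress');
```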
  1 Comment
Yongwon Jang
Yongwon Jang on 8 Sep 2023
"'best-validation-loss' rather than 'best-validation'"
I appreciate your pointing that out. I corrected it and ran the code again, but I still wonder about the difference between the final point and the validation accuracy.
I think that after the designated epochs complete, the final point is the result of a forward pass of the validation data through the network selected by 'best-validation-loss'. My guess is that if the selected network is used for both the validation accuracy and the final accuracy, the final point should coincide with a validation accuracy point. But the plot still shows a difference (smaller than before).
Please let me know if I am misunderstanding something.
Again, thank you for your help.


Release

R2023a
