Command line Neural Network training stopping after 0 iterations

I'm a bit stuck. My ANN trains from some fairly large files, so I am using a grid computer to run the training in parallel with the significant resources it provides. However, the grid computer runs MATLAB only through the command line. I have tested the code in the GUI and it works when
net.trainParam.showWindow = true;
but when I use
net.trainParam.showWindow = false;
net.trainParam.showCommandLine = true;
it stops at epoch 0 with
Training Feed-Forward Neural Network with TRAINSCG.
Epoch 0/50000, Time 0.22634, Performance 0.82729/0, Gradient 0.90148/1e-06, Validation Checks 0/300
Training with TRAINSCG completed: User stop.
Once I begin the run, I do not provide any commands or touch my computer at all; training just stops on its own.
Training iterates fine when I run it through the GUI, but that is not an option on the grid computer, which I control only via SSH. I need the project to run normally without the Neural Net window.
Any help is appreciated.

5 Comments

Are you running (and possibly cancelling) any training before switching to the command line?
Also, which version of MATLAB are you using?
MATLAB R2016a. I don't believe so, at least nothing in the Neural Network. Here is the full code of the program.
%loads train and target data
load BlueTrain.csv
load Target_B.csv
I = transpose(BlueTrain);
O = transpose(Target_B);
%create ANN
net = feedforwardnet([200,50,20]);
%remove input processing functions
net.inputs{1}.processFcns = {};
%output settings
net.outputs{4}.processParams{2}.ymin=0;
%point to input data
net = configure(net,I,O);
%divides data into train/validation/test
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
%chooses training function
net.trainFcn = 'trainscg';
%layer activation functions
net.layers{1}.transferFcn = 'logsig';
net.layers{2}.transferFcn = 'logsig';
net.layers{3}.transferFcn = 'logsig';
net.layers{4}.transferFcn = 'softmax';
%cost function
net.performFcn = 'crossentropy';
%add regularisation
net.performParam.regularization = 0.001;
%increases validation check fail value to 300
net.trainParam.max_fail=300;
%sets 50000 epochs
net.trainParam.epochs=50000;
%copy over weights and biases from last training
load GF3500_B_WB1_16p.csv
B1 = transpose(GF3500_B_WB1_16p(1,:));
W1 = transpose(GF3500_B_WB1_16p(2:end,:));
%copy to net
net.IW{1} = W1;
net.b{1} = B1;
load GF3500_B_WB2_16p.csv
B2 = transpose(GF3500_B_WB2_16p(1,:));
W2 = transpose(GF3500_B_WB2_16p(2:end,:));
%copy to net
net.LW{2} = W2;
net.b{2} = B2;
%start training
[net,tr] = train(net,I,O);
%creates matrices in the format required by java program and saves them
WB1outB = [transpose(net.b{1});transpose(net.IW{1})];
WB2outB = [transpose(net.b{2});transpose(net.LW{2})];
%LW{7} and LW{12} are linear indices for LW{3,2} and LW{4,3}
WB3outB = [transpose(net.b{3});transpose(net.LW{7})];
WB4outB = [transpose(net.b{4});transpose(net.LW{12})];
%save WB matrices
csvwrite('GF3500_B_WB1_50p.csv',WB1outB);
csvwrite('GF3500_B_WB2_50p.csv',WB2outB);
csvwrite('GF3500_B_WB3_50p.csv',WB3outB);
csvwrite('GF3500_B_WB4_50p.csv',WB4outB);
%------------------------------ End training --------------------------------------
I am able to reproduce this by starting to train a network with the GUI open, but then clicking "Stop Training" before it is done. After that, training any new network through the command line stops at epoch 0.
The behavior was corrected by either restarting MATLAB or allowing any network to complete training with the GUI open. Does either of these work for you?
It works fine with the GUI, but because I am using grid computing for this, I cannot run it in a GUI. I submit a job that launches MATLAB on the command line, so each run starts a new instance of MATLAB. I just tested it again, and the WB matrix files have been created, so something is causing training to stop early (possibly after the first iteration).
I am able to train on my own PC with the GUI open without issue, but training does not run from the command line, which the grid computer requires. Unfortunately, the data I am training on is large, and there are 5 similar trainings I need to run (which the grid computer lets me run in parallel), so running them through the GUI is not a practical option.
This is still happening in R2018b.

 Accepted Answer

In some versions of MATLAB, training stops at epoch 0 after the following sequence: a neural network is trained with the Training Tool GUI, the user stops or cancels that training, and the user then tries to train with command-line-only output. I have forwarded the details of this issue to our development team so that they can investigate it further. To correct the behavior, please use one of the following workarounds:
1. Train any neural network through the GUI and allow the training to complete. This can be a simple example such as the one given in the documentation for "feedforwardnet":
https://www.mathworks.com/help/nnet/ref/feedforwardnet.html
After the training has completed, you should be able to train networks with the GUI disabled.
2. Restart MATLAB.
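
As a rough illustration of workaround 1, the sketch below first lets a small network finish training with the window enabled, then trains a second network with command-line output only and prints why training ended. It assumes the simplefit_dataset sample data that ships with the toolbox and a session in which the training window can actually be displayed. The variable names and the hidden layer size of 10 are arbitrary choices for illustration and are not taken from the original post.

```matlab
% Sketch only: assumes the toolbox sample data set simplefit_dataset
% and a MATLAB session that is able to display the training window.
[x,t] = simplefit_dataset;

% Step 1: train a small network with the GUI and let it run to completion.
guiNet = feedforwardnet(10);
guiNet.trainParam.showWindow = true;
guiNet = train(guiNet,x,t);

% Step 2: train another network with command-line output only.
cliNet = feedforwardnet(10);
cliNet.trainParam.showWindow = false;
cliNet.trainParam.showCommandLine = true;
[cliNet,tr] = train(cliNet,x,t);

% The training record reports the epoch count and the stop reason.
fprintf('Stopped after %d epochs: %s\n', tr.num_epochs, tr.stop);
```

If the printed stop reason is still "User stop." at epoch 0, the workaround has not taken effect in that session, and restarting MATLAB (workaround 2) is the remaining option.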
