Overfitting, or what is the problem?

I am training my NN and getting good results (I think; see the attached pictures), but when I test my NN on new data the results are very poor. Here is my code:
x = inMatix; %19x105100 two year dataset
t = targetData; %1x105100 hist el.load
trainFcn = 'trainlm'; % Levenberg-Marquardt backpropagation.
net=feedforwardnet(20,trainFcn);
%net = fitnet(hiddenLayerSize,trainFcn);
% Setup Division of Data for Training, Validation, Testing
% For a list of all data division functions type: help nndivision
net.divideFcn = 'dividerand'; % Divide data randomly
net.divideMode = 'sample'; % Divide up every sample
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
net.trainParam.epochs = 1000;
net.trainParam.lr = 0.001; % Note: lr is not a trainlm parameter; trainlm adapts its step via mu
net.performFcn = 'mse'; % Mean Squared Error
net.plotFcns = {'plotperform','plottrainstate','ploterrhist', ...
'plotregression', 'plotfit'};
% Train the Network
[net,tr] = train(net,x,t);
% Test the Network
y = net(x);
e = gsubtract(t,y);
performance = perform(net,t,y)
% Recalculate Training, Validation and Test Performance
trainTargets = t .* tr.trainMask{1};
valTargets = t .* tr.valMask{1};
testTargets = t .* tr.testMask{1};
trainPerformance = perform(net,trainTargets,y)
valPerformance = perform(net,valTargets,y)
testPerformance = perform(net,testTargets,y)
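
One thing worth ruling out with a time-ordered load series: `dividerand` scatters the validation and test samples among the training samples, so the reported test error may look better than the error on genuinely later data. Below is a minimal sketch of holding out the most recent block and scoring it separately. It assumes the columns of `x` and `t` are in chronological order, which the post does not state. The names `nHold`, `idxOld`, `idxNew`, `xOld`, `xNew`, `tOld` and `tNew` are illustrative, and small synthetic arrays stand in for the real data so the index arithmetic can be run on its own.

```matlab
% Synthetic stand-ins for the real 19xQ inputs and 1xQ targets (assumption).
Q = 1000;
x = rand(19, Q);
t = rand(1, Q);

% Hold out the most recent 15% of the time steps as unseen "new" data.
nHold  = round(0.15 * Q);
idxOld = 1:(Q - nHold);        % used for training and validation
idxNew = (Q - nHold + 1):Q;    % never shown to the network during training

xOld = x(:, idxOld);  tOld = t(:, idxOld);
xNew = x(:, idxNew);  tNew = t(:, idxNew);

% With the Deep Learning Toolbox, the held-out block could then be scored as:
%   net = feedforwardnet(20, 'trainlm');
%   net.divideFcn = 'divideblock';   % contiguous train/val/test blocks
%   net = train(net, xOld, tOld);
%   newPerformance = perform(net, tNew, net(xNew))
```

If `newPerformance` is much worse than the test performance reported during training, the random split is probably flattering the network rather than measuring how it behaves on new data.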

10 Comments

Greg Heath on 23 Mar 2019
Edited: Greg Heath on 24 Mar 2019
You've originally posted a bunch of stuff that looks OK.
How about illustrating EXACTLY what your problem is?
And how about explaining your new unlabeled plot!!!
Greg
Sir Heath,
I think I should use NARX to predict future electricity load, but I am having trouble getting good results: the predicted values are very bad. The plot shows the autocorrelation of my target data. My input is 18x18000 with 10 min sampling (day, month, hour, temperature, day of week, etc.) and the target data is the historical load. I need to forecast, for example, 24 h or 72 h ahead.
X=con2seq(inputMat);
T=con2seq(targetMat);
N = 144;
inputDelays = 1:30;
feedbackDelays = 1:30;
hiddenLayerSize = 10;
net = narxnet(inputDelays,feedbackDelays,hiddenLayerSize,'open');
[x,xi,ai,t] = preparets(net,X,{},T);
% Setup Division of Data for Training, Validation, Testing
% The function DIVIDEBLOCK assigns target values to training,
% validation and test sets in contiguous blocks, preserving time order.
% For a list of all data division functions type: help nndivide
net.divideFcn = 'divideblock'; % Divide data into contiguous blocks
% The property DIVIDEMODE set to VALUE means that targets are divided
% into training, validation and test sets value by value.
% For a list of data division modes type: help nntype_data_division_mode
net.divideMode = 'value'; % Divide up every value
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
% Choose a Performance Function
% For a list of all performance functions type: help nnperformance
% Customize performance parameters at: net.performParam
net.performFcn = 'mse'; % Mean squared error
% Choose Plot Functions
% For a list of all plot functions type: help nnplot
% Customize plot parameters at: net.plotParam
net.plotFcns = {'plotperform','plottrainstate','plotresponse', ...
'ploterrcorr', 'plotinerrcorr'};
% Train the Network
[net,tr] = train(net,x,t,xi,ai);
% Test the Network
y = net(x,xi,ai);
e = gsubtract(t,y);
performance = perform(net,t,y)
plotresponse(t,y)
ys=cell2mat(y);
You need to begin by considering the auto- and cross-correlation functions.
Then try to reduce the number of inputs.
See some of my previous posts in the NEWSGROUP as well as in ANSWERS.
Greg
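
The correlation check suggested above can be sketched in base MATLAB as follows. This is a minimal illustration, not the code from the earlier posts. It assumes a synthetic sine series with a period of 144 samples in place of the real load data, and it uses the textbook white-noise bound 1.96/sqrt(N) as the 95% significance threshold, which is a simplification. The names `zt`, `acf`, `maxLag`, `sigLags` and `peakLags` are illustrative.

```matlab
% Synthetic series with a known period of 144 samples (assumption: stands in
% for a daily cycle at 10-minute sampling).
N = 1440;
n = 0:N-1;
t = sin(2*pi*n/144);

% Standardize to zero mean and unit variance.
zt = (t - mean(t)) / std(t, 1);

% Biased sample autocorrelation for lags 0..maxLag.
maxLag = 300;
acf = zeros(1, maxLag + 1);
for k = 0:maxLag
    acf(k + 1) = sum(zt(1:N-k) .* zt(1+k:N)) / N;
end

% Approximate 95% significance bound for white noise (assumption).
sigthresh95 = 1.96 / sqrt(N);
lags = 0:maxLag;
sigLags = lags(abs(acf) > sigthresh95);   % all significant lags

% Local maxima of the autocorrelation at positive lags are candidate delays.
isPeak = acf(2:end-1) > acf(1:end-2) & acf(2:end-1) > acf(3:end);
peakLags = lags([false, isPeak, false]);
peakLags = peakLags(acf(peakLags + 1) > sigthresh95);
```

A strongly periodic series makes most lags significant, so `sigLags` alone says little. The significant local peaks in `peakLags` are usually the more useful shortlist when choosing feedback delays.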
Matthew Clark on 25 Mar 2019
Edited: Matthew Clark on 25 Mar 2019
Mr. Heath, this is how the cross-correlation of my output (target data, electricity load) looks with 10 min sampling.
Correlation plots typically start at delay = 0, peak at critical delay spacings, and decay at large delays.
Look for previous narxnet posts (NEWSGROUP & ANSWERS) with correlation calculations and/or plots.
greg
Mr. Heath,
Does that mean my significant AC is at a time delay of 495?
ACc.PNG
Please help me interpret these results, Mr. Heath.
N = 2400
Neq = 2400
M = 4559
M = 4559
M = 4559
M = 4559
M = 4559
sigthresh95 = 0.0300
plt = 1
FD = 1×2
1 2
NFD = 2
LDB = 2
Ns = 2398
Nseq = 2398
Hub = 599
Hmax = 59
Hmin = 0
dH = 1
Ntrials = 10
j = 0
j = 1
Nw = 3
Ndof = 2395
num of significant lags 1758
sigthresh.PNG
For a 10-day dataset with 10 min sampling, the good predictors are at a distance of 144? Does that mean my delay should be 144?
dss.PNG
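
The arithmetic behind a lag of 144 can be checked directly: at 10-minute sampling there are 144 samples per day, so a lag of 144 corresponds to a one-day period. The sketch below converts between hours and time steps, using only values quoted in the thread. The variable names are illustrative, and whether 144 is the right delay for the network is a separate question that the correlation analysis has to answer.

```matlab
samplePeriodMin = 10;                                 % sampling interval from the post
samplesPerDay   = 24 * 60 / samplePeriodMin;          % samples in one day

% Forecast horizons mentioned in the thread, converted from hours to time steps.
horizonHours = [24 72];
horizonSteps = horizonHours * 60 / samplePeriodMin;

% Convert a correlation lag given in samples back to hours.
lagSamples = 495;                                     % lag value quoted in the thread
lagHours   = lagSamples * samplePeriodMin / 60;

fprintf('samples per day: %d\n', samplesPerDay);
fprintf('horizon steps:   %d %d\n', horizonSteps);
fprintf('lag %d = %.1f hours\n', lagSamples, lagHours);
```

A 24 h forecast therefore means predicting 144 steps ahead and a 72 h forecast 432 steps ahead, while the lag of 495 asked about earlier corresponds to 82.5 hours.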

Answers (0)

Asked:

on 23 Mar 2019

Commented:

on 26 Mar 2019