Bad results when modeling systems with more than one input using a neural network!

I'm getting started with the Neural Network Toolbox for modeling systems with time delays, so I started with an example whose goal is to identify the following relation: y(t) = exp(x(t-2)) - 3*x(t-1).
This is the program I used to find a NN that can simulate the relation y(t) = F( x(t-1), x(t-2) ):
1. First I create the input and output data for training:
xt = rand(1,100);
yt = zeros(1,100);                       % preallocate; yt(1:2) stay 0
for i = 3:100
    yt(i) = exp(xt(i-2)) - 3*xt(i-1);
end
2. Then I train Hmax*Niter networks
rng(0)
inputSeries = tonndata(xt',false,false);
targetSeries = tonndata(yt',false,false);
Hmax=10;
Niter = 10;
for i = 1:Hmax
    for j = 1:Niter
        inputDelays = 1:2;
        hiddenLayerSize = i;
        net = timedelaynet(inputDelays,hiddenLayerSize);
        [inputs,inputStates,layerStates,targets] = preparets(net,inputSeries,targetSeries);
        net.divideFcn = '';           % train on all data (no train/val/test split)
        net.trainFcn = 'trainbr';     % Bayesian regularization
        [net,tr] = train(net,inputs,targets,inputStates,layerStates);
    end
end
3. I select the best network, i.e. the one with R² closest to 1.
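A minimal sketch of the R² computation behind step 3 (my own code; treating R² as 1 − NMSE is my assumption, the original post does not say how it was computed):

```matlab
% Sketch: R^2 of a trained network on the training data.
% Assumes net, inputs, inputStates, layerStates, targets from preparets above.
outputs = net(inputs,inputStates,layerStates);
e = cell2mat(outputs) - cell2mat(targets);
R2 = 1 - mean(e.^2)/var(cell2mat(targets),1);   % R^2 = 1 - NMSE
```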
4. I evaluate the network with unseen data.
xe = rand(1,100);
ye = zeros(1,100);                       % preallocate; ye(1:2) stay 0
for i = 3:100
    ye(i) = exp(xe(i-2)) - 3*xe(i-1);
end
inputSeries = tonndata(xe',false,false);
targetSeries = tonndata(ye',false,false);
[inputs,inputStates,layerStates,targets] = preparets(net,inputSeries,targetSeries);
outputs = net(inputs,inputStates,layerStates);
P=cell2mat(outputs);
O=cell2mat(targets);
plot(P)                 % predicted
hold on
plot(O,'r')             % actual
Here are the results. I used 2 for ID because I know the delays in advance; in any case, I found many networks with R² close to 1 — see the following figure for the training-data results.
For unseen data:
xe = 3*rand(1,100);
ye = zeros(1,100);
for i = 3:100
    ye(i) = exp(xe(i-2)) - 3*xe(i-1);
end
For inputs as large as xe = 5*rand(1,100) the network model gives bad results, which is to be expected.
Unfortunately, I can't reproduce these results when I try to identify the following relation with two inputs: y(t) = w(t-1)*w(t-2)*exp(x(t-2)) - 3*x(t-1). For the training data I again get R² close to 1, just as for the first equation, but for unseen data in the same range as the training data ( xe = rand(1,100); we = rand(1,100); ) the predicted values don't match the actual ones.
I evaluated the net using
inputSeries = tonndata([xe;we]',false,false);
targetSeries = tonndata(ye',false,false);
I'm sure that if I tried to identify another equation with more than two inputs and outputs I would get bad results too. Could someone help me with this issue? It would be very helpful if you could post code that gives good results for more than one input.
Thanks in advance.

 Accepted Answer

Violations of basic assumptions:
1. All input data are assumed to have been drawn from the same source. Violated by xt = rand(1,100) and xe = 3*rand(1,100)
2. ID = [ 1,2 ] is assumed to contain lags at which there are significant cross-correlations between input and output. Check this with:
N = length(x);                        % series length
zx = zscore(x,1);                     % standardize with population std
zw = zscore(w,1);
zy = zscore(y,1);
lags = -(N-1):(N-1);
xcorryx = nncorr(zy,zx,N-1,'biased');
xcorryw = nncorr(zy,zw,N-1,'biased');
Are xcorryx(N+1:N+2) and xcorryw(N+1:N+2) significant?
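One rough way to answer this question (my own sketch, not part of the answer; the 2/sqrt(N) band is a common white-noise heuristic, not a toolbox constant):

```matlab
% Heuristic ~95% significance band for cross-correlations of length-N series
sigthresh = 2/sqrt(N);
significantX = abs(xcorryx(N+1:N+2)) > sigthresh   % lags 1 and 2, y vs x
significantW = abs(xcorryw(N+1:N+2)) > sigthresh   % lags 1 and 2, y vs w
```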
Hope this helps.
Thank you for formally accepting my answer
Greg

5 Comments

  • Dear Greg, what do you mean by "the same source"?
  • For the first network (1 input with 2 ID, 3 hidden neurons, 1 output), training was done with xt = rand(1,100) and testing with xe = rand(1,100), different from xt (obviously); nevertheless I got good results, see the figures above. The network does not work for large inputs. In fact, the elements of rand(1,100) are all in [0,1] while those of xe = A*rand(1,100) are in [0,A]; I tested many values of A and found that the predictions are good for A < 4.
  • For the second network, with 2 inputs, the network performs very well, with R² very close to 1 on the training data. But for unseen data, even in the range [0,1], the network cannot predict well, as you can see in the figure above. Why is the network bad when I use 2 inputs, when the first network with 1 input was OK?
  • Tell me if you need more information to answer me efficiently.
1. Dear Greg, what do you mean by "the same source"?
Can be assumed to be random draws from the same probability distribution function.
2. For the first network (1 input with 2 ID, 3 hidden neurons, 1 output), training was done with xt = rand(1,100) and testing with xe = rand(1,100), different from xt (obviously); nevertheless I got good results, see the figures above.
No violation.
3. The network does not work for large inputs. In fact, the elements of rand(1,100) are all in [0,1] while those of xe = A*rand(1,100) are in [0,A]; I tested many values of A and found that the predictions are good for A < 4.
Graceful degradation as basic assumptions are violated is what we all hope for (but do not always get).
4. For the second network, with 2 inputs, the network performs very well, with R² very close to 1 on the training data. But for unseen data, even in the range [0,1], the network cannot predict well, as you can see in the figure above. Why is the network bad when I use 2 inputs, when the first network with 1 input was OK?
Will investigate.
The problem with the 2-D model is that the training data do not sufficiently characterize the 2-D x-w input space: if you sort x and rearrange w accordingly, you just have a monotonic curve on the unit square.
To characterize the unit square better, choose (x,w) points that cover it. You can plot the points to make sure the square is sufficiently covered.
I would choose the points from a uniform grid and vary the number of points until you get a decent answer.
Then I would choose that same number of points randomly from the square.
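A minimal sketch of that grid idea (my own code; ngrid = 20 is an arbitrary starting point):

```matlab
% Cover the unit square with a uniform grid of (x,w) training points
ngrid = 20;                              % points per axis; vary until results are decent
[X,W] = meshgrid(linspace(0,1,ngrid));
xt = X(:)';  wt = W(:)';                 % flatten to row vectors for tonndata
plot(xt,wt,'.')                          % visual check that the square is covered
% Then try the same number of random draws:
% xt = rand(1,ngrid^2);  wt = rand(1,ngrid^2);
```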
Hope this helps.
Greg
Relevant comments, thank you.
I have another question related to that. You once told me that when we train a NARX NN with an input vector X, the developed model is reliable only for unseen inputs that have approximately the same mean and variance as X.
For a given unseen input I, using the expression Y = std(X)/std(I)*(I - mean(I)) + mean(X), the data Y have the same mean and variance as X (used for training); nevertheless the network cannot predict the output correctly. I created a NARX model and trained it using sin(2*pi/20*t) (t is time), but when I test it using sin(2*pi/20*rand(1,1000)), preprocessed to have the same mean and variance as sin(2*pi/20*t), the performance is bad. I did notice that the predicted output fluctuations match the actual output; they just seem to be translated vertically, without a time delay.
Could you please tell me what the problem is here?
For time series the significant delays have to be approximately the same in addition to the mean and variance.
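The rescaling discussed above can be sketched as follows (my own code; X is the training input, I the unseen input):

```matlab
% Rescale unseen input I to match training input X in mean and std
Y = std(X)/std(I) * (I - mean(I)) + mean(X);
% mean(Y) ~ mean(X) and std(Y) ~ std(X), but as noted above this is not
% sufficient for time series: the significant delays (correlation
% structure over time) must also approximately match.
```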


Asked: 28 May 2013
Commented: 20 Feb 2014
