Neural network backpropagation problem

I'm using 2 inputs and a single output. Then I apply the same network structure to 3 inputs and two outputs. However, the outputs I get are not close to the targets. What's wrong with this network? Or do I need to change to another type of structure?
clear all; clc;
% load data (commented 2-input XOR case kept for reference)
% p = [0 0 1 1; 0 1 0 1];
% t = [0 1 1 0];
p = [0 0 0 0 1 1 1 1; 0 0 1 1 0 0 1 1; 0 1 0 1 0 1 0 1];
t = [0 1 0 0 0 1 1 1; 0 1 0 0 1 1 0 0];
net = newff(p,t,[15,15],{'logsig','logsig'},'traingd');
net.trainParam.perf = 'mse';
net.trainParam.epochs = 100;
net.trainParam.goal = 0;
net.trainParam.lr = 0.9;
net.trainParam.mc = 0.95;
net.trainParam.min_grad = 0;
[net,tr] = train(net,p,t);
y = sim(net,p)'

Accepted Answer

Greg Heath on 16 Jun 2013
% Ntrn/Nval/Ntest = 7/0/1
close all, clear all, clc
tic
ptrn = [ 0 0 0 0 1 1 1 ; 0 0 1 1 0 0 1 ; 0 1 0 1 0 1 0 ]
ttrn = [ 0 1 0 0 0 1 1 ; 0 1 0 0 1 1 0 ]
ptst = [ 1; 1; 1 ]
ttst = [ 1; 0 ]
[I Ntrn] = size(ptrn)              % [ 3 7 ]
[O Ntrn] = size(ttrn)              % [ 2 7 ]
Ntrneq = prod(size(ttrn))          % 14
MSEtrn00 = mean(var(ttrn',1))      % 0.2449
[I Ntst] = size(ptst)              % [ 3 1 ]
% Nw = (I+1)*H+(H+1)*O = O+(I+O+1)*H < Ntrneq
Hub = -1 + ceil( (Ntrneq-O) / (I+O+1) )   % 1
Nwub = O + (I+O+1)*Hub             % 8 < 14
Hmax = 3
dH = 1
Hmin = 0
Ntrials = 20
MSEgoal = 0.01*MSEtrn00            % 2.4e-3 => R2trn >= 0.99
MinGrad = MSEgoal/10               % 2.4e-4
rng(0)
j = 0
for h = Hmin:dH:Hmax
    j = j+1
    if h == 0
        net = newff(ptrn,ttrn,[]);
        Nw = (I+1)*O
    else
        net = newff(ptrn,ttrn,h);
        Nw = (I+1)*h + (h+1)*O
    end
    Ndof = Ntrneq - Nw
    net.divideFcn = 'dividetrain';
    net.trainParam.goal = MSEgoal;
    net.trainParam.min_grad = MinGrad;
    for i = 1:Ntrials
        h = h
        ntrial = i
        net = configure(net,ptrn,ttrn);
        [ net tr Ytrn ] = train(net,ptrn,ttrn);
        ytrn = round(Ytrn)
        MSEtrn = mse(ttrn-ytrn)
        R2trn(i,j) = 1 - MSEtrn/MSEtrn00;
        Ytst = net(ptst)
        ytst1(i,j) = round(Ytst(1));
        ytst2(i,j) = round(Ytst(2));
    end
end
H = Hmin:dH:Hmax
R2trn = R2trn
ytst1 = ytst1
ytst2 = ytst2
toc % 26 sec
% Training Summary:
1. R2trn > 0.71 only if the net is overfit (H = 2, 3)
2. When R2trn > 0.71, R^2 = 1 (MEMORIZATION)
3. R2trn = 1 50% of the time when H = 2 and 90% of the time when H = 3
4. When H = 0 (linear), max(R2trn) = 0.71 25% of the time
5. When H = 1, max(R2trn) = 0.42 60% of the time
% Generalization Summary:
1. ytst(1) vs ttst(1) = 1: for H = 0:3, the corresponding numbers of errors (out of Ntrials = 20) are [ 0 5 10 13 ]
2. ytst(2) vs ttst(2) = 0: for H = 0:3, the corresponding numbers of errors (out of Ntrials = 20) are [ 11 17 14 12 ]
  3 Comments
Greg Heath on 10 Jul 2013
You don't seem to understand the following basic assumptions needed to expect a net to generalize to the complete data set:
1. The training set must adequately characterize the complete data set.
2. If overfitting precautions (Ntrneq >> Nw, or Nval >> 1, or trainFcn = 'trainbr') are not taken, the net can memorize the training set but perform poorly on nontraining data.
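For illustration, a minimal sketch of the "Nval >> 1" precaution via validation stopping; p and t are assumed to be your full input and target matrices:
% Sketch: overtraining mitigation via validation stopping (Nval >> 1).
net = newff(p, t, 2);                % one small hidden layer
net.divideFcn = 'dividerand';        % random train/val/test split
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;   % rising validation error stops training
net.divideParam.testRatio  = 0.15;
[net, tr] = train(net, p, t);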
Go to the comp.ai.neural-nets FAQ and search on
Generalization
Overfitting
Hope this helps.
If not, please respond with more questions.
Greg
azie on 19 Jul 2013
You mean that my data set is either incomplete or the network is overfitting, and therefore I don't get good results on non-training data, is that it? But I have taken all the steps to prevent overfitting; I just don't know whether the experimental data is enough to cover everything or not.


More Answers (3)

Greg Heath on 13 Jun 2013
Why don't you just use the code in help newff?
Note that you have a 3-15-15-2 node topology with
Nw = (3+1)*15+(15+1)*15+(15+1)*2 = 332 Unknown weights
Ntrn = 8 - 2*round(0.15*8) = 6 training patterns
Ntrneq = Ntrn*2 = 12 training equations
If 12 equations for 332 unknowns makes you uneasy, remove one of the hidden layers and remove some of the hidden nodes from the remaining hidden layer.
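A sketch of that arithmetic in MATLAB, using the topology numbers above:
% Weight count for the 3-15-15-2 topology: every node gets one weight
% per input plus a bias, i.e. (fan_in + 1) weights per node.
I = 3; H1 = 15; H2 = 15; O = 2; N = 8;
Nw = (I+1)*H1 + (H1+1)*H2 + (H2+1)*O   % 332 unknown weights
Ntrn = N - 2*round(0.15*N)             % 6 (default 0.7/0.15/0.15 split)
Ntrneq = Ntrn*O                        % 12 training equations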
Hope this helps.
Thank you for formally accepting my answer
Greg
  6 Comments
Greg Heath on 13 Jun 2013
P.S. If you have patternnet, then newfit, newpr and newff are obsolete.
They should be replaced by fitnet, patternnet and feedforwardnet, respectively.
Use fitnet for regression and curve-fitting.
Use patternnet for classification and pattern recognition.
There is no reason to call feedforwardnet directly. It is called automatically by fitnet and patternnet.
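For example (H here is a hypothetical hidden-layer size, not a value from this thread):
H = 10;
net = fitnet(H);        % regression / curve-fitting
net = patternnet(H);    % classification / pattern recognition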
azie on 14 Jun 2013 (edited 14 Jun 2013)
Dear Greg,
%%modified code
p = [0 0 0 0 1 1 1 ; 0 0 1 1 0 0 1 ; 0 1 0 1 0 1 0 ];
t = [0 1 0 0 0 1 1 ; 0 1 0 0 1 1 0 ];
[I N] = size(p);
[O N] = size(t);
net = newff(p,t,[3,3],{'logsig','logsig'},'trainlm');
net.divideFcn = '';
net.trainParam.perf = 'mse';
net.trainParam.epochs = 500;
net.trainParam.goal = 0;
net.trainParam.lr = 0.9;
net.trainParam.mc = 0.95;
net.trainParam.min_grad = 0;
net = init(net);
[net,tr] = train(net,p,t);
y = sim(net,p)'
j = [ 1; 1; 1 ];   % new input; suppose result = [ 1; 0 ]
y = sim(net,j)'
a) I tried changing from two hidden layers to one, but I get a strange result for the output: all the lowest outputs become no less than 0.5. With 2 hidden layers the result matches the target exactly, which I thought was correct, so that is why I stayed with 2 layers.
b) Yes, I reduced the number of hidden neurons as you told me to. I was surprised that even 2 neurons in both layers can reproduce the target exactly. Is that acceptable or not?
c) As I said before, I am trying to predict the output for the new input j. Even though the training session went well, with almost zero error, the network is still poor at predicting new inputs. What should I do?



Greg Heath on 15 Jun 2013
1. You mean a net with 2 HIDDEN layers. The unmodified term "layers" means hidden AND output layers.
In the last 30 years of designing NNs, I have never encountered a net that needed 2 hidden layers. Nets with 1 hidden layer can be universal approximators if they have enough hidden nodes. Universal approximators tend to interpolate well at the expense of extrapolating badly, especially if they have too many hidden nodes.
2. If you look at the code in help newff and doc newff, you will see that you don't need to specify a long list of net properties. Always try the defaults first. They are usually sufficient.
3. Since the default and alternative input normalizations (mapminmax and mapstd) tend to center the data, 'tansig', NOT 'logsig' is the best choice for a MLP hidden layer transfer function.
4. Overfitting/Overtraining/Generalization
Ntrneq = prod(size(t)) =7*2 = 14 % Training Equations
Nw = (3+1)*3+(3+1)*3+(3+1)*2 = 32 % Unknown weights
Nw > Ntrneq % OVERFITTING
None of the following conditions are satisfied
Ntrneq >> Nw % Overfitting mitigation
Nval >> 1 % Overtraining mitigation via validation stopping
net.trainFcn = 'trainbr' % Overtraining mitigation via regularization
Consequently you have an over-trained over-fit net that is not expected to generalize well.
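As a minimal sketch of these remedies combined (a small 'tansig' hidden layer, defaults otherwise, and 'trainbr' regularization; p and t are assumed to be the 3-input/2-output arrays from your modified code):
% Sketch: small net plus Bayesian regularization to mitigate overtraining.
net = newff(p, t, 2, {'tansig','logsig'}, 'trainbr');
net.divideFcn = 'dividetrain';   % trainbr does not use a validation set
[net, tr] = train(net, p, t);
y = net(p)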
I have no idea what deterministic transformation the data is supposed to represent. Therefore, it is difficult to evaluate a single net with non-design data to see if it is any good (i.e., can generalize ).
The original data represented the 8 corners of a 3-D cube. If the target for all 8 corners is known, the generalization capability could be tested via leave-one-out cross-validation, where eight nets are designed with 7 corners and each is tested on the eighth corner.
However, if you visualize a 3-D cube, notice that any corner can be considered to be an OUTLIER with respect to the other 7. Therefore, it would not be surprising if a net designed with 7 corners could not extrapolate well to the eighth corner.
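A sketch of that test, assuming p (3x8) and t (2x8) hold all 8 corners and their targets:
% Leave-one-out cross-validation over the 8 cube corners:
% design 8 nets on 7 corners each, test on the held-out corner.
ytst = zeros(2,8);
for k = 1:8
    trn = true(1,8);  trn(k) = false;      % leave out corner k
    net = newff(p(:,trn), t(:,trn), 1);    % H = 1 hidden node
    net.divideFcn = 'dividetrain';         % train on all 7 corners
    net = train(net, p(:,trn), t(:,trn));
    ytst(:,k) = net(p(:,k));               % predict the held-out corner
end
errors = round(ytst) - t    % nonzero entries => failed extrapolation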
An interesting demonstration would be to vary the number of hidden nodes from H = 0 to a value BEYOND the upper bound value H=Hub, where the number of unknown weights is greater than the number of training equations.
To mitigate the existence of bad random weight configurations, design Ntrials = 10 nets for each value of H from 0 to Hmax (numH = Hmax+1). Since N=8 and H are small, the N*numH*Ntrials = 80* numH designs can probably be designed in less than 5 or 10 minutes.
[ I Ntrn ] = size(ptrn)      % [ 3 7 ]
[ O Ntrn ] = size(ttrn)      % [ 2 7 ]
Ntrneq = prod(size(ttrn))    % 14
[ I Ntst ] = size(ptst)      % [ 3 1 ]
[ O Ntst ] = size(ttst)      % [ 2 1 ]
% Nw = (I+1)*H+(H+1)*O = O+(I+O+1)*H
Hub = -1 + ceil( (Ntrneq-O) / (I+O+1) )   % 1
Hmin = 0, dH = 1, Hmax = 3   % Choose numH = 4, Ndesigns = 320
More Later

azie on 10 Jul 2013
Still searching for an answer. I accepted your code and ran it. However, it seems that:
1. Training usually runs for no more than 30 epochs. Is that okay? The performance goal is met and sometimes Mu is exceeded. Will this produce good results later on?
2. The predicted values are far from the targets, with large errors.
So, any suggestions?
  1 Comment
Greg Heath on 22 Jul 2013
Your problem is not suitable as a regression or a classification problem where a model designed with a subset of the data can generalize to the rest of the data.
All you have to do is visualize the cube in 3 dimensions. None of the points are characterized by the other 7 points.

