The best ANN configuration

Question

Rita on 9 Feb 2016

https://uk.mathworks.com/matlabcentral/answers/267380-the-best-ann-configuration

Commented: Greg Heath on 14 Feb 2016

I have trained an ANN for prediction about 50 times, with the number of hidden nodes ranging from 2 to 17. Which criterion should I rely on to select the best ANN: the R-squared on the test set, the MSE of the network, or the validation performance?

Answer 1

Greg Heath on 11 Feb 2016

My favorite technique:

 1. Accept all default parameters except for the number of hidden nodes, H. 
 2. Minimize H subject to the constraint that the degree-of-freedom adjusted 
    mean-square-error of the training data is less than 1% of the average training target variance. 
 3. Design and test Ntrials >= 10 nets for each value of H in a range less than the upper
    bound Hub (determined by not having more unknown weights Nw than training equations Ntrneq).
    The untrained nets only differ by the random trn/val/tst data division AND the random initial weights. 
 4. Rank the nets via their slightly biased performance on validation data. 
 5. Obtain unbiased performance estimates on the nets using test data. 
 6. Statistically significant differences in performance can be estimated using the standard 
    deviation of the performance estimates.
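
The steps above can be sketched roughly as follows. This is only an illustration, not code taken from my posts: it assumes the Neural Network (Deep Learning) Toolbox is installed, uses the toolbox demo set `simplefit_dataset` as stand-in data, and the search range `Hvec`, the seed and the variable names are all hypothetical choices.

```matlab
% Sketch only; needs the Neural Network (Deep Learning) Toolbox.
[x, t]  = simplefit_dataset;     % stand-in data: x is I-by-N, t is O-by-N
[I, N]  = size(x);
O       = size(t, 1);
Ntrn    = N - 2*round(0.15*N);   % training share under the default 0.70/0.15/0.15 split
Ntrneq  = Ntrn*O;                % number of training equations
Hub     = floor((Ntrneq - O)/(I + O + 1));   % largest H with Nw <= Ntrneq

Hvec    = 1:min(8, Hub - 1);     % hypothetical search range, kept below Hub
Ntrials = 10;
goal    = 0.01;                  % adjusted training NMSE target (1% of target variance)
[NMSEtrna, NMSEval, NMSEtst] = deal(zeros(Ntrials, numel(Hvec)));

rng(0)                           % reproducible data divisions and initial weights
for j = 1:numel(Hvec)
    H  = Hvec(j);
    Nw = (I + 1)*H + (H + 1)*O;  % unknown weights of an I-H-O net
    for i = 1:Ntrials
        net = fitnet(H);                     % everything except H left at defaults
        net.trainParam.showWindow = false;   % suppress the training GUI
        [net, tr] = train(net, x, t);        % new random division and weights each trial
        e    = t - net(x);
        etrn = e(:, tr.trainInd);  ttrn = t(:, tr.trainInd);
        evl  = e(:, tr.valInd);    tvl  = t(:, tr.valInd);
        etst = e(:, tr.testInd);   ttst = t(:, tr.testInd);
        % degree-of-freedom adjusted training NMSE: SSE/(Ntrneq - Nw) over target variance
        NMSEtrna(i, j) = (sum(etrn(:).^2)/(numel(etrn) - Nw)) / mean(var(ttrn, 0, 2));
        NMSEval(i, j)  = mean(evl(:).^2)  / mean(var(tvl, 1, 2));
        NMSEtst(i, j)  = mean(etst(:).^2) / mean(var(ttst, 1, 2));
    end
end

jmin = find(any(NMSEtrna <= goal, 1), 1);    % smallest H with at least one success
if isempty(jmin)
    warning('No H in Hvec met the goal; reporting the largest H tried.');
    jmin = numel(Hvec);
end
[~, order] = sort(NMSEval(:, jmin));         % rank that H's nets on validation data
fprintf('selected H = %d\n', Hvec(jmin));
fprintf('test NMSE of the validation-ranked nets:\n');  disp(NMSEtst(order, jmin)')
fprintf('test NMSE mean = %.4g, std = %.4g\n', mean(NMSEtst(:, jmin)), std(NMSEtst(:, jmin)));
```

The mean and standard deviation printed at the end are what step 6 refers to: differences between candidate values of H that are small compared with the trial-to-trial spread are probably not significant.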

I have posted zillions of examples in both the NEWSGROUP and ANSWERS using the same notation. Therefore searching with

greg fitnet Ntrials

should dig up enough references to clarify what I have written. If not, just post a comment.

Hope this helps.

Thank you for formally accepting my answer

Greg

1 Comment

Greg Heath on 14 Feb 2016

% 1. By saying "divide and conquer", do you mean that I don't need to train on all 1525 data points at the same time and should create subsets of the data?

No. You have determined Hub = 106. It is ridiculous to search using H = 1:106, Ntrials = 1000. It is better to design no more than ~100 nets at a time. For example, start with h = 6:10:106, Ntrials = 10 and print NMSE in a 10 x 11 matrix to see how few hidden nodes are needed to obtain NMSE <= 0.01. Say it is 46. Then search h = 37:45, Ntrials = 10.

Just think how much more sleep you can get by designing 200 nets instead of 106,000!
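
As a rough check of these numbers, the sketch below recomputes Hub and the size of each search pass. It assumes the default 0.70/0.15/0.15 data division and the sizes quoted in this thread (N = 1525, 8 inputs, 1 output); the variable names are only illustrative.

```matlab
N = 1525;  I = 8;  O = 1;                    % sizes quoted in this thread
Ntrn   = N - 2*round(0.15*N);                % 1067 training cases under the default split
Ntrneq = Ntrn*O;                             % number of training equations
Hub    = floor((Ntrneq - O)/(I + O + 1));    % 106: largest H with Nw <= Ntrneq

Ntrials = 10;
hCoarse = 6:10:Hub;                          % first pass over the whole range
hFine   = 37:45;                             % second pass, if the first pass points at H = 46
fprintf('Hub = %d\n', Hub);
fprintf('coarse pass: %d values of h, %d nets\n', numel(hCoarse), numel(hCoarse)*Ntrials);
fprintf('fine pass:   %d values of h, %d nets\n', numel(hFine),   numel(hFine)*Ntrials);
fprintf('brute force: %d nets\n', Hub*1000); % H = 1:106 with Ntrials = 1000
```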

% My goal is to investigate the ability of an ANN to predict missing values of gas emission. I used 6 years of daily variables (6 years = 2191 data points, but 666 of them were not measured in the field, so I had 1525 data points and 666 real gaps)

How long were the gaps?

% and also 8 variables as the input layer and one variable as the output layer (the gas emission). One important thing about my data is the inherent variability of the gas emission (each year has some events in which the emission is very high), which makes it difficult to gap-fill by the usual methods.

% To compare the performance of ANN to other methods such as the linear method,

? The ANN with H = 0 is a linear model.
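
A minimal sketch of such a linear (H = 0) baseline, fitted by ordinary least squares with the backslash/slash operator rather than with the toolbox. The data here are synthetic and made up purely for illustration; only the 8-input, 1-output shape follows the thread.

```matlab
rng(0)
I = 8;  N = 200;                                      % made-up sizes for illustration
x = randn(I, N);                                      % synthetic inputs, I-by-N
t = [1 -2 0.5 0 0 3 0 1]*x + 0.7 + 0.1*randn(1, N);   % synthetic, nearly linear target

A = [x; ones(1, N)];                     % inputs augmented with a row of ones for the bias
W = t / A;                               % least-squares weights and bias, 1-by-(I+1)
y = W*A;                                 % linear model output
NMSElin = mean((t - y).^2) / var(t, 1);  % fraction of target variance left unexplained
fprintf('linear-model NMSE = %.4f\n', NMSElin);
```

The NMSE of this baseline is the number any net with H > 0 has to beat to justify its extra weights.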

% I created some artificial gap scenarios (with different gap lengths)

Please be less vague in explaining "an artificial gap scenario". How many days: min, median, mean, std and max?

For how long do you know the 8 inputs but not the output? Or do you not know the inputs either?

% into the gas emission data and ran the ANN for each artificial gap scenario. For each scenario, I ran different networks and selected the best one based on lower RMSE and higher coefficient of determination (R2) between the measured data (the artificial gaps) and the values calculated by the ANN. The number of hidden neurons and Ntrial were therefore optimized to get the net with the highest R2 and lowest RMSE. I tried dividrand and dividind

Spelling: you forgot the e (dividerand, divideind).

% to divide the data into training/validation/test sets. For dividind I used 4 years for training, one year for validation and one year for testing; for dividrand I used 70/15/15. With the artificial gap scenarios, dividind performed well (low RMSE and high R-squared) compared with dividrand. For filling the real gaps (666 data points), I did not have measured values with which to calculate RMSE and other statistical metrics, so I compared NMSE and validation performance instead. There, dividrand performed better than dividind (lower NMSE and lower validation performance).

Not clear.
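
For reference, this is roughly how the two division modes are selected on a net. It is a sketch that assumes the Neural Network (Deep Learning) Toolbox; the value H = 10 and the index boundaries are hypothetical placeholders (about 4/6, 1/6 and 1/6 of 1525 points), not the actual year boundaries of the data.

```matlab
% Sketch only; needs the Neural Network (Deep Learning) Toolbox.
netRand = fitnet(10);                       % H = 10 is an arbitrary placeholder
netRand.divideFcn = 'dividerand';           % random division (the fitnet default)
netRand.divideParam.trainRatio = 0.70;
netRand.divideParam.valRatio   = 0.15;
netRand.divideParam.testRatio  = 0.15;

netInd = fitnet(10);
netInd.divideFcn = 'divideind';             % division by explicit index lists
netInd.divideParam.trainInd = 1:1017;       % placeholder: roughly the first 4 years
netInd.divideParam.valInd   = 1018:1271;    % placeholder: roughly the 5th year
netInd.divideParam.testInd  = 1272:1525;    % placeholder: roughly the 6th year
```

With 'divideind' the division is identical in every trial, so repeated designs differ only in their random initial weights.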

% 2- For the real gap values I am trying to optimize H by trial and error, and I took your advice. Here are the results of running the ANN for H = 10:10:100 and Ntrial = 10:

% H    NMSE   R test   Rsquared   valperformance   Performance
% 70   0.31   0.65     0.69       106.32           79.90
% 20   0.33   0.61     0.67       126.13           86.31

I don't understand the last two columns. I would just use a 10x10 matrix of NMSE.
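
One reason a single NMSE matrix is enough: the Rsquared column carries the same information as NMSE. A small sketch using the two rows quoted above (the variable names are only illustrative):

```matlab
NMSE = [0.31 0.33];        % values quoted above for H = 70 and H = 20
Rsq  = 1 - NMSE;           % 0.69 and 0.67, matching the quoted Rsquared column
fprintf('H = %d: NMSE = %.2f, Rsquared = %.2f\n', [70 20; NMSE; Rsq]);

% For targets t and outputs y (both O-by-N), the definition assumed here is
%   NMSE = mean((t(:) - y(:)).^2) / mean(var(t, 1, 2));
```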

% What is the next step to get the optimal net based on H?

See my above comments regarding finding the smallest satisfactory H. Then you could train a net with all of the data.

% 2- Should I keep using "dividind" or switch to "dividrand"? "dividind" works well for the artificial gap scenarios. "dividind" shows better performance for real gaps.

I don't know. It is not clear to me what you did.

Greg

Answer 2

Walter Roberson on 9 Feb 2016

What is the best way of getting around "inner London" (United Kingdom)? Is it taxi, personal automobile, Tube (subway) -- or even bicycle (ha ha ha)?

Did you decide yet? So did you measure "best" by convenience, cost, health benefits, or speed? Or did you construct a carefully weighted measure of all of those, such as being willing to trade 1 minute longer travel time for each 2000 Calories of fat burned? Or should it be 1.5 minutes and 1000 Calories? What scientific study did you use to decide the trade-offs?

Oh, by the way: multiple tests have shown that during a typical work-day afternoon, the lowest cost way of getting around inner London is by bicycle, but the healthiest way of getting around inner London is instead by bicycle; and on the third hand, the fastest way of getting around inner London is... by bicycle.

So when you are selecting an ANN, what do you mean by "best" ?

3 Comments

Walter Roberson on 9 Feb 2016

Do you need the lowest false-positive rate? The lowest false-negative rate? Are you predicting values or predicting a class?

Rita on 9 Feb 2016

Edited: Walter Roberson on 10 Feb 2016

predicting values.

Answer 3

Greg Heath on 11 Feb 2016

Insufficient information and explanation:

size(input) ? size(target) ?

If I guess both are [ 1 N ] , Ntrn ~ 0.7*N and Hub = 114, then

 Hub  = 114             % guessed, so that (Ntrn-1)/(1+1+1) = Hub
 Ntrn = 3*Hub + 1       % 343
 N    = Ntrn/0.7        % 490
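
The bound comes from counting unknown weights against training equations. A small sketch of that count for the guessed sizes (one input, one output); the formula for Nw assumes a single hidden layer with biases on both layers:

```matlab
I = 1;  O = 1;  H = 114;             % sizes guessed above
Nw     = (I + 1)*H + (H + 1)*O;      % unknown weights of an I-H-O net: 343
Ntrn   = 3*114 + 1;                  % 343 training cases
Ntrneq = Ntrn*O;                     % 343 training equations
fprintf('Nw = %d, Ntrneq = %d\n', Nw, Ntrneq);   % equal, so H = 114 is the upper bound
```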

Assuming

NMSEgoal <= 0.01 with Hub = 114 % Probably don't need 0.005

why in the world are you even considering

 NMSE = 0.25 @ H = 16 
 and 
 NMSE = 0.51 @  H= 17

instead of increasing H???

Puzzled,

Greg

1 Comment

Greg Heath on 13 Feb 2016

Edited: Greg Heath on 13 Feb 2016

Be serious:

1. Have you ever seen me use anything more than numH= numel(Hmin:dH:Hmax)~10 and Ntrials > 15 on the zillions of examples that I have posted in the NEWSGROUP and ANSWERS?

2. Have you ever heard of the saying "DIVIDE AND CONQUER"?

YES? GOOD! Then START with something like

H = 10:10:100, Ntrials =10

to find the minimum H that will yield your goal.
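
A rough sketch of how the coarse pass can be turned into the next, narrower search. The NMSE matrix below is filled with made-up placeholder values, NOT results from this thread; in practice it would hold the NMSE of the nets you actually trained, with one row per trial and one column per value of H.

```matlab
rng(0)
Hvec = 10:10:100;  Ntrials = 10;  goal = 0.01;

% Placeholder NMSE values (made up for illustration only).
NMSE = repmat(0.5*exp(-Hvec/12), Ntrials, 1) .* (1 + 0.5*rand(Ntrials, numel(Hvec)));

best = min(NMSE, [], 1);                 % best trial for each H
j    = find(best <= goal, 1);            % first (smallest) H that reaches the goal
if isempty(j)
    disp('Goal not reached; extend the range of H.')
else
    Hcoarse = Hvec(j);
    dH      = Hvec(2) - Hvec(1);
    Hnext   = max(1, Hcoarse - dH + 1) : Hcoarse - 1;   % untested values just below it
    fprintf('coarse minimum H = %d, next search h = %d:%d\n', Hcoarse, Hnext(1), Hnext(end));
end
```

The second pass then repeats the same design loop over the narrower range with the same Ntrials.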

Hope this helps.

Greg
