Standardisation (zero-mean, unit-variance)

Hi there, I am working with the Neural Network Toolbox in MATLAB. My problem is not with the toolbox but with the data preparation before it is used, and with my resulting output. I standardise the input and target output on which I train the network, but this means that my estimated output, when I simulate the network on a new set of values, also has zero mean and unit variance. I do not want this: while the estimated output correctly follows the trend and amplitude changes of my target output, I want the physical values, not these standardised values. Is there a way to 'de-standardise' my output, i.e. add on the mean and multiply by the standard deviation? The mean of my time series is not constant, so I am sure this is not as straightforward as I have said.
Thank you for any help
Sarah

Accepted Answer

Greg Heath on 30 Jun 2012
> I standardise the input and target output on which I train the network but this means that my estimated output, when I simulate the network on a new set of values has also a zero-mean, unit-variance.
In the best of all worlds!
Then you can use the mean and variance of the original output data to convert the new output to the correct location and scale.
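A minimal sketch of that conversion in base MATLAB. The variable names and all numeric values are hypothetical placeholders, not taken from the question, and `yNewStd` merely stands in for whatever the trained network returns on new inputs.

```matlab
% Hypothetical training targets in physical units (1 x N row vector).
tTrain = [12.1 14.3 9.8 11.5 13.0 10.2];

% Statistics of the ORIGINAL training targets; store these for later.
mu    = mean(tTrain);
sigma = std(tTrain);

% Standardised targets used for training (zero mean, unit variance).
tTrainStd = (tTrain - mu) / sigma;

% Stand-in for the standardised network output on new data (made-up values).
yNewStd = [-0.5 0.0 1.2];

% De-standardise: multiply by the training std first, then add the training mean.
yNew = yNewStd * sigma + mu;
```

Note the order of operations: the scale is restored before the location, which is the exact inverse of the forward step `(t - mu) / sigma`.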
A basic assumption of NN regression and classification models is that both design (train + val) and nondesign (test) data can be assumed to come from the same probability distribution.
If you expect that assumption might be invalid for the new input data, you can always compare the summary statistics (e.g., mean, variance, correlations, ...) of the new input data with those of the original input (design + test) data.
You can also compare the outputs when the new input data is normalized with its own mean and variance vs using the mean and variance of the original data.
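One way such a comparison might look in base MATLAB is sketched below. The data, the variable names and the two diagnostics (`meanShift`, `spreadRatio`) are illustrative assumptions, not something prescribed in this thread; rows are variables and columns are samples.

```matlab
% Hypothetical data: each row is one input variable, each column one sample.
xOrig = [1 2 3 4 5 6; 10 12 11 13 12 14];
xNew  = [4 5 6 7; 13 15 14 16];
nNew  = size(xNew, 2);

% Row-wise summary statistics of each data set.
muOrig = mean(xOrig, 2);   sdOrig = std(xOrig, 0, 2);
muNew  = mean(xNew, 2);    sdNew  = std(xNew, 0, 2);

% Shift of the new mean, measured in units of the original std.
meanShift = (muNew - muOrig) ./ sdOrig;
% Ratio of spreads; values far from 1 hint at a different distribution.
spreadRatio = sdNew ./ sdOrig;

% Candidate 1: normalise the new inputs with the ORIGINAL statistics.
xNewStdOrig = (xNew - repmat(muOrig, 1, nNew)) ./ repmat(sdOrig, 1, nNew);
% Candidate 2: normalise the new inputs with their OWN statistics.
xNewStdOwn  = (xNew - repmat(muNew, 1, nNew)) ./ repmat(sdNew, 1, nNew);
```

Each candidate would then be passed through the trained network and the two sets of outputs compared. A large `meanShift` or a `spreadRatio` far from 1 suggests the same-distribution assumption is strained for that variable.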
Are you using newfit(~newff) or fitnet(~feedforwardnet)?
Are your original standari"z"ations done with mapstd before creating the net, or are you using net.inputs{i}.processFcns and net.outputs{i}.processFcns?
> I do not want this because while the estimated output correctly follows the trend and amplitude changes of my target output I want the physical values not these standardised values. Is there a way to 'de-standardise' my output i.e. add on the mean and multiply by the standard deviation. The mean of my time series is not constant so I am sure this is not as straightforward as I have said.
This is readily done using the 'reverse' option of mapstd. I don't remember if it is done automatically or you have to do it explicitly. See the documentation and examples of mapstd (or mapminmax, ... the same principles apply).
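A short sketch of the explicit route, assuming the standardisation was done by calling mapstd by hand before training. It requires the Neural Network Toolbox; the data values and variable names are hypothetical, and `yNewStd` stands in for the output of the trained network.

```matlab
% mapstd treats each ROW as one variable and each column as one sample.
tTrain = [12.1 14.3 9.8 11.5 13.0 10.2];   % hypothetical targets, physical units

% Standardise the targets and keep the settings structure for later.
[tTrainStd, tSettings] = mapstd(tTrain);

% ... train on standardised data, then simulate the net on new inputs ...
yNewStd = [-0.5 0.0 1.2];                  % stand-in for the network output

% Map the standardised output back to physical units.
yNew = mapstd('reverse', yNewStd, tSettings);

% The same settings can re-standardise data consistently with training.
yCheck = mapstd('apply', yNew, tSettings); % recovers yNewStd
```

New inputs would be handled the same way: standardise them with `mapstd('apply', ...)` and the settings saved from the training inputs, not with statistics recomputed from the new data. If the standardisation is instead configured through the network's own processing functions, check the documentation to see whether the reversal is already applied during simulation before doing it again by hand.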
Hope this helps.
Greg
  13 Comments
Sarah on 10 Aug 2012
Thank you for that answer. Re: "A basic assumption is that the important summary statistics of the design data adequately characterize the important summary statistics of the nondesign data. The amount they differ will affect the accuracy, precision and confidence of your result."
For my network I look at both winter and summer data. If I train my network on a full year, so that my training summary statistics are for a whole year (an average of the seasons), is it correct to test the network independently on just winter data or just summer data? Spring and autumn statistics are quite similar to one another. Or would it be more correct to train the network separately for the different seasons?
Thank you for all of your help
Greg Heath on 18 Mar 2014
There is no absolutely correct answer. Since seasonal means, spreads and correlations are never exactly equal, you have to determine how much difference still yields a result that you can live with.
