Limit divergence of polynomial fitting

7 views (last 30 days)
NF
NF on 15 Apr 2018
Commented: NF on 15 Apr 2018
I am approximating a set of data with a polynomial to later find the cumulative density function. I am using a polynomial since I do not know the distribution of the set of data, that may even vary, and because it should be fully automatized. The problem is thatat the end of my set of data, once the polynomial reaches the last data, divergest to infinity. Since it should approximate a probability density function it shouldn't happen, because like so it influence all the results. How can I fix that?
p=polyfit(X,N,21);
x=linspace(0,4000,1000);
y=polyval(p,x);

Accepted Answer

John D'Errico
John D'Errico on 15 Apr 2018
Edited: John D'Errico on 15 Apr 2018
The simple answer for this is to ask a question. WHY IN THE NAME OF GOD AND LITTLE GREEN APPLES ARE YOU USING A POLYNOMIAL????? Ok, maybe I'm being too hard on you. Polyfit is there for you to use. Nothing in the documentation tells you it does silly things, or that this is the fundamental nature of polynomials. Somewhere teachers should teach you, right after they tell you that you CAN use a polynomial, that you probably, usually, should not do so. And then explain why not.
Seriously, DON'T USE A POLYNOMIAL. And NEVER use a high order polynomial. They go to hell faster than a low order one. A polynomial order of 21? WAY TOO HIGH. But then you should not use a polynomial at all here.
Polynomials do exactly what you saw. They go to infinity. Think about it. what other behavior can you possibly expect? Can a polynomial approximate a function that does non-trivial (non-constant) stuff within some region, then goes constant as x goes to plus or minus infinity? Think about it. A polynomial can never be a model for a process that behaves like a CDF. You have this really high order polynomial, with powers as high as 21. X is as large as 4000?
What is 4000^21? How about other numbers near there?
4000^21
ans =
4.398e+75
3950^21
ans =
3.3771e+75
3500^21
ans =
2.6634e+74
The highest power of X with a non-zero coefficient will dominate the behavior of a polynomial as X goes to +/-inf. So your high order polynomial produces complete and utter garbage out there. You know this is the behavior that you see. And simple mathematics tells you that you cannot expect any other behavior from a polyomial model. That is exactly the behavior you expect from a polynomial, the innate, fundamental behavior that exists in any polynomial, and something you cannot change.
The obvious answer to my original question is you have no idea what else you might use. Sorry, but that is clearly true.
And, the answer to that is to use a spline. There are several choices you might use there. The simplest is pchip, which will interpolate your data. (Of course, you could have just tried interp1, which can use pchip, or various options as you would direct it to use.)
pp = pchip(X,N);
x = linspace(0,4000,1000);
y = ppval(pp,x);
DON'T use the function spline, as it will show oscillatory behavior in the tails. If you use interp1 instead, then the 'linear', 'pchip', or 'cubic' methods should all be ok. While 'linear' will just do piecewise linear interpolation, that is probably entirely adequate for this purpose.
IF you feel that you needed to smooth the curve instead of interpolate it, then download my SLM toolbox from the file exchange. It will require the optimization toolbox. But pchip should be entirely adequate here.
  7 Comments
John D'Errico
John D'Errico on 15 Apr 2018
Edited: John D'Errico on 15 Apr 2018
So the pchip spline is the one that gets upset at the reps? It would do so. And chckxy would be part of that code.
There are two simple solutions. One is to use a tool like my consolidator to remove the replicates.
x = rand(10,1);
x = [x;x];
y = exp(x);
spl = pchip(x,y)
Error using chckxy (line 51)
The data sites should be distinct.
Error in pchip (line 58)
[x,y,sizey] = chckxy(x,y);
Which of course fails.
[xu,yu] = consolidator(x,y,@mean);
spl = pchip(xu,yu);
fnplt(spl)
You could have done something similar using uniquetol in combination with accumarray.
Consolidator is available for download from the file exchange.
NF
NF on 15 Apr 2018
Now it works perfectly. Thank you so much for all the help!!

Sign in to comment.

More Answers (0)

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!