How can I get a good exponential fit using either the curve fitting toolbox or the lines of code?

I have several sets of data, which I label xdata_p and ydata_p. Unfortunately, it's a large set, so I didn't see much way around just putting the data points in individually. To clean up the program, what I did was put these in their own function, called HeliumData_Voyager, ProtonData_Voyager, and others (since I'm actually repeating this process for several data sets). So in the program below, you'll see it call that program, which is how I generate the data sets.
xdata = [0.051120874,0.07516499,0.11051797,0.14458315,0.5287479,0.6836915,...
0.81462735,0.99358326,1.1977779,1.4271679,1.5853835,1.7406857,1.911201,...
2.09842,2.3039784,2.529673,2.7452202,3.0853872,3.387628,3.676279,4.0364027,...
4.4838777,4.865937,5.3425984,6.075168,6.516251,7.3236966,8.041117,8.828814,...
9.469823,10.519647,11.5501375,12.830584,13.923845,15.287807,16.785381,18.429657]; %This will be the energy
ydata = [953.125,914.0625,877.6042,848.9583,502.60416,507.8125,513.0208,513.0208,...
507.8125,502.60416,489.58334,481.77084,473.95834,463.54166,450.52084,440.10416,...
427.08334,411.45834,390.625,375.0,361.97916,335.9375,315.10416,299.47916,...
273.4375,260.41666,231.77083,216.14583,195.3125,171.875,153.64583,130.20833,...
114.583336,88.541664,70.3125,41.666668,26.041668]; %This will be the flux
I have to call these values into a function which plots them, and then which tries to do a best fit line. I originally tried linearizing the data and re-exponentiating it, and that didn't work at all too well. Here is my attempt there:
clc
clear all
ElectronData_Voyager;%This simply calls the data file which we are to use. The reason for this
%is to simply save space in the event of huge data files.
m =1; %this allows us to choose the power of the polynomial function
E = polyfit(log(xdata),log(ydata),m); %This is the function we are attempting to use in order
%to get the best polynomial fit. It pops it out in order A_n...A_2, A_1, A_0
Z = polyval(E,log(xdata));
A = norm(ydata-Z);%used below in RMS
n = length(ydata);%used below in RMS
rms = A/sqrt(n)%This will allow me to assess the quality of the fit.
loglog(xdata,ydata,'o')
title('Plot of the Original Data')
xlabel('KE')
ylabel('Flux')
grid on
The above seems to do OK at plotting the points, but not much else. I then tried to use the curve fitting toolbox in matlab. What it does, when I choose exponential, is give me an exponential curve, but the curve is completely wrong (i.e. the concavity is flipped).
So then, I tried this code just to see what happens:
clc
clear all
ElectronData_Voyager;%This simply calls the data file which we are to use. The reason for this
%is to simply save space in the event of huge data files.
f = fit(xdata',ydata','exp2')
loglog(xdata',ydata','o')
Once again, it yields me with an exponential that is clearly the wrong function. So my question is how can I get a good fit to the data, and/or where is it that I'm going wrong in my code?
Also, the first image is what the data should look like, the second is what it looks like either using the curvefitting toolbox or the above line of code.
Thanks in advance!

2 Comments

When I plot the data you provide the way you show, I do not get the same result as you do. Your "Plot of the Original Data" shows no Flux value greater than 10^1, but the ydata you provided goes up past 900, almost 10^3 . The shapes of the curves are a bit different.
It is not easy for us to advise you when the data you post does not match the data of the results we are intended to compare against.
My apologies. I am running this for three sets of data, and I accidentally grabbed the wrong one. Let me copy the correct data set for the graph that I sent you guys:
xdata = [0.0021103637,0.0031923044,0.0044332454,0.007037563,0.010644306,... 0.017104669,0.026829293,0.03422094,0.043652304,0.075480916,0.13703811,... 0.19506922,0.25811148,0.32531464,6.516636,7.097216,7.6358223,8.21518,... 9.167466,9.983916,10.873566,12.133824,12.740454,13.707529,15.483948,16.258066]; %This will be the energy ydata = [3.2759476,2.5621476,2.8759978,3.5635407,3.1785717,3.8112128,... 3.877242,3.8156,3.458803,2.9860823,2.3361485,1.7390722,1.3823683,... 1.0632223,0.008671838,0.007121112,0.0060429154,0.005212914,0.0042808973,... 0.0036328065,0.0029345637,0.0024498142,0.0021482513,0.0017932812,... 0.0014486917,0.0012703632]; %This will be the flux
The correct curve for this is attached.

Sign in to comment.

 Accepted Answer

What is the physical process that describes your data?
What are you modeling? Is there a function that describes it?
If you simply want to model an exponential function without transforming to logs (it is always best to model the data without transformation), this works:
f = @(b,x) b(1) .* exp(-b(2).*x);
b0 = [1000; 0.1];
B = nlinfit(xdata, ydata, f, b0);
figure(1)
plot(xdata, ydata, 'bp')
hold on
plot(xdata, f(B,xdata))
hold off
grid
I don’t have the Curve Fitting Toolbox (Statistics and Optimization do all I need), but this function would likely work with it as a ‘custom funciton’.

19 Comments

Let me give that a shot when I get back to my laptop to run the code, but to answer your first question, the data is from Voyager 1, and is the Cosmic Ray Interstellar Spectra. So on the Y-axis, we have the Flux of the particles, and on the X-axis is the kinetic energy. We're trying to measure the Cosmic Ray modulation effect using the cosmic ray interstellar spectra data for several species (electrons, protons, etc) from the low energy sources of Voyager 1, and the high energy sources from PAMELA.
Also, so far there is no function which does it, we are doing a best fit in order to try to generate the function at some point.
That’s probably far beyond my knowledge of physics. I can, however, help with the curve-fitting.
It runs, but it appears that the concavity is still the wrong graph, and it's a bit splotchy (not a smooth curve). See the attached graph for what I get, and I'm also attaching a second graph to show what it should be.
I'm pretty sure an inverse exponential, or an inverse power law is the right fit just from the look of it.
The function I used is an inverse exponential, so if it doesn’t work, I don’t know what you would substitute for it.
A power law would be:
f = @(b,x) b(1) .* x.^b(2);
however when I did a fit with it, it was actually not as good as the exponential.
I am comforted that in all the time since these data were collected no one has developed a model for it, I am at least in good company!
I’ll delete my Answer in another day, since it obviously isn’t contributing anything.
I'll actually submit your answer as the best because you have gotten me further (so your contribution is tremendously helpful!) I'm just not sure where it could be going wrong.
It graphs the data just fine when I simply plot the points, but as soon as I do a curve fit, it gives some weird thing. Do you think that somehow the data, or the data format could be an issue?
I appreciate your Accepting my Answer.
I’m not certain you’re going wrong. To do a parameter estimation of a process, it is necessary to understand and model the process. I was hoping to find out something about it online, but either wasn’t searching correctly or nothing exists, because I found nothing of obvious relevance.
You can get a smoother curve by creating a more detailed vector for ‘xdata’, not for the parameter estimation but to use for the plot. For instance:
f = @(b,x) b(1) .* exp(-b(2).*x);
xdata = [ ... ];
ydata = [ ... ];
xdc = linspace(min(xdata), max(xdata), 150); % Continuous ‘xdata’
b0 = [10; 0.1];
B = nlinfit(xdata, ydata, f, b0);
figure(1)
plot(xdata, ydata, 'bp')
hold on
plot(xdc, f(B,xdc))
hold off
grid
That will give you a better idea of what your curve is doing in the gaps in the data.
I doubt the problem is the data or data format. For whatever reason, a good model of the process you’re observing does not yet exist. (I considered modeling it as part of a black-body radiation curve, but my understanding of your data is not sufficient to do the necessary conversions to energy or frequency. That might be something for you to consider, if you believe it relevant to your data. I was relying on my long-ago undergraduate physics and physical chemistry for inspiration on processes that could explain it.)
Unless you have reason to question the reliability of your data, you have to assume they are correct, and that they in part describe the environment Voyager I was in when it gathered them. Your task then would seem to be to describe the part of that environment your data represent. I don’t know how old your data are (they could date to the Voyager I launch in 1977), but you seem to be the first to have worked with them. I would find it reassuring that no physical explanation of them exist after 38 years. You have the opportunity to make an original contribution.
I discovered something helpful: since the range of data goes from such a low amount to a high amount, I decided to graph it as a loglog instead of plot....that gave me the right concavity. So this clearly worked. I think it was just a display thing since the range of data goes from super small to super large in a very short time.
With that in mind, I have a few questions: 1) for display purposes, if I graph the code with this part: figure(1)
loglog(xdata, ydata, 'bp')
hold on
and omit the rest, it displays like it should (since that's just the data). When I say like it should, I mean like my original graph of the data that I attached earlier. But when I add the code: loglog(xdata, f(B,xdata))
hold off
grid
(even if I do plot instead of loglog on the second graph) it looks different. Is there a way to get it to look the same and just add the curve? or is there some graphical way (such as using loglog instead of plot as I did for the first part) to get it to match up the scales?
My second question is about the coefficients. How do I grab the coefficients/read them?
And finally, it seems to fit the low energy data (i.e. the first few points pretty decently, but it comes nowhere near the second set of data (i.e. the second set of points). It actually doesn't have a smooth curve, but rather a kind of piecewise thing. I attached that. Is that a viewing issue or some other issue?
This is the absolute closest I've been, so I think we're almost there!
Thanks again for all of your help and input! Without you, I'd be much more lost! :)
also, I wrote this prior to realizing you responded before. I'm not getting email updates, so I'll go back and look at what you've suggested.
So far as the data, it's actually what I'm working on my PhD in. I'm much more versed in the physics, and learning the computing as I go (I've worked with Matlab in the past, but never have done some of the things I'm doing right now). So I can understand/explain the physical processes, but alas Matlab is giving me the problem at the moment. The data, itself, is actually from when Voyager crossed the heliopause, so the data itself is relatively new, and that's what we're working with.
also, running the above addition yields me with this error related to this piece of the code:
ydata = [...];
Error: File: TestPOLY.m Line: 37 Column: 7 The expression to the left of the equals sign is not a valid target for an assignment.
I copied the part of the code simply because I have a lot of commented out things hence why ydata = [...]; is at line 37. I don't see any reason why that shouldn't work...
I’m a bit lost. I experimented with a loglog plot, but since I don’t have a significant understanding of what you’re doing, I didn’t pursue it.
The lines:
xdata = [ ... ];
ydata = [ ... ];
are simply proxies for the latest data you posted (21 Jul 2015 at 17:14) in your reply to Walter’s comment. I saw no reason to copy them again here.
The rest of my code (in my previous comment) worked. I don’t know what could be throwing the error.
In the way of experimenting with various functions to fit, I added x- and y-offsets to the functions I used earlier. Experiment with them to see if they might help:
f = @(b,x) b(1) .* exp(-b(2).*(x + b(3))) + b(4);
f = @(b,x) exp(b(1)) .* (x + b(2)).^b(3) + b(4);
The second one actually looks promising.
I remember hearing about Voyager I crossing the heliopause. It fascinates me, but my background is in medicine and biomedical engineering (largely control and signal processing), and a few other things. I read about other disciplines in Science, but that’s about it. My first exposure to computing was an undergraduate FORTRAN course in 1968. Computers have been a part of my life in some form almost ever since.
Thanks again for the help. I believe loglog is mainly for display purposes insofar as I'm concerned. All it seems to do is scale the graph in a way that is more visually pleasing, and looks more like the graphs from Voyager, from where I actually got my data points. In order to generate that data for my uses, I had to use plot digitizer to grab them off of the graphs, since the data itself isn't readily available, but they have plots of it all. They use a logarithmic scale simply because you have a low energy regime, and a high energy regime, all of which is being plotted on the same graph. The variables, themselves, are not logarithmic: it's simply for display purposes. I don't actually need it to display the same way, but rather that is how I was testing to ensure that it looked roughly right. Hopefully that makes sense.
I see the mistake I made there, that one was entirely my fault in that I accidentally commented something out and didn't notice it since it was in an earlier line of code.
However, I now get this error: Error using nlinfit (line 205) Error evaluating model function '@(b,x)exp(b(1)).*(x+b(2)).^b(3)+b(4)'.
Error in BestFitCosmicRaySpectra (line 11) B = nlinfit(xdata, ydata, f, b0);
Caused by: Attempted to access b(3); index out of bounds because numel(b)=2.
This is the code as it sits (and I tried it both with loglog and with plot, this is it copied with plot). Also, I had the error in my original program, but because I had a number of other things, I decided to open a new .m file and try that. Same error.
clc
clear all
HeliumData_Voyager;%This simply calls the data file which we are to use. The reason for this
%is to simply save space in the event of huge data files.
f = @(b,x) exp(b(1)) .* (x + b(2)).^b(3) + b(4);
xdc = linspace(min(xdata), max(xdata), 150); % Continuous ‘xdata’
b0 = [10; 0.1];
B = nlinfit(xdata, ydata, f, b0);
figure(1)
plot(xdata, ydata, 'bp')
hold on
plot(xdc, f(B,xdc))
hold off
grid
I played with the b0 values, but it makes no change that I can see. On my end, it works fine with the f = @(b,x) b(1) .* exp(-b(2).*x); function, but not with either of the offset functions. Do you have any ideas what that error means?
Wow! Biomedical engineering is absolutely fascinating. Some of my good friends here in Florida are biomedical engineers and/or are in school for it. So I get to talk with them about what they do quite a bit actually! Before I got into astrophysics, I worked as a researcher in nanoparticles in biophysics applications, and we worked with a lot of biomedical engineers. I've dabbled a little in Fortran, being that I had a computational physics class and had done some things in magnetism before I moved into the astrophysics side of things. I can say that your experience certainly shows in computers because you've been able to help me every step of the way. I truly appreciate it.
I should have updated ‘b0’ in the code I posted. I used this vector:
b0 = [10; 0.1; 0.1; 0.1];
The number of entries in the initial estimate vector has to equal the number of parameters in the model.
I share your outlook. There are too many interesting scientific disciplines and never enough time to explore them all to the extent I would like. The articles in Science and episodes of NOVA help me significantly. The challenge of biomedical engineering from a signal processing and control perspective is that biological systems are inherently nonlinear and nonstationary. Keeps life interesting.
That did it. I'm going to play with this a bit, but I have just a few clarifying questions for you just so I can be certain I've got this before I close this question out.
Once I get a best fit, it'll be of the form b(1) .* exp(-b(2).*(x + b(3))) + b(4)
If I want to know what the coefficients are above, how can I see them?
I.e. in analytical terms, it is like saying: y = a*exp(-b*(x+c))+d from what I can tell. How do I get the coefficients a, b, c,d? My initial thought was to remove the semicolon, like I did when I was attempting to use polyfit, but that doesn't do anything.
also, I'm missing a set of parenthesis in the above equation, but that doesn't actually matter for my specific question (just pointing out the fact that I had).
The coefficients are in the ‘B’ returned by nlinfit. They would correspond in your ‘y = a*exp(-b*(x+c))+d’ expression to:
a = B(1)
b = B(2)
c = B(3)
d = B(4)
That answers it all! Thanks! You've been incredibly helpful on this matter, and have gotten me extremely far! This is exactly the help I needed!
I'm going to close this question out now, and later I may have another question regarding something else on the code (error bars and the such, and maybe other questions when I fit the other data sets), but as that is a different topic, that'll be for another day after I've played with this for a bit!
Thanks again for all of your help!
As always, my pleasure!
The fun part of MATLAB Answers are the interesting Questions and topics I encounter. I wish you luck in your research, and your discoveries about deep space cosmic rays.

Sign in to comment.

More Answers (0)

Categories

Find more on 2-D and 3-D Plots in Help Center and File Exchange

Asked:

CAM
on 18 Jul 2015

Edited:

CAM
on 25 Jul 2015

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!