Multiple Linear Regression Trouble

Question

0 votes

IgorPlot.PNG

I'm using regression techniques to attempt spectra unfolding but the result matlab is giving is wrong. I start with several response functions that have been modeled for mono-energetic values like in IgorPlot.png. Each of these responses is then multiplied by a unique coefficient "a(n)" and the summation is taken. Using regression techniques, the summation is then fitted to an example function like in FitTest.png to produce the coefficient matrix "a" which should indicate the which modeled response functions are most prevalent in the example. This technique has been proven to work in similar applications, but I'm running into a problem where, as can be seen in FitTest, matlab favors the last response function above every other function used.

I've tried rewriting the code to include both fewer and more responses and in both cases, regardless of the number of responses used, Matlab will consistently favor there last response listed in the summation equation. Am I going about this completely wrong, or is there something I'm missing here?

6 Comments
Show 4 older comments Hide 4 older comments

Jess on 7 Jul 2014

Edited: Star Strider on 7 Jul 2014

Open in MATLAB Online

I've modified a basic example from one of my textbooks to handle y as a function of 13 "x" variables. The matrix and equation setup I used are shown below, unfortunately the data I'm using to do this cannot be posted here. All I want to know is why matlab insists on telling me that the last value it finds in Coeff is always the largest, when I know for a fact it shouldn't be.

X_matrix = [n,sum(A0_5),sum(A1_0),sum(A1_5),sum(A2_0),sum(A2_5),sum(A3_0),sum(A3_5),sum(A4_0),sum(A4_5),sum(A5_0),sum(A5_5),sum(A6_0);
     sum(A0_5),sum(A05A05),sum(A05A10),sum(A05A15),sum(A05A20),sum(A05A25),sum(A05A30),sum(A05A35),sum(A05A40),sum(A05A45),sum(A05A50),sum(A05A55),sum(A05A60);
     sum(A1_0),sum(A05A10),sum(A10A10),sum(A10A15),sum(A10A20),sum(A10A25),sum(A10A30),sum(A10A35),sum(A10A40),sum(A10A45),sum(A10A50),sum(A10A55),sum(A10A60);
     sum(A1_5),sum(A05A15),sum(A10A15),sum(A15A15),sum(A15A20),sum(A15A25),sum(A15A30),sum(A15A35),sum(A15A40),sum(A15A45),sum(A15A50),sum(A15A55),sum(A15A60);
     sum(A2_0),sum(A05A20),sum(A10A20),sum(A15A20),sum(A20A20),sum(A20A25),sum(A20A30),sum(A20A35),sum(A20A40),sum(A20A45),sum(A20A50),sum(A20A55),sum(A20A60);
     sum(A2_5),sum(A05A25),sum(A10A25),sum(A15A25),sum(A20A25),sum(A25A25),sum(A25A30),sum(A25A35),sum(A25A40),sum(A25A45),sum(A25A50),sum(A25A55),sum(A25A60);
     sum(A3_0),sum(A05A30),sum(A10A30),sum(A15A30),sum(A20A30),sum(A25A30),sum(A30A30),sum(A30A35),sum(A30A40),sum(A30A45),sum(A30A50),sum(A30A55),sum(A30A60);
     sum(A3_5),sum(A05A35),sum(A10A35),sum(A15A35),sum(A20A35),sum(A25A35),sum(A30A35),sum(A35A35),sum(A35A40),sum(A35A45),sum(A35A50),sum(A35A55),sum(A35A60);
     sum(A4_0),sum(A05A40),sum(A10A40),sum(A15A40),sum(A20A40),sum(A25A40),sum(A30A40),sum(A35A40),sum(A40A40),sum(A40A45),sum(A40A50),sum(A40A55),sum(A40A60);
     sum(A4_5),sum(A05A45),sum(A10A45),sum(A15A45),sum(A20A45),sum(A25A45),sum(A30A45),sum(A35A45),sum(A40A45),sum(A45A45),sum(A45A50),sum(A45A55),sum(A45A60);
     sum(A5_0),sum(A05A50),sum(A10A50),sum(A15A50),sum(A20A50),sum(A25A50),sum(A30A50),sum(A35A50),sum(A40A50),sum(A45A50),sum(A50A50),sum(A50A55),sum(A50A60);
     sum(A5_5),sum(A05A55),sum(A10A55),sum(A15A55),sum(A20A55),sum(A25A55),sum(A30A55),sum(A35A55),sum(A40A55),sum(A45A55),sum(A50A55),sum(A55A55),sum(A55A60);
     sum(A6_0),sum(A60A60),sum(A10A60),sum(A15A60),sum(A20A60),sum(A25A60),sum(A30A60),sum(A35A60),sum(A40A60),sum(A45A60),sum(A50A60),sum(A55A60),sum(A60A60)];
Y1_matrix = [sum(y1);sum(A05y1);sum(A10y1);sum(A15y1);sum(A20y1);sum(A25y1);sum(A30y1);sum(A35y1);sum(A40y1);sum(A45y1);sum(A50y1);sum(A55y1);sum(A60y1)];
Coeff = X_matrix\Y1_matrix;

dpb on 7 Jul 2014

...Image that didn't attach.

How in the world did you generate the wiggles in the "LSR fitted response"??? That would take a heckuva' lot more than 13 terms it would seem...

Jess on 8 Jul 2014

The "wiggles" as you call them are actually a result of something called the Bragg peak from nuclear physics and are present in each of the 13 base responses used. Although the angle may not be ideal in the IgorPlot file, you can still see these peaks to an extent.

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

dpb on 8 Jul 2014

Edited: dpb on 8 Jul 2014

Open in MATLAB Online

1 vote

I think on reflection the comment I made earlier about Matlab building the X.'*X matrix actually is your problem. I just happened to have the data still in memory from the current other thread where a poor misguided soul is trying to fit a nonlinear model of incredible complexity to a set of data that are an essentially perfect quadratic. Anyway, having solved for that quadratic with polyfit, I demonstrate the same computation with mldivide --

 >> x=[1:length(yh)].';  % I'd lost his x; just used 1:N instead for demo
 >> b=polyfit(x,yh,2)    % the polyfit results
b =
     -4.1158e-05    0.0600  -22.9005

Now build the design matrix X for a quadratic...

>> X=[x.*x x ones(size(x))];  % the design matrix
>> best=X\y                   % solve the least squares problem
>> best
best =
 -0.0000
  0.0600
-22.9005
>> b.'-best
ans =
 1.0e-14 *
 -0.0000
  0.0028
 -0.7105
>>

Note the results aren't identical but are within E-14 or so which is down in the precision noise level.

Hence, NB: Matlab formed the X'*X matrix internally in the process of solving via \, you do not need to create it and it'll be the wrong answer if you do. I'm guessing there's your source of confusion and difficulty in understanding your results.

4 Comments
Show 2 older comments Hide 2 older comments

Jess on 8 Jul 2014

Thank you for clarifying that. I've done a shortened run to test it out, and allowing matlab to generate the x-matrix appears to have solved part of the problem with the coefficients. I'll try running one of the larger codes and let you know how it goes.

dpb on 8 Jul 2014

Open in MATLAB Online

For a linear model in N variables, the design matrix would be X=[x1 x2 x3 ... xN];...

NB: this is a zero-intercept model; your X_matrix including the term n indicates you probably have an intercept. The design matrix in that case includes a column of ones...

X=[ones(size(x1) x1 x2 x3 ... xN];

for the model

y=a+b*x1+c*x2+...

Sign in to comment.

Multiple Linear Regression Trouble

6 Comments
Show 4 older comments Hide 4 older comments

Answers (1)

4 Comments
Show 2 older comments Hide 2 older comments

Categories

Products

Tags

Community Treasure Hunt

Multiple Linear Regression Trouble

6 Comments Show 4 older comments Hide 4 older comments

Answers (1)

4 Comments Show 2 older comments Hide 2 older comments

Categories

Products

Tags

See Also

Community Treasure Hunt

6 Comments
Show 4 older comments Hide 4 older comments

4 Comments
Show 2 older comments Hide 2 older comments