Multivariate GLMFIT and GLMVAL
5 views (last 30 days)
Show older comments
I have a problem with plotting the results of a glm model with several predictors.
I first fit the model with GLMFIT as:
[b,dev,stats]= glmfit(x, y, distr);
where 'x' is a matrix of N observations by M predictors, and 'y' a vector of observed responses.
Then I produce predictions as:
xpred= zeros(100,size(x,2));
for k= 1:size(x,2)
xpred(:,k)= linspace(min(x(:,k)),max(x(:,k)),100);
end
ypred= glmval(b,xpred,link,stats);
Then I want to plot the effect of each individual predictor on the response variable as:
for j= 1:size(x,2)
figure(j);
plot(x(:,j),y,'k.',xpred(:,j),ypred,'b-','LineWidth',1)
end
But the line (predicted values) for most of the predictors goes totally out of the cloud of observed values. Am I doing something totally wrong or is just that the fit is poor? Any help would be much appreciated. Thanks
0 Comments
Accepted Answer
Tom Lane
on 24 Jan 2012
Consider the code below. It fits a multi-predictor model, but plots the fit as a function of one predictor at a time, with the others held fixed at their mean values. In contrast, your code has each predictor increase along with the others.
This has its own issues. For one thing, the superimposed points represent data, and may or may not be comparable to the line that constrains predictors to their means. But it is one of the things you can do to visualize a multi-predictor fit as a function of one variable at a time.
x = mvnrnd([0 0],[1 .9;.9 1],200);
y = 5*x(:,1) + 5*x(:,2) + randn(200,1);
[b,dev,stats] = glmfit(x,y,'normal');
% Create an array with all predictors at their mean
meanx = mean(x);
xpred = repmat(meanx,100,1);
xplot = xpred;
ypred = zeros(100,size(x,2));
for k= 1:size(x,2)
% Allow the kth predictor to vary, others stay fixed
xplot(:,k)= linspace(min(x(:,k)),max(x(:,k)),100);
xpred(:,k) = xplot(:,k);
ypred(:,k) = glmval(b,xpred,'identity',stats);
xpred(:,k) = meanx(k); % put the mean back
end
for j= 1:size(x,2)
figure(j);
plot(x(:,j),y,'k.',xplot(:,j),ypred(:,k),'b-','LineWidth',1)
end
If you run this you'll see that the lines don't match the points very well. That's because I generated the x values with correlation 0.9, meaning that it's unlikely that one predictor will stay fixed while the other increases. If you adjust 0.9 down toward 0.0, you should see better agreement between the fit and the data points.
0 Comments
More Answers (2)
Peter Perkins
on 19 Jan 2012
Francisco, it's pretty hard to say what's happening without knowing anything about the data or the model, but this may be an artifact of the way you've created xpred. Notice that the first row of xpred has the min values for all of the predictors, and the last row has the max values. You've created what is essentially a straight line through the design variable space that may or may not have anything to do with the actual cloud of points in your data. That could explain why ypred doesn't look much like y.
Hope this helps.
0 Comments
See Also
Categories
Find more on Analysis of Variance and Covariance in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!