Binomial Data with 0's in glmfit
I am using glmfit to perform logistic regression on a set of data. If I run the code
b = glmfit(X,[y n],'binomial','link','logit')
where y contains a few 0 values, how does Matlab handle the 0's? Are they set to very small values? Is it necessary or beneficial to use the empirical log-odds
log((y+0.5)/(n-y+0.5))
to determine y_new and n_new (where y_new does not contain 0's)?
Thank you.
Answers (1)
the cyclist
on 15 Sep 2015
0 votes
In a logistic regression, the response variable (y) is typically a binary variable (and can be represented as 0's and 1's).
Why do you think 0's would be a problem?
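For the grouped [y n] form used in the question, it may help to write out what the fit maximizes. This is the standard binomial log-likelihood, not anything specific to glmfit, and the symbols p_i, x_i and beta are my own notation:

```latex
\ell(\beta) = \sum_i \left[ \log\binom{n_i}{y_i} + y_i \log p_i + (n_i - y_i)\log(1 - p_i) \right],
\qquad p_i = \frac{1}{1 + e^{-x_i^{\top}\beta}}

% For a group with y_i = 0 the binomial coefficient is 1 and the term reduces to
\ell_i(\beta) = n_i \log(1 - p_i)
```

That term is finite for any finite beta, so a zero count does not need to be replaced by a small value. It is the raw empirical log-odds log(y/(n-y)) that is undefined at y = 0, which is what the 0.5 correction is for, but a maximum-likelihood fit never has to evaluate that quantity.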
2 Comments
Anna
on 15 Sep 2015
the cyclist
on 15 Sep 2015
In the logistic model, the fitted probability only approaches 1 as X approaches infinity (assuming a positive coefficient); it never equals 1 at a finite X.
It's true that if for some particular value of X, you happen to see all "successes" (say, 15 out of 15 successes when X = 300), then the code is going to make a starting estimate of the probability (at that value of X) to be just a bit smaller than 1, while it tries to find the best fit across all values of X.
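As a rough illustration of that starting estimate: my recollection is that glmfit initializes the binomial mean with something like (y + 0.5)/(n + 1), but treat that formula as an assumption about the internals rather than documented behavior. The data below are made up.

```matlab
% Made-up counts: 0 of 15 at the lowest X, 15 of 15 at the highest X.
n = [15; 15; 15];
y = [0; 7; 15];

% Assumed form of the starting value (not taken from the documentation):
% pull each observed proportion slightly away from 0 and 1.
mu0 = (y + 0.5) ./ (n + 1);

disp(mu0)   % 0.03125, 0.46875, 0.96875
```

Under that assumption, the 15-out-of-15 group starts at 0.96875 and the 0-out-of-15 group at 0.03125, and the iterations move on from there.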
I think the only time you will have a problem fitting is if you see only successes at all values of X. But, then you don't really need a model, do you? :-)
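If you want to check this on concrete numbers, here is a minimal sketch. The data are invented for illustration, and it assumes the Statistics and Machine Learning Toolbox is available. It fits the raw counts (including a 0 and a 15 out of 15) with glmfit, then fits the 0.5-corrected empirical log-odds from the question by ordinary least squares for comparison.

```matlab
% Invented data: five X levels, 15 trials each.
X = [100; 150; 200; 250; 300];
n = [15; 15; 15; 15; 15];
y = [0; 2; 7; 12; 15];          % includes a 0 and an all-success group

% Maximum-likelihood logistic fit on the raw counts, zeros left untouched.
b = glmfit(X, [y n], 'binomial', 'link', 'logit');

% Fitted probabilities at the observed X values.
pHat = glmval(b, X, 'logit');

% Comparison: empirical log-odds with the 0.5 correction,
% regressed on X by ordinary least squares.
z = log((y + 0.5) ./ (n - y + 0.5));
bEmp = [ones(size(X)) X] \ z;

disp([b bEmp])                  % intercept and slope from each approach
disp(pHat)
```

The two sets of coefficients should be in the same neighborhood but not identical, since the least-squares version fits a transformed response and weights every group equally, while glmfit maximizes the binomial likelihood directly.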