## Multinomial Models for Ordinal Responses

The outcome of a response variable might be one of a restricted
set of possible values. If there are only two possible outcomes, such
as male and female for gender, these responses are called binary responses.
If there are multiple outcomes, then they are called polytomous responses.
Some examples of polytomous responses include levels of a disease
(mild, medium, severe), preferred districts to live in a city, the
species for a certain flower type, and so on. Sometimes there might
be a natural order among the response categories. These responses
are called *ordinal responses*.

The ordering might be inherent in the category choices, such as an individual being not satisfied, satisfied, or very satisfied with an online customer service. The ordering might also be introduced by categorization of a latent (continuous) variable, such as in the case of an individual being in the low risk, medium risk, or high risk group for developing a certain disease, based on a quantitative medical measure such as blood pressure.

You can specify a multinomial regression model that uses the natural ordering among the response categories. This ordinal model describes the relationship between the cumulative probabilities of the categories and predictor variables.

Different link functions can describe this relationship with logit and probit being the most used.

**Logit:**By default, the`fitmnr`

function uses the`logit`

link function to create a`MultinomialRegression`

model object with ordinal categories. (You can specify a different link function using the`Link`

name-value argument in`fitmnr`

.) The resulting`MultinomialRegression`

model object models the*log cumulative odds*—the logarithm of the ratio of the probability that a response belongs to a category with a value less than or equal to category*j*, P(*y*≤*c*_{j}), and the probability that a response belongs to a category with a value greater than category*j*, P(*y*>*c*_{j}).Ordinal models are usually based on the assumption that the effects of predictor variables are the same for all categories on the logarithmic scale. That is, the model has different intercepts but common slopes (coefficients) among categories. This model is called a

*parallel regression*or*proportional odds*model, and is the default for ordinal responses.The proportional odds model is

$$\begin{array}{l}\mathrm{ln}\left(\frac{P(y\le {c}_{1})}{P(y>{c}_{1})}\right)=\mathrm{ln}\left(\frac{{\pi}_{1}}{{\pi}_{2}+\cdots +{\pi}_{k}}\right)={\alpha}_{1}+{\beta}_{1}{X}_{1}+{\beta}_{2}{X}_{2}+\cdots +{\beta}_{p}{X}_{p},\\ \mathrm{ln}\left(\frac{P(y\le {c}_{2})}{P(y>{c}_{2})}\right)=\mathrm{ln}\left(\frac{{\pi}_{1}+{\pi}_{2}}{{\pi}_{3}+\cdots +{\pi}_{k}}\right)={\alpha}_{2}+{\beta}_{1}{X}_{1}+{\beta}_{2}{X}_{2}+\cdots +{\beta}_{p}{X}_{p},\\ \text{\hspace{1em}}\text{\hspace{1em}}\vdots \\ \mathrm{ln}\left(\frac{P(y\le {c}_{k-1})}{P(y>{c}_{k-1})}\right)=\mathrm{ln}\left(\frac{{\pi}_{1}+{\pi}_{2}+\cdots +{\pi}_{k-1}}{{\pi}_{k}}\right)={\alpha}_{k-1}+{\beta}_{1}{X}_{1}+{\beta}_{2}{X}_{2}+\cdots +{\beta}_{p}{X}_{p},\end{array}$$

where

*π*,_{j}*j*= 1, 2, ...,*k*, are the category probabilities.For example, for a response variable with three categories, there are 3 – 1 = 2 equations as follows:

$$\begin{array}{l}\mathrm{ln}\left(\frac{\pi {}_{1}}{\pi {}_{2}+\pi {}_{3}}\right)={\alpha}_{1}+{\beta}_{1}{X}_{1}+{\beta}_{2}{X}_{2}+\cdots +{\beta}_{p}{X}_{p},\\ \mathrm{ln}\left(\frac{\pi {}_{1}+\pi {}_{2}}{\pi {}_{3}}\right)={\alpha}_{2}+{\beta}_{1}{X}_{1}+{\beta}_{2}{X}_{2}+\cdots +{\beta}_{p}{X}_{p}.\end{array}$$

Under the proportional odds assumption, the partial effect of a predictor variable

*X*is invariant to the choice of the response variable category,*j*. For example, if there are three categories, then the coefficients express the impact of a predictor variable on the relative risk or log odds of the response value being in category 1 versus categories 2 or 3, or in category 1 or 2 versus category 3.Thus, a unit change in variable

*X*_{2}would mean a change in the cumulative odds of the response value being in category 1 versus categories 2 or 3, or category 1 or 2 versus category 3 by a factor of exp(*β*_{2}), given all else equal.You can alternatively fit a model with different intercept and slopes among the categories by using the

`'interactions','on'`

name-value pair argument. However, using this option for ordinal models when the equal slopes model is true causes a loss of efficiency (you lose the advantage of estimating fewer parameters).**Probit:**The`'link','probit'`

name-value pair argument uses the*probit*link function which is based on a normally distributed latent variable assumption. For ordinal response variables this is also called an*ordered probit*model. Consider the regression model that describes the relationship of a latent variable*y** of an ordinal process and a vector of predictor variables,*X*,$${y}^{*}=\beta X+\epsilon ,$$

where the error term

*ε*has a standard normal distribution. Suppose there is the following relationship between the latent variable*y** and the observed variable*y*:$$\begin{array}{l}y={c}_{1}\text{\hspace{1em}}if\text{\hspace{1em}}{\alpha}_{0}<{y}^{*}\le {\alpha}_{1},\\ y={c}_{2}\text{\hspace{1em}}if\text{\hspace{1em}}{\alpha}_{1}<{y}^{*}\le {\alpha}_{2},\\ \text{\hspace{1em}}\vdots \text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\vdots \\ y={c}_{k}\text{\hspace{1em}}if\text{\hspace{1em}}{\alpha}_{k-1}<{y}^{*}\le {\alpha}_{k},\end{array}$$

where

*α*_{0}= – ∞ and*α*_{k}= ∞. Then, the cumulative probability of*y*being in category*j*or one of earlier categories, P(*y*≤*c*_{j}), is equal to$$P\left(y\le {c}_{j}\right)=P\left({y}^{*}<{\alpha}_{j}\right)=P\left(\beta X+\epsilon <{\alpha}_{j}\right)=P\left(\epsilon <{\alpha}_{j}-\beta X\right)=\Phi \left({\alpha}_{j}-\beta X\right),$$

where Φ is standard normal cumulative distribution function. Thus,

$${\Phi}^{-1}\left(P\left(y\le {c}_{j}\right)\right)={\alpha}_{j}-\beta X,$$

where

*α*corresponds to the cut points of the latent variable and the intercept in the regression model. This only holds under the assumptions of a normal latent variable and parallel regression. More generally, for a response variable with_{j}*k*categories and multiple predictors, the ordered probit model is$$\begin{array}{l}{\Phi}^{-1}\left(P\left(y\le {c}_{1}\right)\right)={\alpha}_{1}+{\beta}_{1}{X}_{1}+\cdots +{\beta}_{p}{X}_{p},\\ {\Phi}^{-1}\left(P\left(y\le {c}_{2}\right)\right)={\alpha}_{2}+{\beta}_{1}{X}_{1}+\cdots +{\beta}_{p}{X}_{p},\\ \text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\vdots \text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\vdots \\ {\Phi}^{-1}\left(P\left(y\le {c}_{k-1}\right)\right)={\alpha}_{k-1}+{\beta}_{1}{X}_{1}+\cdots +{\beta}_{p}{X}_{p},\end{array}$$

where P(

*y*≤*c*_{j}) =*π*_{1 }+*π*_{2}+ ... +*π*_{j}.The coefficients indicate the impact of a unit change in the predictor variable on the likelihood of a state. A positive coefficient,

*β*_{1}, for example, indicates an increase in the underlying latent variable with an increase in the corresponding predictor variable,*X*_{1}. Hence, it causes a decrease in P(*y*≤*c*_{1}) and an increase in P(*y*≤*c*_{k}).

After estimating the model coefficients by using `fitmnr`

to create a
`MultinomialRegression`

model object, you can estimate the cumulative
probabilities in each category by using `predict`

with the name-value argument `ProbabilityType="cumulative"`

.
`predict`

accepts the `MultinomialRegression`

model object returned by `fitmnr`

, and estimates the category labels,
categorical probabilities, and confidence bounds for each categorical probability. You
can specify whether `predict`

returns category, cumulative, or
conditional probabilities using the `ProbabilityType`

name-value
argument.

## References

[1] McCullagh, P., and J. A. Nelder. *Generalized
Linear Models*. New York: Chapman & Hall, 1990.

[2] Long, J. S. *Regression Models for Categorical
and Limited Dependent Variables*. Sage Publications, 1997.

[3] Dobson, A. J., and A. G. Barnett. *An Introduction
to Generalized Linear Models*. Chapman and Hall/CRC. Taylor
& Francis Group, 2008.

## See Also

`fitglm`

| `fitmnr`

| `predict`

| `glmfit`

| `glmval`