
The model for discriminant analysis is:

Each class (`Y`) generates data (`X`) using a multivariate normal distribution. In other words, the model assumes `X` has a Gaussian mixture distribution (`gmdistribution`).

For linear discriminant analysis, the model has the same covariance matrix for each class; only the means vary.

For quadratic discriminant analysis, both means and covariances of each class vary.

Under this modeling assumption, `fitcdiscr` infers the mean and covariance parameters of each class.

For linear discriminant analysis, it computes the sample mean of each class. Then it computes the sample covariance by first subtracting the sample mean of each class from the observations of that class, and taking the empirical covariance matrix of the result.

For quadratic discriminant analysis, it computes the sample mean of each class. Then it computes the sample covariances by first subtracting the sample mean of each class from the observations of that class, and taking the empirical covariance matrix of each class.

The `fit` method does not use prior probabilities or costs for fitting.

`fitcdiscr` constructs weighted classifiers using the following scheme. Suppose *M* is an *N*-by-*K* class membership matrix:

*M _{nk}* = 1 if observation *n* is from class *k*, and *M _{nk}* = 0 otherwise.

The estimate of the class mean for unweighted data is

$${\widehat{\mu}}_{k}=\frac{{{\displaystyle \sum}}_{n=1}^{N}{M}_{nk}{x}_{n}}{{{\displaystyle \sum}}_{n=1}^{N}{M}_{nk}}.$$

For weighted data with positive weights *w _{n}*, the natural generalization is

$${\widehat{\mu}}_{k}=\frac{{{\displaystyle \sum}}_{n=1}^{N}{M}_{nk}{w}_{n}{x}_{n}}{{{\displaystyle \sum}}_{n=1}^{N}{M}_{nk}{w}_{n}}.$$
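The two mean formulas can be sketched in a few lines of plain Python. This is an illustration only, not the `fitcdiscr` implementation: the function name `class_means` and the toy data are assumptions made for the example. Passing no weights corresponds to the unweighted estimate, since all weights are then 1.

```python
def class_means(X, M, w=None):
    """Weighted class means from an N-by-K membership matrix M.

    X is a list of N observations (each a list of D floats), M[n][k] is 1
    if observation n belongs to class k and 0 otherwise, and w holds N
    positive weights (defaults to all ones, the unweighted case).
    Hypothetical helper for illustration; not part of any library.
    """
    N, K, D = len(X), len(M[0]), len(X[0])
    if w is None:
        w = [1.0] * N
    means = []
    for k in range(K):
        # Denominator: total weight of the observations in class k.
        total = sum(M[n][k] * w[n] for n in range(N))
        means.append([
            sum(M[n][k] * w[n] * X[n][d] for n in range(N)) / total
            for d in range(D)
        ])
    return means


# Toy data: two observations in each of two classes.
X = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0], [14.0, 2.0]]
M = [[1, 0], [1, 0], [0, 1], [0, 1]]

unweighted = class_means(X, M)
weighted = class_means(X, M, w=[1.0, 3.0, 1.0, 1.0])
```

In this toy example, tripling the weight of the second observation pulls the first class mean toward it, while the second class mean is unchanged.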

The unbiased estimate of the pooled-in covariance matrix for unweighted data is

$$\widehat{\Sigma}=\frac{{{\displaystyle \sum}}_{n=1}^{N}{{\displaystyle \sum}}_{k=1}^{K}{M}_{nk}\left({x}_{n}-{\widehat{\mu}}_{k}\right){\left({x}_{n}-{\widehat{\mu}}_{k}\right)}^{T}}{N-K}.$$

For quadratic discriminant analysis, `fitcdiscr` uses *K* = 1, applying the formula to the observations of each class separately.
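As a rough sketch of the unweighted covariance estimate, the plain-Python function below centers each class on its own mean, sums the outer products, and divides by *N* - *K*. The name `pooled_covariance`, the use of a label list in place of the membership matrix, and the toy data are assumptions made for this example; it is not the `fitcdiscr` implementation. Calling the same function on a single class gives the *K* = 1 case.

```python
def pooled_covariance(X, labels):
    """Pooled covariance: center each class on its own mean, then divide
    the summed outer products by N - K.

    Hypothetical helper for illustration; labels[n] is the class of X[n].
    """
    classes = sorted(set(labels))
    N, K, D = len(X), len(classes), len(X[0])
    scatter = [[0.0] * D for _ in range(D)]
    for c in classes:
        rows = [x for x, y in zip(X, labels) if y == c]
        mean = [sum(r[d] for r in rows) / len(rows) for d in range(D)]
        for r in rows:
            diff = [r[d] - mean[d] for d in range(D)]
            for i in range(D):
                for j in range(D):
                    scatter[i][j] += diff[i] * diff[j]
    return [[s / (N - K) for s in row] for row in scatter]


X = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0], [14.0, 2.0]]
labels = [0, 0, 1, 1]

# Linear case: one covariance matrix shared by all classes.
pooled = pooled_covariance(X, labels)

# Quadratic case: apply the same formula to each class on its own (K = 1).
per_class = {
    c: pooled_covariance([x for x, y in zip(X, labels) if y == c], [c] * labels.count(c))
    for c in sorted(set(labels))
}
```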

For weighted data, assuming the weights sum to 1, the unbiased estimate of the pooled-in covariance matrix is

$$\widehat{\Sigma}=\frac{{{\displaystyle \sum}}_{n=1}^{N}{{\displaystyle \sum}}_{k=1}^{K}{M}_{nk}{w}_{n}\left({x}_{n}-{\widehat{\mu}}_{k}\right){\left({x}_{n}-{\widehat{\mu}}_{k}\right)}^{T}}{1-{{\displaystyle \sum}}_{k=1}^{K}\frac{{W}_{k}^{\left(2\right)}}{{W}_{k}}},$$

where ${W}_{k}={\displaystyle {\sum}_{n=1}^{N}{M}_{nk}{w}_{n}}$ is the sum of the weights for class *k*, and ${W}_{k}^{\left(2\right)}={\displaystyle {\sum}_{n=1}^{N}{M}_{nk}{w}_{n}^{2}}$ is the sum of squared weights for class *k*.
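The weighted estimate can be sketched the same way. The plain-Python function below is an illustration under stated assumptions, not the `fitcdiscr` implementation: the name `weighted_pooled_covariance`, the label list standing in for the membership matrix, and the toy data are all hypothetical. It normalizes the weights to sum to 1 before applying the formula. With equal weights $w_n = 1/N$, each ratio $W_k^{(2)}/W_k$ equals $1/N$, so the denominator becomes $1 - K/N$ and the result matches the unweighted estimate with divisor *N* - *K*.

```python
def weighted_pooled_covariance(X, labels, w):
    """Weighted pooled covariance; weights are normalized to sum to 1.

    Hypothetical helper for illustration; labels[n] is the class of X[n]
    and w[n] is its positive weight.
    """
    total = sum(w)
    w = [wn / total for wn in w]
    classes = sorted(set(labels))
    D = len(X[0])
    scatter = [[0.0] * D for _ in range(D)]
    correction = 0.0  # sum over classes of W_k^(2) / W_k
    for c in classes:
        idx = [n for n, y in enumerate(labels) if y == c]
        W_k = sum(w[n] for n in idx)        # sum of weights in class c
        W_k2 = sum(w[n] ** 2 for n in idx)  # sum of squared weights in class c
        correction += W_k2 / W_k
        # Weighted class mean, as in the weighted mean formula.
        mean = [sum(w[n] * X[n][d] for n in idx) / W_k for d in range(D)]
        for n in idx:
            diff = [X[n][d] - mean[d] for d in range(D)]
            for i in range(D):
                for j in range(D):
                    scatter[i][j] += w[n] * diff[i] * diff[j]
    return [[s / (1.0 - correction) for s in row] for row in scatter]


X = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0], [14.0, 2.0]]
labels = [0, 0, 1, 1]

# Equal weights: expected to agree with the unweighted N - K estimate.
equal = weighted_pooled_covariance(X, labels, [1.0, 1.0, 1.0, 1.0])

# Unequal weights: the first observation counts three times as much.
unequal = weighted_pooled_covariance(X, labels, [3.0, 1.0, 1.0, 1.0])
```

Because the sketch normalizes the weights internally, rescaling all weights by a common factor leaves the estimate unchanged.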