Predict labels using discriminant analysis classification model

In addition to the predicted class labels (`label`), `[label,score,cost] = predict(Mdl,X)` returns:

- A matrix of classification scores (`score`) indicating the likelihood that a label comes from a particular class. For discriminant analysis, scores are posterior probabilities.
- A matrix of expected classification costs (`cost`). For each observation in `X`, the predicted class label corresponds to the minimum expected classification cost among all classes.

`Mdl` — Discriminant analysis classification model

`ClassificationDiscriminant` model object | `CompactClassificationDiscriminant` model object

Discriminant analysis classification model, specified as a `ClassificationDiscriminant` or `CompactClassificationDiscriminant` model object returned by `fitcdiscr`.

`X` — Predictor data to be classified

numeric matrix | table

Predictor data to be classified, specified as a numeric matrix or table.

Each row of `X` corresponds to one observation, and each column corresponds to one variable. All predictor variables in `X` must be numeric vectors.

- For a numeric matrix, the variables that compose the columns of `X` must have the same order as the predictor variables that trained `Mdl`.
- For a table:
  - `predict` does not support multi-column variables or cell arrays other than cell arrays of character vectors.
  - If you trained `Mdl` using a table (for example, `Tbl`), then all predictor variables in `X` must have the same variable names and data types as those that trained `Mdl` (stored in `Mdl.PredictorNames`). However, the column order of `X` does not need to correspond to the column order of `Tbl`. `Tbl` and `X` can contain additional variables (response variables, observation weights, etc.), but `predict` ignores them.
  - If you trained `Mdl` using a numeric matrix, then the predictor names in `Mdl.PredictorNames` and the corresponding predictor variable names in `X` must be the same. To specify predictor names during training, see the `PredictorNames` name-value pair argument of `fitcdiscr`. `X` can contain additional variables (response variables, observation weights, etc.), but `predict` ignores them.

**Data Types:** `table` | `double` | `single`
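The table rules above can be checked with a small sketch. It assumes Fisher's iris data and the made-up variable names `SL`, `SW`, `PL`, `PW`, and `Weight`, none of which come from this page. The model is trained on a table, then asked to predict from a table whose predictor columns are reordered and which carries an extra variable.

```matlab
load fisheriris                                   % meas: 150-by-4, species: 150-by-1 cell array
% Hypothetical predictor names chosen for this sketch.
Tbl = array2table(meas,'VariableNames',{'SL','SW','PL','PW'});
Tbl.Species = species;                            % response variable stored in the table
Mdl = fitcdiscr(Tbl,'Species');

% Same predictors in a different column order, plus an extra variable.
Xnew = Tbl(1:10,{'PW','PL','SW','SL'});
Xnew.Weight = ones(10,1);                         % not a predictor of Mdl

labelReordered = predict(Mdl,Xnew);
labelOriginal = predict(Mdl,Tbl(1:10,:));         % Tbl still contains the response variable
sameLabels = isequal(labelReordered,labelOriginal);
```

If the variables are matched by name as described, `sameLabels` should be true.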

`label` — Predicted class labels

categorical array | character array | logical vector | vector of numeric values | cell array of character vectors

Predicted class labels, returned as a categorical or character array, logical or numeric vector, or cell array of character vectors.


`score` — Predicted class posterior probabilities

numeric matrix

Predicted class posterior probabilities, returned as a numeric matrix of size `N`-by-`K`. `N` is the number of observations (rows) in `X`, and `K` is the number of classes (in `Mdl.ClassNames`). `score(i,j)` is the posterior probability that observation `i` in `X` is of class `j` in `Mdl.ClassNames`.

`cost` — Expected classification costs

numeric matrix

Expected classification costs, returned as a matrix of size `N`-by-`K`. `N` is the number of observations (rows) in `X`, and `K` is the number of classes (in `Mdl.ClassNames`). `cost(i,j)` is the cost of classifying row `i` of `X` as class `j` in `Mdl.ClassNames`.
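As a quick check of the output shapes described above, the following sketch requests all three outputs for three observations. The iris data and the variable names are illustrative choices rather than part of the function's definition.

```matlab
load fisheriris                        % meas: 150-by-4 predictors, species: 150-by-1 labels
Mdl = fitcdiscr(meas,species);         % default (linear) discriminant
Xnew = meas([1 51 101],:);             % N = 3 observations
[label,score,cost] = predict(Mdl,Xnew);

K = numel(Mdl.ClassNames);             % number of classes
sizeScore = size(score);               % N-by-K
sizeCost = size(cost);                 % N-by-K
rowSums = sum(score,2);                % posterior probabilities in each row sum to 1
```

The columns of `score` and `cost` follow the class order in `Mdl.ClassNames`.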

Load Fisher's iris data set. Determine the sample size.

```
load fisheriris
N = size(meas,1);
```

Partition the data into training and test sets. Hold out 10% of the data for testing.

```
rng(1); % For reproducibility
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp); % Training set indices
idxTest = test(cvp); % Test set indices
```

Store the training data in a table.

```
tblTrn = array2table(meas(idxTrn,:));
tblTrn.Y = species(idxTrn);
```

Train a discriminant analysis model using the training set and default options.

```
Mdl = fitcdiscr(tblTrn,'Y');
```

Predict labels for the test set. You trained `Mdl` using a table of data, but you can predict labels using a matrix.

```
labels = predict(Mdl,meas(idxTest,:));
```

Construct a confusion matrix for the test set.

```
confusionchart(species(idxTest),labels);
```

`Mdl` misclassifies one versicolor iris as virginica in the test set.

Load Fisher's iris data set. Consider training using the petal lengths and widths only.

```
load fisheriris
X = meas(:,3:4);
```

Train a quadratic discriminant analysis model using the entire data set.

```
Mdl = fitcdiscr(X,species,'DiscrimType','quadratic');
```

Define a grid of values in the observed predictor space. Predict the posterior probabilities for each instance in the grid.

```
xMax = max(X);
xMin = min(X);
d = 0.01;
[x1Grid,x2Grid] = meshgrid(xMin(1):d:xMax(1),xMin(2):d:xMax(2));
[~,score] = predict(Mdl,[x1Grid(:),x2Grid(:)]);
Mdl.ClassNames
```

```
ans = 3x1 cell array
    {'setosa'    }
    {'versicolor'}
    {'virginica' }
```

`score` is a matrix of class posterior probabilities. The columns correspond to the classes in `Mdl.ClassNames`. For example, `score(j,1)` is the posterior probability that observation `j` is a setosa iris.

Plot the posterior probability of versicolor classification for each observation in the grid and plot the training data.

```
figure;
contourf(x1Grid,x2Grid,reshape(score(:,2),size(x1Grid,1),size(x1Grid,2)));
h = colorbar;
caxis([0 1]);
colormap jet;
hold on
gscatter(X(:,1),X(:,2),species,'mcy','.x+');
axis tight
title('Posterior Probability of versicolor');
hold off
```

The posterior probability region exposes a portion of the decision boundary.

The posterior probability that a point *x* belongs
to class *k* is proportional to the product of the prior probability
and the multivariate normal density. The density function of the multivariate
normal with mean $${\mu}_{k}$$ and
covariance $${\Sigma}_{k}$$ at a point *x* with *d* predictors is

$$P\left(x|k\right)=\frac{1}{{\left({\left(2\pi \right)}^{d}\left|{\Sigma}_{k}\right|\right)}^{1/2}}\mathrm{exp}\left(-\frac{1}{2}{\left(x-{\mu}_{k}\right)}^{T}{\Sigma}_{k}^{-1}\left(x-{\mu}_{k}\right)\right),$$

where $$\left|{\Sigma}_{k}\right|$$ is the determinant of $${\Sigma}_{k}$$,
and $${\Sigma}_{k}^{-1}$$ is the inverse matrix.

Let *P*(*k*) represent the
prior probability of class *k*. Then the posterior
probability that an observation *x* is of class *k* is

$$\widehat{P}\left(k|x\right)=\frac{P\left(x|k\right)P\left(k\right)}{P\left(x\right)},$$

where *P*(*x*) is a normalization
constant, the sum over *k* of *P*(*x*|*k*)*P*(*k*).
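To make the formula concrete, the sketch below recomputes the posterior for one observation by hand and compares it with the `score` output. It assumes a linear discriminant trained with default options, so a single pooled covariance matrix in `Mdl.Sigma` is shared by all classes, and it uses the standard multivariate normal constant with *d* equal to the number of predictors. The iris data and the choice of observation 60 are arbitrary.

```matlab
load fisheriris
Mdl = fitcdiscr(meas,species);          % linear: Mdl.Sigma is one d-by-d pooled covariance
x = meas(60,:);                         % one observation, stored as a row vector
K = numel(Mdl.ClassNames);
d = size(meas,2);                       % number of predictors

lik = zeros(1,K);                       % P(x|k) for each class
for k = 1:K
    r = x - Mdl.Mu(k,:);                % deviation from the class mean (row vector)
    % r/Mdl.Sigma equals r*inv(Mdl.Sigma), so this is the quadratic form in the density
    lik(k) = exp(-0.5*(r/Mdl.Sigma)*r') / sqrt((2*pi)^d*det(Mdl.Sigma));
end

post = lik.*Mdl.Prior;                  % P(x|k)P(k)
post = post/sum(post);                  % divide by the normalization constant P(x)

[~,score] = predict(Mdl,x);
maxDiff = max(abs(post - score));       % difference between manual and returned posteriors
```

Under these assumptions `maxDiff` should be at round-off level. For a quadratic discriminant, `Mdl.Sigma` holds one covariance matrix per class, so the loop would need to index it by class.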

The prior probability is one of three choices:

- `'uniform'` — The prior probability of class `k` is one over the total number of classes.
- `'empirical'` — The prior probability of class `k` is the number of training samples of class `k` divided by the total number of training samples.
- Custom — The prior probability of class `k` is the `k`th element of the `prior` vector. See `fitcdiscr`.

After creating a classification model (`Mdl`),
you can set the prior using dot notation:

```
Mdl.Prior = v;
```

where `v` is a vector of positive elements
representing the frequency with which each class occurs. You do
not need to retrain the classifier when you set a new prior.
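The following sketch illustrates why no retraining is needed: changing the prior only re-weights the class densities, so the new posterior can be obtained from the old one by Bayes' rule. The prior values `[0.1 0.1 0.8]`, the iris data, and observation 71 are arbitrary choices for illustration.

```matlab
load fisheriris
Mdl = fitcdiscr(meas,species);
x = meas(71,:);                           % a versicolor iris near the class boundary

priorOld = Mdl.Prior;                     % empirical prior from training
[~,scoreOld] = predict(Mdl,x);

priorNew = [0.1 0.1 0.8];                 % assumed prior favoring the third class
Mdl.Prior = priorNew;                     % no call to fitcdiscr
[~,scoreNew] = predict(Mdl,x);

% Re-weight the old posterior by the ratio of priors and renormalize.
expected = scoreOld.*(priorNew./priorOld);
expected = expected/sum(expected);
maxDiff = max(abs(expected - scoreNew));
```

The posterior of the third class increases under the new prior, and `maxDiff` should be at round-off level.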

The matrix of expected costs per observation is defined in Cost.

`predict` classifies so as to minimize the expected
classification cost:

$$\widehat{y}=\underset{y=1,\mathrm{...},K}{\mathrm{arg}\mathrm{min}}{\displaystyle \sum _{k=1}^{K}\widehat{P}\left(k|x\right)C\left(y|k\right)},$$

where

- $$\widehat{y}$$ is the predicted classification.
- *K* is the number of classes.
- $$\widehat{P}\left(k|x\right)$$ is the posterior probability of class *k* for observation *x*.
- $$C\left(y|k\right)$$ is the cost of classifying an observation as *y* when its true class is *k*.
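The minimization above can be reproduced from the outputs of `predict`. This sketch assumes that `Mdl.Cost(k,y)` stores the cost of classifying an observation as class `y` when its true class is `k`, so that the sum over `k` becomes the matrix product `score*Mdl.Cost`. The iris data is used only for illustration.

```matlab
load fisheriris
Mdl = fitcdiscr(meas,species);
[label,score,cost] = predict(Mdl,meas);

% Expected cost of each candidate class: sum over k of P(k|x)*C(y|k).
expCost = score*Mdl.Cost;
maxDiff = max(abs(expCost(:) - cost(:)));

% The predicted label is the class with the smallest expected cost.
[~,idx] = min(cost,[],2);
labelFromCost = Mdl.ClassNames(idx);
sameLabels = isequal(labelFromCost,label);
```

With the default cost matrix (0 on the diagonal, 1 elsewhere), minimizing the expected cost is the same as choosing the class with the largest posterior probability.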

Calculate with arrays that have more rows than fit in memory.

This function fully supports tall arrays. For more information, see Tall Arrays (MATLAB).

Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

- Use `saveLearnerForCoder`, `loadLearnerForCoder`, and `codegen` to generate code for the `predict` function. Save a trained model by using `saveLearnerForCoder`. Define an entry-point function that loads the saved model by using `loadLearnerForCoder` and calls the `predict` function. Then use `codegen` to generate code for the entry-point function.
- This table contains notes about the arguments of `predict`. Arguments not included in this table are fully supported.

| Argument | Notes and Limitations |
| --- | --- |
| `Mdl` | For the usage notes and limitations of the model object, see Code Generation of the `CompactClassificationDiscriminant` object. |
| `X` | Must be a single-precision or double-precision matrix and can be variable-size. However, the number of columns in `X` must be `numel(Mdl.PredictorNames)`. Rows and columns must correspond to observations and predictors, respectively. |

For more information, see Introduction to Code Generation.
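A minimal sketch of that workflow follows. The file name `DiscrMdl` and the entry-point function name `predictDiscr` are hypothetical, and the iris data is an illustrative choice. The entry-point function and the `codegen` call appear as comments because the function must live in its own file and code generation requires MATLAB Coder. The executable lines only confirm that the saved model reloads and predicts the same labels in MATLAB.

```matlab
load fisheriris
Mdl = fitcdiscr(meas,species);
saveLearnerForCoder(Mdl,'DiscrMdl');        % writes DiscrMdl.mat (hypothetical file name)

% Entry-point function, saved in its own file predictDiscr.m (hypothetical name):
%
%   function label = predictDiscr(X) %#codegen
%   Mdl = loadLearnerForCoder('DiscrMdl');
%   label = predict(Mdl,X);
%   end
%
% Generate code for a double matrix with 4 columns and a variable number of rows:
%
%   codegen predictDiscr -args {coder.typeof(0,[Inf 4],[1 0])}

% The same two calls can be checked in MATLAB without generating code.
MdlLoaded = loadLearnerForCoder('DiscrMdl');
labelLoaded = predict(MdlLoaded,meas);
sameLabels = isequal(labelLoaded,predict(Mdl,meas));
```

The 4 in the `coder.typeof` call matches `numel(Mdl.PredictorNames)` for this data, in line with the column requirement on `X`.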

`ClassificationDiscriminant` | `CompactClassificationDiscriminant` | `edge` | `fitcdiscr` | `loss` | `margin`
