# predict

Classify observations using support vector machine (SVM) classifier

## Syntax

`label = predict(SVMModel,X)`

`[label,score] = predict(SVMModel,X)`

## Description


`label = predict(SVMModel,X)` returns a vector of predicted class labels for the predictor data in the table or matrix `X`, based on the trained support vector machine (SVM) classification model `SVMModel`. The trained SVM model can be either full or compact.


`[label,score] = predict(SVMModel,X)` also returns a matrix of scores (`score`) indicating the likelihood that a label comes from a particular class. For SVM, likelihood measures are either classification scores or class posterior probabilities. For each observation in `X`, the predicted class label corresponds to the maximum score among all classes.

## Examples


Load the `ionosphere` data set.

```
load ionosphere
rng(1); % For reproducibility
```

Train an SVM classifier. Specify a 15% holdout sample for testing, standardize the data, and specify that `'g'` is the positive class.

```
CVSVMModel = fitcsvm(X,Y,'Holdout',0.15,'ClassNames',{'b','g'},...
    'Standardize',true);
CompactSVMModel = CVSVMModel.Trained{1}; % Extract trained, compact classifier
testInds = test(CVSVMModel.Partition);   % Extract the test indices
XTest = X(testInds,:);
YTest = Y(testInds,:);
```

`CVSVMModel` is a `ClassificationPartitionedModel` classifier. It contains the property `Trained`, which is a 1-by-1 cell array holding a `CompactClassificationSVM` classifier that the software trained using the training set.

Label the test sample observations. Display the results for the first 10 observations in the test sample.

```
[label,score] = predict(CompactSVMModel,XTest);
table(YTest(1:10),label(1:10),score(1:10,2),'VariableNames',...
    {'TrueLabel','PredictedLabel','Score'})
```

```
ans=10×3 table
    TrueLabel    PredictedLabel     Score
    _________    ______________    ________

      {'b'}          {'b'}          -1.7175
      {'g'}          {'g'}           2.0001
      {'b'}          {'b'}          -9.6841
      {'g'}          {'g'}           2.5614
      {'b'}          {'b'}          -1.5479
      {'g'}          {'g'}           2.0983
      {'b'}          {'b'}          -2.7013
      {'b'}          {'b'}         -0.66323
      {'g'}          {'g'}           1.6048
      {'g'}          {'g'}           1.7731
```

Label new observations using an SVM classifier.

Load the ionosphere data set. Assume that the last 10 observations become available after you train the SVM classifier.

```
load ionosphere
rng(1); % For reproducibility
n = size(X,1);       % Training sample size
isInds = 1:(n-10);   % In-sample indices
oosInds = (n-9):n;   % Out-of-sample indices
```

Train an SVM classifier. Standardize the data and specify that `'g'` is the positive class. Conserve memory by reducing the size of the trained SVM classifier.

```
SVMModel = fitcsvm(X(isInds,:),Y(isInds),'Standardize',true,...
    'ClassNames',{'b','g'});
CompactSVMModel = compact(SVMModel);
whos('SVMModel','CompactSVMModel')
```

```
  Name                 Size      Bytes   Class                                                 Attributes

  CompactSVMModel      1x1       30314   classreg.learning.classif.CompactClassificationSVM
  SVMModel             1x1      137414   ClassificationSVM
```

The `CompactClassificationSVM` classifier (`CompactSVMModel`) uses less space than the `ClassificationSVM` classifier (`SVMModel`) because `SVMModel` stores the data.

Estimate the optimal score-to-posterior-probability transformation function.

```
CompactSVMModel = fitPosterior(CompactSVMModel,...
    X(isInds,:),Y(isInds))
```

```
CompactSVMModel =
  classreg.learning.classif.CompactClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: '@(S)sigmoid(S,-1.968453e+00,3.121375e-01)'
                    Alpha: [88x1 double]
                     Bias: -0.2143
         KernelParameters: [1x1 struct]
                       Mu: [1x34 double]
                    Sigma: [1x34 double]
           SupportVectors: [88x34 double]
      SupportVectorLabels: [88x1 double]

  Properties, Methods
```

The optimal score transformation function (`CompactSVMModel.ScoreTransform`) is the sigmoid function because the classes are inseparable.

Predict the out-of-sample labels and positive class posterior probabilities. Because true labels are available, compare them with the predicted labels.

```
[labels,PostProbs] = predict(CompactSVMModel,X(oosInds,:));
table(Y(oosInds),labels,PostProbs(:,2),'VariableNames',...
    {'TrueLabels','PredictedLabels','PosClassPosterior'})
```

```
ans=10×3 table
    TrueLabels    PredictedLabels    PosClassPosterior
    __________    _______________    _________________

      {'g'}           {'g'}               0.98419
      {'g'}           {'g'}               0.95545
      {'g'}           {'g'}               0.67794
      {'g'}           {'g'}               0.94447
      {'g'}           {'g'}               0.98744
      {'g'}           {'g'}               0.92481
      {'g'}           {'g'}                0.9711
      {'g'}           {'g'}               0.96986
      {'g'}           {'g'}               0.97803
      {'g'}           {'g'}               0.94361
```

`PostProbs` is a 10-by-2 matrix, where the first column is the negative class posterior probabilities, and the second column is the positive class posterior probabilities corresponding to the new observations.

## Input Arguments


SVM classification model, specified as a `ClassificationSVM` model object or `CompactClassificationSVM` model object returned by `fitcsvm` or `compact`, respectively.

Predictor data to be classified, specified as a numeric matrix or table.

Each row of `X` corresponds to one observation, and each column corresponds to one variable.

• For a numeric matrix:

• The variables in the columns of `X` must have the same order as the predictor variables that trained `SVMModel`.

• If you trained `SVMModel` using a table (for example, `Tbl`) and `Tbl` contains all numeric predictor variables, then `X` can be a numeric matrix. To treat numeric predictors in `Tbl` as categorical during training, identify categorical predictors by using the `CategoricalPredictors` name-value pair argument of `fitcsvm`. If `Tbl` contains heterogeneous predictor variables (for example, numeric and categorical data types) and `X` is a numeric matrix, then `predict` throws an error.

• For a table:

• `predict` does not support multicolumn variables and cell arrays other than cell arrays of character vectors.

• If you trained `SVMModel` using a table (for example, `Tbl`), then all predictor variables in `X` must have the same variable names and data types as those that trained `SVMModel` (stored in `SVMModel.PredictorNames`). However, the column order of `X` does not need to correspond to the column order of `Tbl`. Also, `Tbl` and `X` can contain additional variables (response variables, observation weights, and so on), but `predict` ignores them.

• If you trained `SVMModel` using a numeric matrix, then the predictor names in `SVMModel.PredictorNames` and corresponding predictor variable names in `X` must be the same. To specify predictor names during training, see the `PredictorNames` name-value pair argument of `fitcsvm`. All predictor variables in `X` must be numeric vectors. `X` can contain additional variables (response variables, observation weights, and so on), but `predict` ignores them.

If you set `'Standardize',true` in `fitcsvm` to train `SVMModel`, then the software standardizes the columns of `X` using the corresponding means in `SVMModel.Mu` and the standard deviations in `SVMModel.Sigma`.
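As a sanity check, the standardization that `predict` applies internally can be reproduced by hand. This is only a sketch, reusing the `SVMModel` and `XTest` names from the examples above and assuming the model was trained with `'Standardize',true`:

```
% Sketch: reproduce the standardization predict applies internally
% (assumes SVMModel was trained with 'Standardize',true).
XStd = (XTest - SVMModel.Mu) ./ SVMModel.Sigma; % center and scale each column
```

The elementwise expansion above requires R2016b or later; on earlier releases, use `bsxfun` instead.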

Data Types: `table` | `double` | `single`

## Output Arguments


Predicted class labels, returned as a categorical or character array, logical or numeric vector, or cell array of character vectors.

`label` has the same data type as the observed class labels (`Y`) that trained `SVMModel`, and its length is equal to the number of rows in `X`. (The software treats string arrays as cell arrays of character vectors.)

For one-class learning, `label` is the one class represented in the observed class labels.

Predicted class scores or posterior probabilities, returned as a numeric column vector or numeric matrix.

• For one-class learning, `score` is a column vector with the same number of rows as the observations (`X`). The elements of `score` are the positive class scores for the corresponding observations. You cannot obtain posterior probabilities for one-class learning.

• For two-class learning, `score` is a two-column matrix with the same number of rows as `X`.

• If you fit the optimal score-to-posterior-probability transformation function using `fitPosterior` or `fitSVMPosterior`, then `score` contains class posterior probabilities. That is, if the value of `SVMModel.ScoreTransform` is not `none`, then the first and second columns of `score` contain the negative class (`SVMModel.ClassNames{1}`) and positive class (`SVMModel.ClassNames{2}`) posterior probabilities for the corresponding observations, respectively.

• Otherwise, the first column contains the negative class scores and the second column contains the positive class scores for the corresponding observations.

If `SVMModel.KernelParameters.Function` is `'linear'`, then the classification score for the observation x is

`$f(x) = (x/s)'\beta + b.$`

`SVMModel` stores β, b, and s in the properties `Beta`, `Bias`, and `KernelParameters.Scale`, respectively.

To estimate classification scores manually, you must first apply any transformations to the predictor data that were applied during training. Specifically, if you specify `'Standardize',true` when using `fitcsvm`, then you must standardize the predictor data manually by using the mean `SVMModel.Mu` and standard deviation `SVMModel.Sigma`, and then divide the result by the kernel scale in `SVMModel.KernelParameters.Scale`.

All SVM functions, such as `resubPredict` and `predict`, apply any required transformation before estimation.

If `SVMModel.KernelParameters.Function` is not `'linear'`, then `Beta` is empty (`[]`).
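Putting the pieces above together, the linear-kernel scores can be reproduced manually. This is a sketch, not the documented API: it assumes a linear-kernel `SVMModel` trained with `'Standardize',true` and reuses the `XTest` name from the examples above.

```
% Sketch: manually reproduce positive-class scores for a linear kernel
% (assumes SVMModel is linear and was trained with 'Standardize',true).
s    = SVMModel.KernelParameters.Scale;
XStd = (XTest - SVMModel.Mu) ./ SVMModel.Sigma; % standardize as in training
f    = (XStd/s)*SVMModel.Beta + SVMModel.Bias;  % f(x) = (x/s)'*beta + b
```

The vector `f` should match the second column of the `score` output of `predict` (before any score transformation).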

## More About

### Classification Score

The SVM classification score for classifying observation x is the signed distance from x to the decision boundary, ranging from –∞ to +∞. A positive score for a class indicates that x is predicted to be in that class. A negative score indicates otherwise.

The positive class classification score $f\left(x\right)$ is the trained SVM classification function. $f\left(x\right)$ is also the numerical, predicted response for x, or the score for predicting x into the positive class.

`$f(x) = \sum_{j=1}^{n} \alpha_j y_j G(x_j,x) + b,$`

where $(\alpha_1,\ldots,\alpha_n,b)$ are the estimated SVM parameters, $G(x_j,x)$ is the dot product in the predictor space between x and the support vectors, and the sum includes the training set observations. The negative class classification score for x, or the score for predicting x into the negative class, is –f(x).

If $G(x_j,x) = x_j'x$ (the linear kernel), then the score function reduces to

`$f(x) = (x/s)'\beta + b.$`

s is the kernel scale and β is the vector of fitted linear coefficients.

For more details, see Understanding Support Vector Machines.

### Posterior Probability

The posterior probability is the probability that an observation belongs in a particular class, given the data.

For SVM, the posterior probability $P(s_j)$ that observation j is in class k = {–1,1} is a function of its score $s_j$.

• For separable classes, the posterior probability is the step function

`$P(s_j) = \begin{cases} 0; & s_j < \max\limits_{y_k=-1} s_k \\ \pi; & \max\limits_{y_k=-1} s_k \le s_j \le \min\limits_{y_k=+1} s_k \\ 1; & s_j > \min\limits_{y_k=+1} s_k, \end{cases}$`

where:

• sj is the score of observation j.

• +1 and –1 denote the positive and negative classes, respectively.

• π is the prior probability that an observation is in the positive class.

• For inseparable classes, the posterior probability is the sigmoid function

`$P(s_j) = \frac{1}{1 + \exp(A s_j + B)},$`

where the parameters A and B are the slope and intercept parameters, respectively.
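The sigmoid posterior is easy to evaluate once A and B are fitted. As a sketch, this uses the slope and intercept reported in the `ScoreTransform` of the example above (`sigmoid(S,-1.968453e+00,3.121375e-01)`):

```
% Sketch: evaluate the fitted sigmoid posterior P(s) = 1/(1 + exp(A*s + B))
% using the A and B from this page's fitPosterior example output.
A = -1.968453;
B = 0.312137;
posterior = @(s) 1 ./ (1 + exp(A*s + B));
posterior([-2 0 2]) % posteriors increase with the score because A < 0
```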

### Prior Probability

The prior probability of a class is the believed relative frequency with which observations from that class occur in a population.

## Tips

• If you are using a linear SVM model for classification and the model has many support vectors, then using `predict` for the prediction method can be slow. To efficiently classify observations based on a linear SVM model, remove the support vectors from the model object by using `discardSupportVectors`.
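As a sketch of that tip (assuming `SVMModel` was trained with the default linear kernel, and reusing `XTest` from the examples above):

```
% Sketch: drop the support vectors from a linear SVM so that predict
% computes scores directly from Beta and Bias instead of the dual form.
SVMModel = discardSupportVectors(SVMModel);
labels = predict(SVMModel,XTest);
```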

## Algorithms

• By default and irrespective of the model kernel function, MATLAB® uses the dual representation of the score function to classify observations based on trained SVM models, specifically

`$\hat{f}(x) = \sum_{j=1}^{n} \hat{\alpha}_j y_j G(x,x_j) + \hat{b}.$`

This prediction method requires the trained support vectors and α coefficients (see the `SupportVectors` and `Alpha` properties of the SVM model).

• By default, the software computes optimal posterior probabilities using Platt’s method [1]:

1. Perform 10-fold cross-validation.

2. Fit the sigmoid function parameters to the scores returned from the cross-validation.

3. Estimate the posterior probabilities by entering the cross-validation scores into the fitted sigmoid function.

• The software incorporates prior probabilities in the SVM objective function during training.

• For SVM, `predict` and `resubPredict` classify observations into the class yielding the largest score (the largest posterior probability). The software accounts for misclassification costs by applying the average-cost correction before training the classifier. That is, given the class prior vector P, misclassification cost matrix C, and observation weight vector w, the software defines a new vector of observation weights (W) such that

`$W_j = w_j P_j \sum_{k=1}^{K} C_{jk}.$`
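The average-cost correction can be sketched numerically. The priors, costs, and weights below are hypothetical values for illustration, not from the examples above:

```
% Sketch of the average-cost correction W_j = w_j * P_j * sum_k C_jk,
% with hypothetical two-class priors P, cost matrix C, and weights w.
P = [0.4; 0.6];        % class prior probabilities
C = [0 2; 1 0];        % C(j,k): cost of classifying class j into class k
w = [1; 1];            % observation weights (one representative per class)
W = w .* P .* sum(C,2) % corrected weights: [0.8; 0.6]
```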

## References

[1] Platt, J. “Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods.” Advances in Large Margin Classifiers. MIT Press, 1999, pages 61–74.