Tobit

Create Tobit model object for loss given default

Since R2021a

Description

Create and analyze a Tobit model object to calculate loss given default (LGD) using this workflow:

Use fitLGDModel to create a Tobit model object.
Use predict to predict the LGD.
Use modelDiscrimination to return AUROC and ROC data. You can plot the results using modelDiscriminationPlot.
Use modelCalibration to return the R-squared, RMSE, correlation, and sample mean error of predicted and observed LGD data. You can plot the results using modelCalibrationPlot.

Creation

Syntax

TobitLGDModel = fitLGDModel(data,ModelType)

TobitLGDModel = fitLGDModel(___,Name,Value)

Description

TobitLGDModel = fitLGDModel(data,ModelType) creates a Tobit LGD model object.

TobitLGDModel = fitLGDModel(___,Name,Value) specifies options using one or more name-value arguments in addition to the input arguments in the previous syntax. The optional name-value arguments set the model object properties. For example, lgdModel = fitLGDModel(data,'tobit',PredictorVars={'LTV' 'Age' 'Type'},ResponseVar="LGD",CensoringSide="left",LeftLimit=1e-4,WeightsVar="Weights") creates an lgdModel object using a Tobit model type.

Input Arguments

`data` — Data for loss given default
table

Data for loss given default, specified as a table.

Data Types: table

`ModelType` — Model type
string with value `"Tobit"` | character vector with value `'Tobit'`

Model type, specified as a string with the value of "Tobit" or a character vector with the value of 'Tobit'.

Data Types: char | string

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: lgdModel = fitLGDModel(data,'tobit',PredictorVars={'LTV' 'Age' 'Type'},ResponseVar="LGD",CensoringSide="left",LeftLimit=1e-4)
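
The two name-value syntaxes described above can be compared side by side. This is a minimal sketch, assuming the LGDData.mat sample data set used in the example on this page is on the path; the variable names mdl1 and mdl2 are illustrative only.

```matlab
load LGDData.mat   % provides the table "data" with columns LTV, Age, Type, LGD

% Name=Value syntax (available since R2021a)
mdl1 = fitLGDModel(data,"Tobit",ResponseVar="LGD",CensoringSide="left");

% Comma-separated syntax with quoted names
mdl2 = fitLGDModel(data,"Tobit","ResponseVar","LGD","CensoringSide","left");

% Both calls should set the same property values
disp(mdl1.CensoringSide == mdl2.CensoringSide)
```

Both forms set the same object properties, so the choice between them is a matter of style and release compatibility.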

`ModelID` — User-defined model ID
`"Tobit"` (default) | string | character vector

User-defined model ID, specified as the comma-separated pair consisting of 'ModelID' and a string or character vector. The software uses the ModelID text to format outputs, so the ID is expected to be short.

Data Types: string | char

`Description` — User-defined description for model
`""` (default) | string | character vector

User-defined description for model, specified as the comma-separated pair consisting of 'Description' and a string or character vector.

Data Types: string | char

`PredictorVars` — Predictor variables
all columns of `data` except for `ResponseVar` (default) | string array | cell array of character vectors

Predictor variables, specified as the comma-separated pair consisting of 'PredictorVars' and a string array or cell array of character vectors. PredictorVars indicates which columns in the data input contain the predictor information. By default, PredictorVars is set to all the columns in the data input except for ResponseVar.

Data Types: string | cell

`ResponseVar` — Response variable
last column of `data` (default) | string | character vector

Response variable, specified as the comma-separated pair consisting of 'ResponseVar' and a string or character vector. The response variable contains the LGD data and must be a numeric variable. An LGD value of 0 indicates no loss (full recovery), 1 indicates total loss (no recovery), and values between 0 and 1 indicate a partial loss. By default, ResponseVar is set to the last column.

Data Types: string | char

`CensoringSide` — Censoring side
`"both"` (default) | character vector with value of `'left'`, `'right'`, or `'both'` | string with value of `"left"`, `"right"`, or `"both"`

Censoring side, specified as the comma-separated pair consisting of 'CensoringSide' and a character vector or string. CensoringSide indicates whether the desired Tobit model is left-censored, right-censored, or censored on both sides.

Data Types: string | char

`LeftLimit` — Left-censoring limit
`0` (default) | numeric between `0` and `1`

Left-censoring limit, specified as the comma-separated pair consisting of 'LeftLimit' and a scalar numeric between 0 and 1.

Data Types: double

`RightLimit` — Right-censoring limit
`1` (default) | numeric between `0` and `1`

Right-censoring limit, specified as the comma-separated pair consisting of 'RightLimit' and a scalar numeric between 0 and 1.

Data Types: double

`SolverOptions` — `optimoptions` object
object

Options for fitting, specified as the comma-separated pair consisting of 'SolverOptions' and an optimoptions object that is created using optimoptions from Optimization Toolbox™. The defaults for the optimoptions object are:

"Display" — "none"
"Algorithm" — "sqp"
"MaxFunctionEvaluations" — 500 ✕ Number of model coefficients
"MaxIterations" — The number of Tobit model coefficients is determined at run time; it depends on the number of predictors and the number of categories in the categorical predictors.

Note

When using optimoptions with a Tobit model, specify the SolverName as fmincon.

Data Types: object
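
As a rough illustration of the SolverOptions argument, the following sketch builds an optimoptions object for fmincon and passes it to fitLGDModel. The particular option values (iterative display, 1000 iterations) are assumptions chosen for illustration, not recommended settings, and the LGDData.mat sample data set is assumed to be available.

```matlab
load LGDData.mat   % provides the table "data"

% Create solver options for fmincon, as required for Tobit models.
% 'sqp' is set explicitly so the algorithm matches the documented default.
opts = optimoptions('fmincon', ...
    'Algorithm','sqp', ...
    'Display','iter', ...       % show solver progress (illustrative choice)
    'MaxIterations',1000);      % illustrative iteration cap

% Fit the Tobit LGD model with the custom solver options
lgdModel = fitLGDModel(data,'Tobit','SolverOptions',opts);
disp(lgdModel.UnderlyingModel)
```

Showing the iterations can help diagnose slow or failed convergence of the maximum likelihood fit.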

`WeightsVar` — Column name containing weights
`""` (default) | string scalar

Column name of the input table containing weights, specified as a string scalar.

Note

The default value ("") results in a weight of 1 for each row in data. All weight values in data must be nonnegative.

For an example using WeightsVar, see Create Weighted LGD Model.
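
A minimal sketch of a weighted fit follows. The weighting scheme is hypothetical and exists only to produce a nonnegative Weights column; the LGDData.mat sample data set is assumed to be available.

```matlab
load LGDData.mat   % provides the table "data" with columns LTV, Age, Type, LGD

% Hypothetical weighting scheme (an assumption for illustration only):
% rows for investment properties get weight 2, all other rows get weight 1.
data.Weights = 1 + double(string(data.Type) == "investment");

% Name the predictors and the response explicitly, because the appended
% Weights column is now the last column of the table.
lgdModel = fitLGDModel(data,"Tobit", ...
    PredictorVars=["LTV" "Age" "Type"], ...
    ResponseVar="LGD", ...
    WeightsVar="Weights");

disp(lgdModel.WeightsVar)
```

Specifying PredictorVars and ResponseVar avoids relying on the column-order defaults once the weights column has been added to the table.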

Properties

`ModelID` — User-defined model ID
`Tobit` (default) | string

User-defined model ID, returned as a string.

Data Types: string

`Description` — User-defined description
`""` (default) | string

User-defined description, returned as a string.

Data Types: string

`UnderlyingModel` — Underlying statistical model
compact linear model

This property is read-only.

Underlying statistical model, returned as a compact linear model object. The compact version of the underlying regression model is an instance of the classreg.regr.CompactLinearModel class. For more information, see fitlm and CompactLinearModel.

Data Types: CompactLinearModel

`PredictorVars` — Predictor variables
all columns of `data` except for the `ResponseVar` (default) | string array

Predictor variables, returned as a string array.

Data Types: string

`ResponseVar` — Response variable
last column of `data` (default) | string

Response variable, returned as a string.

Data Types: string

`CensoringSide` — Censoring side
`"both"` (default) | string with value of `"left"`, `"right"`, or `"both"`

This property is read-only.

Censoring side, returned as a string.

Data Types: string

`LeftLimit` — Left-censoring limit
`0` (default) | numeric between `0` and `1`

This property is read-only.

Left-censoring limit, returned as a scalar numeric between 0 and 1.

Data Types: double

`RightLimit` — Right-censoring limit
`1` (default) | numeric between `0` and `1`

This property is read-only.

Right-censoring limit, returned as a scalar numeric between 0 and 1.

Data Types: double

`WeightsVar` — Column name containing weights
`""` (default) | string scalar

Column name of the input table containing weights, returned as a string scalar. This property is also used to determine the weights variable for validation data when you use the modelDiscrimination or modelCalibration functions.

Object Functions

`predict`	Predict loss given default
`modelDiscrimination`	Compute AUROC and ROC data
`modelDiscriminationPlot`	Plot ROC curve
`modelCalibration`	Compute R-squared, RMSE, correlation, and sample mean error of predicted and observed LGDs
`modelCalibrationPlot`	Scatter plot of predicted and observed LGDs

Examples

Create Tobit LGD Model

This example shows how to use fitLGDModel to create a Tobit model for loss given default (LGD).

Load LGD Data

Load the LGD data.

load LGDData.mat
head(data)

      LTV        Age         Type           LGD   
    _______    _______    ___________    _________

    0.89101    0.39716    residential     0.032659
    0.70176     2.0939    residential      0.43564
    0.72078     2.7948    residential    0.0064766
    0.37013      1.237    residential     0.007947
    0.36492     2.5818    residential            0
      0.796     1.5957    residential      0.14572
    0.60203     1.1599    residential     0.025688
    0.92005    0.50253    investment      0.063182

rng('default');
NumObs = height(data);
c = cvpartition(NumObs,'HoldOut',0.4);
TrainingInd = training(c);
TestInd = test(c);

Create Tobit LGD Model

Use fitLGDModel to create a Tobit model using the TrainingInd data.

lgdModel = fitLGDModel(data(TrainingInd,:),'Tobit',...
   'ModelID','Example Tobit',...
   'PredictorVars',{'LTV' 'Age' 'Type'},...
   'ResponseVar','LGD',...
   'CensoringSide','left',...
   'LeftLimit',1e-4);
disp(lgdModel)

  Tobit with properties:

      CensoringSide: "left"
          LeftLimit: 1.0000e-04
         RightLimit: 1
            Weights: [0x1 double]
            ModelID: "Example Tobit"
        Description: ""
    UnderlyingModel: [1x1 risk.internal.credit.TobitModel]
      PredictorVars: ["LTV"    "Age"    "Type"]
        ResponseVar: "LGD"
         WeightsVar: ""

Display the underlying model. The underlying model is a left-censored Tobit model. Use the 'CensoringSide' argument and the 'LeftLimit' and 'RightLimit' arguments to modify the underlying Tobit model.

disp(lgdModel.UnderlyingModel)

Tobit regression model, left-censored:
     LGD = max(0.0001,Y*)
     Y* ~ 1 + LTV + Age + Type

Estimated coefficients:
                       Estimate       SE         tStat       pValue  
                       ________    _________    _______    __________

    (Intercept)        0.057356     0.026585     2.1575      0.031083
    LTV                  0.2003     0.030596     6.5464    7.3912e-11
    Age                -0.09405    0.0072999    -12.884             0
    Type_investment     0.10071     0.017922     5.6193    2.1732e-08
    (Sigma)             0.28833    0.0055224     52.211             0

Number of observations: 2093
Number of left-censored observations: 547
Number of uncensored observations: 1546
Number of right-censored observations: 0
Log-likelihood: -638.353

Predict LGD

For Tobit models, use predict to calculate the predicted LGD value, which is the unconditional expected value of the response, given the predictor values.

predictedLGD = predict(lgdModel,data(TestInd,:))

predictedLGD = 1394×1

    0.0871
    0.1228
    0.3181
    0.0926
    0.1654
    0.2215
    0.2347
    0.0102
    0.1576
    0.1969
      ⋮

Validate LGD Model

Use modelDiscriminationPlot to plot the ROC curve.

modelDiscriminationPlot(lgdModel,data(TestInd,:))

Figure: ROC curve titled "ROC Example Tobit, AUROC = 0.67989", with False Positive Rate on the x-axis and True Positive Rate on the y-axis.

Use modelCalibrationPlot to show a scatter plot of the predictions.

modelCalibrationPlot(lgdModel,data(TestInd,:))

Figure: Scatter plot titled "Scatter Example Tobit, R-Squared: 0.0855", with LGD Predicted on the x-axis and LGD Observed on the y-axis, showing the data points and a fitted line.
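
The plots have numeric counterparts. The following sketch, which refits a comparable model so that it can run on its own, calls modelDiscrimination and modelCalibration to return the metrics instead of plotting them. It assumes the LGDData.mat sample data set; the partition variable c and the output names are illustrative.

```matlab
load LGDData.mat
rng('default');                                  % reproducible partition
c = cvpartition(height(data),'HoldOut',0.4);

% Fit a left-censored Tobit LGD model on the training rows
lgdModel = fitLGDModel(data(training(c),:),'Tobit', ...
    'PredictorVars',{'LTV' 'Age' 'Type'}, ...
    'ResponseVar','LGD', ...
    'CensoringSide','left', ...
    'LeftLimit',1e-4);

% Numeric validation metrics on the test rows
DiscMeasure = modelDiscrimination(lgdModel,data(test(c),:));  % AUROC
CalMeasure  = modelCalibration(lgdModel,data(test(c),:));     % R-squared, RMSE, ...

disp(DiscMeasure)
disp(CalMeasure)
```

Returning the metrics as values makes it easier to compare several candidate models programmatically.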

More About

Loss Given Default Tobit Models

The loss given default (LGD) Tobit models fit a Tobit model to LGD data.

Tobit models are “censored” regression models. Tobit models assume that the response variable can be observed only within certain limits, and no value outside the limits can be observed. In the case of LGD models, the limits are typically 0 (total recovery or cure) and 1 (total loss). A distribution of response values where there is a high frequency of observations at the limits is consistent with the model assumptions. For LGD models, it is common to have distributions with a high proportion of cures, or high proportion of total losses, or both.

The Tobit model combines the following two formulas:

$\begin{array}{l} Y = \min \{ \max \{ L, Y^{*} \}, R \} \\ Y^{*} = β_{0} + β_{1} X_{1} + \cdots + β_{p} X_{p} + σ ε = X β + σ ε \end{array}$

where

Y is the observed response variable, the observed LGD data for an LGD model.
L is the left limit, the lower bound for the response values, typically 0 for LGD models.
R is the right limit, the upper bound for the response values, typically 1 for LGD models.
Y^* is a latent, unobserved variable.
β_j is the coefficient of the jth predictor (or the intercept for j = 0).
σ is the standard deviation of the error term.
ε is the error term, assumed to follow a standard normal distribution.

The first formula above is written using min and max operators and is equivalent to

$Y = \begin{cases} L & \text{if } Y^{*} \leq L \\ Y^{*} & \text{if } L < Y^{*} < R \\ R & \text{if } Y^{*} \geq R \end{cases}$

The standard deviation of the error is explicitly indicated in the formulas. Unlike traditional regression least-squares estimation, where the standard deviation of the error can be inferred from the residuals, for Tobit models the estimation is via maximum likelihood and the standard deviation needs to be handled explicitly during the estimation. If there are p predictor variables, the Tobit model estimates p+2 coefficients, namely, one coefficient for each predictor, plus an intercept, plus a standard deviation.

Three censoring side options are supported in the Tobit LGD models with the CensoringSide name-value argument:

'both' — This is the default option, with censoring on both sides. The estimation uses left and right limits.
'left' — The left-censored version of the model has no right limit (or R = ∞). The relationship between Y and Y^* is Y = max{L, Y^*}.
'right' — The right-censored version of the model has no left limit (or L = -∞). The relationship between Y and Y^* is Y = min{Y^*, R}.
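
The three censoring options can be illustrated with a few latent values. This is a small sketch in base MATLAB; the numbers in Ystar are hypothetical and chosen so that some values fall outside the limits.

```matlab
% Latent values Y* (hypothetical numbers chosen for illustration)
Ystar = [-0.30 0.00 0.25 0.80 1.40];
L = 0;   % left limit
R = 1;   % right limit

Yboth  = min(max(L, Ystar), R);  % 'both':  Y = min{max{L,Y*},R}
Yleft  = max(L, Ystar);          % 'left':  Y = max{L,Y*}
Yright = min(Ystar, R);          % 'right': Y = min{Y*,R}

% Rows: latent values, then the observed values under each option
disp([Ystar; Yboth; Yleft; Yright])
```

Only the 'both' option keeps every observed value inside the interval from L to R; the one-sided options leave the other tail of the latent variable unchanged.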

The parameters of the Tobit model are estimated using maximum likelihood. For observation i = 1,…,n, the likelihood function is

$LF (β, σ | X_{i}, Y_{i}) = \begin{cases} Φ (L; X_{i} β, σ) & \text{if } Y_{i} \leq L \\ ϕ (Y_{i}; X_{i} β, σ) & \text{if } L < Y_{i} < R \\ 1 - Φ (R; X_{i} β, σ) & \text{if } Y_{i} \geq R \end{cases}$

where

$Φ (x; m, s)$ is the cumulative distribution function of the normal distribution with mean m and standard deviation s, evaluated at x.
$ϕ (x; m, s)$ is the density function of the normal distribution with mean m and standard deviation s, evaluated at x.

This likelihood function is for models censored on both sides. For left-censored models, the right limit has no effect, and the likelihood function has two cases only (R = ∞); likewise for right-censored models (L = -∞).

The log-likelihood function is the sum of the logarithm of the likelihood functions for individual observations

$LLF (β, σ | X, Y) = \sum_{i = 1}^{n} \log (LF (β, σ | X_{i}, Y_{i}))$

The parameters are estimated by maximizing the log-likelihood function. The only constraint is that the σ parameter must be positive.
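
To make the likelihood concrete, the following sketch evaluates the log-likelihood of a two-sided Tobit model at fixed parameter values using only base MATLAB functions. The toy data, the parameter values, and the helper names Phi and phi are assumptions for illustration; this is not the fitting code used by fitLGDModel.

```matlab
% Standard normal CDF and PDF written with base MATLAB functions only
Phi = @(z) 0.5*erfc(-z./sqrt(2));
phi = @(z) exp(-0.5*z.^2)./sqrt(2*pi);

% Hypothetical toy data: intercept column plus one predictor
X = [1 0.2; 1 0.5; 1 0.9; 1 0.4];
Y = [0; 0.3; 1; 0.6];
L = 0;  R = 1;

% Hypothetical parameter values at which to evaluate the likelihood
beta  = [0.1; 0.7];
sigma = 0.3;            % must be positive

mu = X*beta;            % latent mean X_i*beta for each observation

isLeft  = Y <= L;               % left-censored observations
isRight = Y >= R;               % right-censored observations
isMid   = ~isLeft & ~isRight;   % uncensored observations

LF = zeros(size(Y));
LF(isLeft)  = Phi((L - mu(isLeft))./sigma);               % Phi(L; X_i*beta, sigma)
LF(isMid)   = phi((Y(isMid) - mu(isMid))./sigma)./sigma;  % phi(Y_i; X_i*beta, sigma)
LF(isRight) = 1 - Phi((R - mu(isRight))./sigma);          % 1 - Phi(R; X_i*beta, sigma)

LLF = sum(log(LF));     % log-likelihood over all observations
disp(LLF)
```

A maximum likelihood fit would maximize LLF over beta and sigma subject to sigma being positive, which is the role the constrained solver plays during fitting.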

To predict an LGD value, Tobit LGD models return the unconditional expected value of the response, given the predictor values

$LGD_{i}^{\text{pred}} = E [Y_{i} | X_{i}]$

The expression for the expected value can be separated into the cases

$\begin{array}{l} E [Y] = E [Y | Y = L] P (Y = L) \\ + E [Y | L < Y < R] P (L < Y < R) \\ + E [Y | Y = R] P (Y = R) \end{array}$

Using the previous expression and the properties of the (truncated) normal distribution, it follows that

$E [Y_{i} | X_{i}] = Φ (a_{i}) L + (Φ (b_{i}) - Φ (a_{i})) (X_{i} β + σ λ_{i}) + (1 - Φ (b_{i})) R$

where

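$a_{i} = \frac{L - X_{i} β}{σ}, \quad b_{i} = \frac{R - X_{i} β}{σ}, \quad \text{and} \quad λ_{i} = \frac{ϕ (a_{i}) - ϕ (b_{i})}{Φ (b_{i}) - Φ (a_{i})}$

Here Φ and ϕ with a single argument denote the standard normal cumulative distribution and density functions.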

This expression applies to the models censored on both sides. For models censored on one side only, the corresponding expressions can be derived from here. For example, for left-censored models, let the R limit in the expression above go to infinity, and the resulting expression is

$E [Y_{i} | X_{i}] = Φ (a_{i}) L + (1 - Φ (a_{i})) (X_{i} β + σ \frac{ϕ (a_{i})}{1 - Φ (a_{i})})$

Similarly, for right-censored models, the L limit is decreased to minus infinity to get

$E [Y_{i} | X_{i}] = Φ (b_{i}) (X_{i} β - σ \frac{ϕ (b_{i})}{Φ (b_{i})}) + (1 - Φ (b_{i})) R$
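
The three expected-value expressions can be evaluated directly. The following sketch uses base MATLAB only; the latent means in mu (standing in for X_i β), the value of sigma, and the helper names Phi and phi are assumptions for illustration.

```matlab
% Standard normal CDF and PDF written with base MATLAB functions only
Phi = @(z) 0.5*erfc(-z./sqrt(2));
phi = @(z) exp(-0.5*z.^2)./sqrt(2*pi);

mu    = [-0.2; 0.3; 0.9];   % hypothetical values of X_i*beta
sigma = 0.3;                % hypothetical standard deviation
L = 0;  R = 1;

a = (L - mu)./sigma;
b = (R - mu)./sigma;

% Censored on both sides
lambda = (phi(a) - phi(b))./(Phi(b) - Phi(a));
EYboth = Phi(a).*L + (Phi(b) - Phi(a)).*(mu + sigma.*lambda) + (1 - Phi(b)).*R;

% Left-censored only (R goes to infinity)
EYleft = Phi(a).*L + (1 - Phi(a)).*(mu + sigma.*phi(a)./(1 - Phi(a)));

% Right-censored only (L goes to minus infinity)
EYright = Phi(b).*(mu - sigma.*phi(b)./Phi(b)) + (1 - Phi(b)).*R;

disp(table(mu, EYright, EYboth, EYleft))
```

Because removing a limit can only move the censored response outward, the right-censored expectation is never above the two-sided one, and the left-censored expectation is never below it.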

Version History

Introduced in R2021a

R2024a: Added `WeightsVar` name-value argument for `Tobit` model

The Tobit model supports a WeightsVar name-value argument for observation weights.

R2023a: `modelAccuracy` object function is renamed to `modelCalibration` function

The modelAccuracy object function is renamed to modelCalibration. The use of modelAccuracy is discouraged; use modelCalibration instead.

R2023a: `modelAccuracyPlot` object function is renamed to `modelCalibrationPlot` function

The modelAccuracyPlot object function is renamed to modelCalibrationPlot. The use of modelAccuracyPlot is discouraged; use modelCalibrationPlot instead.
