loss

Loss for multiresponse regression model

Since R2024b

Syntax

L = loss(Mdl,Tbl)

L = loss(Mdl,X,Y)

L = loss(___,Name=Value)

Description

L = loss(Mdl,Tbl) returns the regression loss, or mean squared error (MSE), for the trained multiresponse regression model Mdl. The function calculates the loss using the predictor data and response variables in table Tbl. For more information, see Loss with Regression Chain Ensembles.

L = loss(Mdl,X,Y) returns the regression loss for the model Mdl using the predictor data X and the response values in Y.

L = loss(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. For example, you can standardize the response variables before computing the loss, or return a separate loss value for each response variable.

Examples

Train Multiresponse Regression Model with Regression Chains

Create a regression model with more than one response variable by using fitrchains.

Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s. Create a table containing the predictor variables Displacement, Horsepower, and so on, as well as the response variables Acceleration and MPG. Display the first eight rows of the table.

load carbig
cars = table(Displacement,Horsepower,Model_Year, ...
    Origin,Weight,Acceleration,MPG);
head(cars)

    Displacement    Horsepower    Model_Year    Origin     Weight    Acceleration    MPG
    ____________    __________    __________    _______    ______    ____________    ___

        307            130            70        USA         3504           12        18 
        350            165            70        USA         3693         11.5        15 
        318            150            70        USA         3436           11        18 
        304            150            70        USA         3433           12        16 
        302            140            70        USA         3449         10.5        17 
        429            198            70        USA         4341           10        15 
        454            220            70        USA         4354            9        14 
        440            215            70        USA         4312          8.5        14

Categorize the cars based on whether they were made in the USA.

cars.Origin = categorical(cellstr(cars.Origin));
cars.Origin = mergecats(cars.Origin,["France","Japan",...
    "Germany","Sweden","Italy","England"],"NotUSA");

Partition the data into training and test sets. Use approximately 85% of the observations to train a multiresponse model, and 15% of the observations to test the performance of the trained model on new data. Use cvpartition to partition the data.

rng("default") % For reproducibility
c = cvpartition(height(cars),"Holdout",0.15);
carsTrain = cars(training(c),:);
carsTest = cars(test(c),:);

Train a multiresponse regression model by passing the carsTrain training data to the fitrchains function. By default, the function uses bagged ensembles of trees in the regression chains.

Mdl = fitrchains(carsTrain,["Acceleration","MPG"])

Mdl = 
  RegressionChainEnsemble
           PredictorNames: {'Displacement'  'Horsepower'  'Model_Year'  'Origin'  'Weight'}
             ResponseName: ["Acceleration"    "MPG"]
    CategoricalPredictors: 4
                NumChains: 2
            LearnedChains: {2x2 cell}
          NumObservations: 338

Mdl is a trained RegressionChainEnsemble model object. You can use dot notation to access the properties of Mdl. For example, you can enter Mdl.Learners to display the bagged ensembles that make up the model.

Evaluate the performance of the regression model on the test set by computing the test mean squared error (MSE). Smaller MSE values indicate better performance. Return the loss for each response variable separately by setting the OutputType name-value argument to "per-response".

testMSE = loss(Mdl,carsTest,["Acceleration","MPG"], ...
    OutputType="per-response")

testMSE = 1×2

    2.4921    9.0568
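As a quick check on how the two output types relate, the following sketch continues the example and compares the default scalar loss with the mean of the per-response losses. It assumes Mdl and carsTest are still in the workspace; responseNames, perResponseMSE, averageMSE, and difference are names introduced here for illustration only.

```matlab
% Continues the example above: Mdl and carsTest must exist in the workspace.
responseNames = ["Acceleration","MPG"];

% One loss value per response variable
perResponseMSE = loss(Mdl,carsTest,responseNames, ...
    OutputType="per-response");

% Single scalar loss; OutputType defaults to "average"
averageMSE = loss(Mdl,carsTest,responseNames);

% The scalar loss is documented as the average of the per-response
% losses, so this difference is expected to be close to zero.
difference = abs(averageMSE - mean(perResponseMSE))
```

Because MPG is measured on a larger scale than Acceleration, its error contributes more to the averaged value, which is the situation the StandardizeResponses argument is intended for.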

Predict the response values for the observations in the test set. Return the predicted response values as a table.

predictedY = predict(Mdl,carsTest,OutputType="table")

predictedY=60×2 table
    Acceleration     MPG  
    ____________    ______

       12.573       16.109
        10.78       13.988
       11.282       12.963
       15.185       21.066
       12.203       13.773
       13.216       14.216
       17.117       30.199
       16.478       29.033
       13.439       14.208
       11.552       13.066
       13.398       13.271
       14.848       20.927
       16.552       24.603
       12.501       15.359
       15.778       19.328
       12.343       13.185
      ⋮
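The per-response loss can also be approximated by hand from the predicted table, which makes the link between predict and loss concrete. This is a minimal sketch, assuming carsTest and predictedY from the example are in the workspace; trueY, predY, and manualMSE are illustrative names, and skipping missing values with "omitnan" is an assumption about how such rows should be treated, not a statement of what loss does internally.

```matlab
% Continues the example above: carsTest and predictedY must exist.
responseNames = ["Acceleration","MPG"];
trueY = carsTest{:,responseNames};     % observed responses as a numeric matrix
predY = predictedY{:,responseNames};   % predicted responses, same size

% Column-wise mean of the squared errors, ignoring missing values
% (assumption: rows with missing responses are skipped).
manualMSE = mean((trueY - predY).^2,1,"omitnan")
```

If the test set has no missing values, manualMSE should be close to the testMSE vector returned by loss with OutputType="per-response".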

Input Arguments

`Mdl` — Multiresponse regression model
`RegressionChainEnsemble` object | `CompactRegressionChainEnsemble` object

Multiresponse regression model, specified as a RegressionChainEnsemble or CompactRegressionChainEnsemble object.

`Tbl` — Sample data
table

Sample data, specified as a table. Each row of Tbl corresponds to one observation, and each column corresponds to one variable. Tbl must have the same data type as the data used to train Mdl, and must include all predictor and response variables.

Data Types: table

`X` — Predictor data
numeric matrix | table

Predictor data, specified as a numeric matrix or a table. Each row of X corresponds to one observation, and each column corresponds to one predictor. X must have the same data type as the predictor data used to train Mdl, and must contain the same predictors. X and Y must have the same number of observations.

Data Types: single | double | table

`Y` — Response data
numeric matrix | numeric table

Response data, specified as a numeric matrix or a numeric table. Each row of Y corresponds to one observation, and each column corresponds to one response variable. X and Y must have the same number of observations.

Data Types: single | double | table
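The following sketch illustrates the matrix form of the call, loss(Mdl,X,Y). It assumes the carbig data set that ships with the toolbox and an arbitrary choice of three predictors; data, X, and Y are illustrative names. The loss here is computed on the training data itself, so it is an optimistic estimate rather than a test error.

```matlab
load carbig

% Combine predictors and responses, then drop rows with missing values
data = rmmissing([Displacement Horsepower Weight Acceleration MPG]);
X = data(:,1:3);   % predictors: Displacement, Horsepower, Weight
Y = data(:,4:5);   % responses: Acceleration, MPG

rng("default")     % for reproducibility of the bagged ensembles
Mdl = fitrchains(X,Y);

% Resubstitution loss, averaged over the two responses by default
L = loss(Mdl,X,Y)
```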

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: loss(Mdl,Tbl,OutputType="per-response") specifies to return a regression loss value for each response variable.

`OutputType` — Type of output loss
`"average"` (default) | `"per-response"`

Type of output loss, specified as "average" or "per-response".

| Value | Description |
| --- | --- |
| `"average"` | `loss` averages the loss values across all response variables and returns a scalar value. |
| `"per-response"` | `loss` returns a vector, where each element is the loss for one response variable. |

Example: OutputType="per-response"

Data Types: char | string

`StandardizeResponses` — Flag to standardize response data
`false` or `0` (default) | `true` or `1`

Flag to standardize the response data before computing the loss, specified as a numeric or logical 0 (false) or 1 (true). If you set StandardizeResponses to true, the software centers and scales each response variable by the corresponding variable mean and standard deviation in the training data.

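Specify StandardizeResponses as true when you have multiple response variables with very different scales and OutputType is "average". Without standardization, the response variable with the largest scale tends to dominate the averaged loss.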

Example: StandardizeResponses=true

Data Types: single | double | logical
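A short sketch of the flag in use follows. It assumes the carbig data set and a reduced set of predictors chosen only for brevity; cars, carsTest, rawLoss, and stdLoss are illustrative names. No expected values are shown because the results depend on the trained ensembles.

```matlab
load carbig
cars = rmmissing(table(Displacement,Horsepower,Weight,Acceleration,MPG));

rng("default")   % for reproducibility
c = cvpartition(height(cars),"Holdout",0.15);
Mdl = fitrchains(cars(training(c),:),["Acceleration","MPG"]);
carsTest = cars(test(c),:);

% Averaged loss on the original response scales
rawLoss = loss(Mdl,carsTest)

% Averaged loss after standardizing each response with the
% training-data mean and standard deviation
stdLoss = loss(Mdl,carsTest,StandardizeResponses=true)
```

With standardization, each response contributes to the average in units of its own training-data variance, so rawLoss and stdLoss are not directly comparable to each other.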

Output Arguments

`L` — Regression loss
numeric scalar | numeric vector

Regression loss, or mean squared error (MSE), returned as a numeric scalar or vector.

If OutputType is "average", then loss averages the loss values across all response variables and returns a scalar value.
If OutputType is "per-response", then loss returns a vector, where each element is the loss for one response variable.

For more information, see Loss with Regression Chain Ensembles.

Algorithms

Loss with Regression Chain Ensembles

loss computes the mean squared error (MSE) between the true response values (in Tbl or Y) and the predicted response values as returned by the predict object function of Mdl (predictedY). Depending on the value of the OutputType name-value argument, the function averages the loss values across the responses or returns the loss values for each response separately.

For more information on the computation of the predicted response values, see Prediction with Regression Chain Ensembles.
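Written out under the usual definition of mean squared error, the two output types can be sketched as follows. The symbols are introduced here for illustration and are not part of the function reference: n is the number of observations, m is the number of response variables, y_ij is the true value of response j for observation i, and ŷ_ij is the corresponding predicted value.

$$
\mathrm{MSE}_j = \frac{1}{n}\sum_{i=1}^{n}\left(y_{ij}-\hat{y}_{ij}\right)^{2}, \qquad j = 1,\dots,m
$$

$$
L_{\text{average}} = \frac{1}{m}\sum_{j=1}^{m}\mathrm{MSE}_j
$$

With OutputType set to "per-response", the returned vector corresponds to (MSE_1, ..., MSE_m). With "average", the returned scalar corresponds to L_average. This sketch does not cover how observations with missing values are handled.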

References

[1] Spyromitros-Xioufis, Eleftherios, Grigorios Tsoumakas, William Groves, and Ioannis Vlahavas. "Multi-Target Regression via Input Space Expansion: Treating Targets as Inputs." Machine Learning 104, no. 1 (July 2016): 55–98. https://doi.org/10.1007/s10994-016-5546-z.

Version History

Introduced in R2024b
