loss

Loss for quantile linear regression model

Since R2024b

    Description

    L = loss(Mdl,Tbl,ResponseVarName) returns the quantile loss for the trained quantile linear regression model Mdl. The function uses the predictor data in the table Tbl and the response values in the ResponseVarName table variable. For more information, see Quantile Loss.

    L = loss(Mdl,Tbl,Y) returns the quantile loss for the model Mdl using the predictor data in the table Tbl and the response values in the vector Y.

    L = loss(Mdl,X,Y) returns the quantile loss for the model Mdl using the predictor data X and the corresponding response values in Y.

    L = loss(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. For example, you can specify the quantiles for which to return loss values.
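The calling forms above can be compared side by side. The following is a minimal sketch on synthetic data. The data, the variable names x1, x2, and y, and the model names MdlX and MdlT are illustrative assumptions and are not part of the original example.

```matlab
% Synthetic data (hypothetical): two predictors and a noisy linear response
rng(0,"twister")                      % For reproducibility
n = 200;
X = randn(n,2);
Y = 3 + 2*X(:,1) - X(:,2) + 0.5*randn(n,1);

% Matrix syntax: train on a numeric matrix, then evaluate on a numeric matrix
MdlX = fitrqlinear(X,Y);
Lmatrix = loss(MdlX,X,Y);

% Table syntaxes: train on a table, then evaluate on a table
Tbl = table(X(:,1),X(:,2),Y,VariableNames=["x1","x2","y"]);
MdlT = fitrqlinear(Tbl,"y");
LbyName   = loss(MdlT,Tbl,"y");       % Response given as a table variable name
LbyVector = loss(MdlT,Tbl,Tbl.y);     % Response given as a numeric vector
```

The two table calls describe the same response data, so LbyName and LbyVector are expected to match.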

    Examples

    Compute the quantile loss for a quantile linear regression model.

    Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s. Create a table containing the predictor variables Acceleration, Cylinders, Displacement, and so on, as well as the response variable MPG. View the first eight observations.

    load carbig
    cars = table(Acceleration,Cylinders,Displacement, ...
        Horsepower,Model_Year,Origin,Weight,MPG);
    head(cars)
        Acceleration    Cylinders    Displacement    Horsepower    Model_Year    Origin     Weight    MPG
        ____________    _________    ____________    __________    __________    _______    ______    ___
    
              12            8            307            130            70        USA         3504     18 
            11.5            8            350            165            70        USA         3693     15 
              11            8            318            150            70        USA         3436     18 
              12            8            304            150            70        USA         3433     16 
            10.5            8            302            140            70        USA         3449     17 
              10            8            429            198            70        USA         4341     15 
               9            8            454            220            70        USA         4354     14 
             8.5            8            440            215            70        USA         4312     14 
    

    Remove rows of cars where the table has missing values.

    cars = rmmissing(cars);

    Categorize the cars based on whether they were made in the USA.

    cars.Origin = categorical(cellstr(cars.Origin));
    cars.Origin = mergecats(cars.Origin,["France","Japan",...
        "Germany","Sweden","Italy","England"],"NotUSA");

    Partition the data into training and test sets using cvpartition. Use approximately 80% of the observations as training data, and 20% of the observations as test data.

    rng(0,"twister") % For reproducibility of the data partition
    c = cvpartition(height(cars),"Holdout",0.20);
    
    trainingIdx = training(c);
    carsTrain = cars(trainingIdx,:);
    
    testIdx = test(c);
    carsTest = cars(testIdx,:);

    Train a quantile linear regression model using the carsTrain training data. Specify MPG as the response variable. Then, compute the quantile loss using the carsTest test data.

    Mdl = fitrqlinear(carsTrain,"MPG");
    L = loss(Mdl,carsTest)
    L = 
    2.9448
    

    Retrain the model with a beta tolerance of 1e-6 instead of the default value of 1e-4, and then compute the test set quantile loss.

    newMdl = fitrqlinear(carsTrain,"MPG",BetaTolerance=1e-6);
    newL = loss(newMdl,carsTest)
    newL = 
    1.4050
    

    The retrained model has a lower test set quantile loss (1.4050 compared to 2.9448).

    Input Arguments

    Mdl

    Trained quantile linear regression model, specified as a RegressionQuantileLinear model object. You can create a RegressionQuantileLinear model object by using fitrqlinear.

    Tbl

    Sample data, specified as a table. Each row of Tbl corresponds to one observation, and each column corresponds to one predictor variable. Optionally, Tbl can contain additional columns for the response variable and the observation weights. Tbl must contain all of the predictors used to train Mdl. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

    • If Tbl contains the response variable used to train Mdl, then you do not need to specify ResponseVarName or Y.

    • If you trained Mdl using sample data contained in a table, then the input data for loss must also be in a table.

    • If you set Standardize to true in fitrqlinear when training Mdl, then the software standardizes the numeric columns of the predictor data using the corresponding means (Mdl.Mu) and standard deviations (Mdl.Sigma).

    Data Types: table

    ResponseVarName

    Response variable name, specified as the name of a variable in Tbl. The response variable must be a numeric vector.

    You must specify ResponseVarName as a character vector or string scalar. For example, if Tbl stores the response variable as Tbl.Y, then specify ResponseVarName as "Y". Otherwise, the software treats the Y column of Tbl as a predictor.

    Data Types: char | string

    Y

    Response data, specified as a numeric vector. The length of Y must be equal to the number of observations in X or Tbl.

    Data Types: single | double

    X

    Predictor data, specified as a numeric matrix. By default, loss assumes that each row of X corresponds to one observation, and each column corresponds to one predictor variable.

    • X and Y must have the same number of observations.

    • If you set Standardize to true in fitrqlinear when training Mdl, then the software standardizes the numeric columns of the predictor data using the corresponding means (Mdl.Mu) and standard deviations (Mdl.Sigma).

    Note

    If you orient your predictor matrix so that observations correspond to columns and specify ObservationsIn="columns", then you might experience a significant reduction in computation time.

    Data Types: single | double

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: loss(Mdl,Tbl,"Response",Quantiles=[0.25 0.5 0.75]) specifies to compute the quantile loss for the 0.25, 0.5, and 0.75 quantiles.

    Quantiles

    Quantiles for which to compute the loss, specified as a vector of values in Mdl.Quantiles. The function returns the loss for each quantile separately.

    Example: Quantiles=[0.4 0.6]

    Data Types: single | double | char | string
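As an illustration of the Quantiles argument, the following sketch trains one model for three quantiles and then requests the loss for all of them and for a subset. The synthetic data and the variable names Lall and Lsub are assumptions made for this example.

```matlab
% Synthetic data (hypothetical)
rng(0,"twister")                      % For reproducibility
n = 200;
X = randn(n,2);
Y = 3 + 2*X(:,1) - X(:,2) + 0.5*randn(n,1);

% Train a single model for the 0.25, 0.5, and 0.75 quantiles
Mdl = fitrqlinear(X,Y,Quantiles=[0.25 0.5 0.75]);

% Request the loss for every trained quantile explicitly
Lall = loss(Mdl,X,Y,Quantiles=Mdl.Quantiles);

% Request the loss for a subset of the trained quantiles
Lsub = loss(Mdl,X,Y,Quantiles=[0.25 0.75]);
```

Each element of the output corresponds to one requested quantile, so Lall has three elements and Lsub has two.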

    LossFun

    Loss function, specified as "quantile" or a function handle.

    • "quantile" — Quantile loss. For more information, see Quantile Loss.

    • Function handle — To specify a custom loss function, use a function handle. The function must have this form:

      lossval = lossfun(Y,YFit,W,q)

      • The output argument lossval is a numeric scalar.

      • You specify the function name (lossfun).

      • Y is a length-n numeric vector of observed responses, where n is the number of observations in Tbl or X.

      • YFit is a length-n numeric vector of corresponding predicted responses.

      • W is an n-by-1 numeric vector of observation weights.

      • q is a numeric scalar in the range [0,1] corresponding to a quantile.

    Example: LossFun=@lossfun

    Data Types: char | string | function_handle
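A custom loss function that follows the lossfun(Y,YFit,W,q) form can be written as an anonymous function. The sketch below defines two hypothetical handles, maeFun (a weighted mean absolute error that ignores q) and pinballFun (a direct transcription of the quantile loss formula in the Algorithms section). The handle names and the synthetic data are assumptions made for this example.

```matlab
% Synthetic data (hypothetical)
rng(0,"twister")                      % For reproducibility
n = 200;
X = randn(n,2);
Y = 3 + 2*X(:,1) - X(:,2) + 0.5*randn(n,1);
Mdl = fitrqlinear(X,Y);

% Weighted mean absolute error; the quantile q is accepted but not used
maeFun = @(Y,YFit,W,q) sum(W(:).*abs(Y(:)-YFit(:)))/sum(W(:));

% Quantile (pinball) loss written out by hand
pinballFun = @(Y,YFit,W,q) ...
    sum(W(:).*(Y(:)-YFit(:)).*(q - ((Y(:)-YFit(:)) < 0)))/sum(W(:));

Lmae     = loss(Mdl,X,Y,LossFun=maeFun);
Lpinball = loss(Mdl,X,Y,LossFun=pinballFun);
Lbuiltin = loss(Mdl,X,Y);             % Built-in "quantile" loss for comparison
```

If pinballFun transcribes the documented formula faithfully, Lpinball should agree closely with Lbuiltin. Both handles return a numeric scalar, as the form requires.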

    ObservationsIn

    Predictor data observation dimension, specified as "rows" or "columns".

    Note

    If you orient your predictor matrix so that observations correspond to columns and specify ObservationsIn="columns", then you might experience a significant reduction in computation time. You cannot specify ObservationsIn="columns" for predictor data in a table.

    Example: ObservationsIn="columns"

    Data Types: char | string
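The following sketch shows one way to pass a predictor matrix whose columns are observations. It assumes that a model trained on row-oriented data accepts column-oriented data in loss when ObservationsIn="columns" is specified. The synthetic data and variable names are illustrative.

```matlab
% Synthetic data (hypothetical), with observations in rows
rng(0,"twister")                      % For reproducibility
n = 200;
X = randn(n,2);
Y = 3 + 2*X(:,1) - X(:,2) + 0.5*randn(n,1);
Mdl = fitrqlinear(X,Y);

% Default orientation: each row of X is one observation
Lrows = loss(Mdl,X,Y);

% Transposed orientation: each column of Xcols is one observation
Xcols = X';                           % 2-by-n matrix
Lcols = loss(Mdl,Xcols,Y,ObservationsIn="columns");
```

Both calls describe the same observations, so Lrows and Lcols are expected to match. Only the memory layout of the predictor data differs.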

    Weights

    Observation weights, specified as a nonnegative numeric vector or the name of a variable in Tbl. The software weights each observation in X or Tbl with the corresponding value in Weights. The length of Weights must equal the number of observations in X or Tbl.

    If you specify the input data as a table Tbl, then Weights can be the name of a variable in Tbl that contains a numeric vector. In this case, you must specify Weights as a character vector or string scalar. For example, if the weights vector W is stored as Tbl.W, then specify it as "W".

    By default, Weights is ones(n,1), where n is the number of observations in X or Tbl. If you supply weights, then loss computes the weighted loss and normalizes the weights to sum to 1.

    Data Types: single | double | char | string
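As a sketch of the Weights argument, the following code gives observations with above-median responses twice the weight of the others. The weighting scheme, the synthetic data, and the variable names are assumptions chosen for illustration.

```matlab
% Synthetic data (hypothetical)
rng(0,"twister")                      % For reproducibility
n = 200;
X = randn(n,2);
Y = 3 + 2*X(:,1) - X(:,2) + 0.5*randn(n,1);
Mdl = fitrqlinear(X,Y);

% Hypothetical weights: double weight for above-median responses
w = ones(n,1);
w(Y > median(Y)) = 2;

Lweighted = loss(Mdl,X,Y,Weights=w);

% Rescaled copy of the same weights
Lrescaled = loss(Mdl,X,Y,Weights=10*w);
```

Because loss normalizes the weights to sum to 1, multiplying every weight by the same positive constant should leave the result unchanged, so Lweighted and Lrescaled are expected to match.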

    Output Arguments

    L

    Loss, returned as a numeric vector. The type of loss depends on LossFun. Each element in L corresponds to a quantile in Quantiles.

    Algorithms

    Quantile Loss

    The quantile loss $L$ for a specified quantile $q$ (Quantiles) is

    $$L = \frac{\sum_{i=1}^{n} w_i r_i \left(q - I\{r_i < 0\}\right)}{\sum_{i=1}^{n} w_i}$$

    • $n$ is the number of observations in Tbl or X.

    • $w_i$ is the observation weight for observation $i$ (Weights).

    • $r_i$ is the residual for observation $i$ (that is, the difference between the true response value and the predicted response value).

    • $I\{\cdot\}$ is the indicator function.
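The quantile loss formula can be checked by hand on a small example. The sketch below evaluates it directly in base MATLAB for four made-up observations. The response values, predictions, weights, and quantile are hypothetical numbers chosen to keep the arithmetic simple.

```matlab
% Hypothetical observed responses, predictions, and weights
y    = [10; 12; 9; 15];
yhat = [11; 10; 9; 12];
w    = [1; 1; 2; 1];
q    = 0.25;

% Residuals: true response minus predicted response
r = y - yhat;                         % [-1; 2; 0; 3]

% Per-observation terms r_i*(q - I{r_i < 0})
terms = r.*(q - (r < 0));             % [0.75; 0.5; 0; 0.75]

% Weighted average of the terms
L = sum(w.*terms)/sum(w)              % (0.75 + 0.5 + 0 + 0.75)/5 = 0.4
```

The negative residual (an overprediction) is scaled by 1 - q = 0.75 and the positive residuals (underpredictions) by q = 0.25, so for a low quantile such as 0.25 the loss penalizes overprediction more heavily.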

    Version History

    Introduced in R2024b