kfoldPredict

Predict responses for observations in cross-validated quantile regression model

Since R2025a

Syntax

predictedY = kfoldPredict(CVMdl)

predictedY = kfoldPredict(CVMdl,Name=Value)

[predictedY,crossingIndicator] = kfoldPredict(___)

Description

predictedY = kfoldPredict(CVMdl) returns responses predicted by the cross-validated quantile regression model CVMdl. For every fold, kfoldPredict predicts the responses for validation-fold observations using a model trained on training-fold observations. CVMdl.X and CVMdl.Y contain both sets of observations.

predictedY = kfoldPredict(CVMdl,Name=Value) specifies additional options using one or more name-value arguments. For example, you can specify the quantiles for which to return predictions.

[predictedY,crossingIndicator] = kfoldPredict(___) additionally returns a logical vector crossingIndicator whose entries indicate whether the predictions for the specified quantiles cross each other.

Examples

Predict Using Cross-Validated Quantile Regression Model

Create a cross-validated quantile regression model. Compare the predicted response values to the true response values.

Simulate 1000 observations from the model $y = 1 + 0.05 x + \sin (x) / x + ϵ$ where:

x is a 1000-by-1 vector of evenly spaced values between –10 and 10.
$ϵ$ is a 1000-by-1 vector of random normal errors with mean 0 and standard deviation 0.2.

rng("default"); % For reproducibility
n = 1000;
x = linspace(-10,10,n)';
y = 1 + 0.05*x + sin(x)./x + 0.2*randn(n,1);

Create a 5-fold cross-validated quantile neural network regression model. Use the default quantile value, which corresponds to the median.

CVMdl = fitrqnet(x,y,KFold=5)

CVMdl = 
  RegressionPartitionedQuantileModel
    CrossValidatedModel: 'QuantileNeuralNetwork'
         PredictorNames: {'x1'}
           ResponseName: 'Y'
        NumObservations: 1000
                  KFold: 5
              Partition: [1×1 cvpartition]
      ResponseTransform: 'none'
              Quantiles: 0.5000


  Properties, Methods

CVMdl is a RegressionPartitionedQuantileModel object that contains five trained CompactRegressionQuantileNeuralNetwork model objects (CVMdl.Trained). Each of the five models is trained using approximately 4/5 of the observations in x.

Predict the median response values using the cross-validated quantile regression model. The predicted response values are the predictions on the holdout (validation) observations. In other words, the software obtains each prediction by using a model that was trained without the corresponding observation.

predictedY = kfoldPredict(CVMdl);

Plot the true response values and the predicted response values for the cross-validated model.

plot(x,y,".");
hold on
plot(x,predictedY,".");
xlabel("x")
ylabel("y")
title("Cross-Validation Predictions")
legend(["True","Predicted"])
hold off

Figure: Cross-Validation Predictions. Scatter plot of the true and predicted response values (y) against x.

The five CompactRegressionQuantileNeuralNetwork models seem generally to agree, but the predictions differ slightly in the predictor data range from 0 to 10.

You cannot use the cross-validated model directly to make predictions on new data. If you want to predict response values for a new data set, you can train a new quantile regression model using all the data in x and then use the predict object function. For example, predict response values for each even integer between –10 and 10.

Mdl = fitrqnet(x,y);
xnew = (-10:2:10)';
predictedNew = predict(Mdl,xnew)

predictedNew = 11×1

    0.6360
    0.6340
    0.6320
    0.6300
    1.3421
    2.0209
    1.5462
    0.9962
    1.2118
    1.4273
    1.6429
      ⋮

Alternatively, you can use the individual compact models in the Trained property of the cross-validated model and then combine the predictions (for example, through averaging). For example, predict average response values for each even integer between –10 and 10.

predictions = zeros(length(xnew),CVMdl.KFold);
for i = 1:CVMdl.KFold
    predictions(:,i) = predict(CVMdl.Trained{i},xnew);
end
averagePredictions = mean(predictions,2)

averagePredictions = 11×1

    0.6399
    0.6332
    0.6264
    0.6215
    1.3391
    2.0521
    1.5724
    0.9853
    1.2277
    1.4360
    1.6341
      ⋮

Find Folds with Crossing Quantile Predictions

Create a cross-validated quantile regression model. Find the test folds that contain observations whose predictions cross each other.

Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s. Create a table containing the predictor variables Acceleration, Cylinders, Displacement, and so on, as well as the response variable MPG.

load carbig
cars = table(Acceleration,Cylinders,Displacement, ...
    Horsepower,Model_Year,Origin,Weight,MPG);

Categorize the cars based on whether they were made in the USA.

cars.Origin = categorical(cellstr(cars.Origin));
cars.Origin = mergecats(cars.Origin,["France","Japan",...
    "Germany","Sweden","Italy","England"],"NotUSA");

Train a cross-validated quantile neural network regression model. Use the 0.25, 0.50, and 0.75 quantiles (that is, the lower quartile, median, and upper quartile). To improve the model fit, standardize the numeric predictors before training. Use 3-fold cross-validation.

rng(0,"twister") % For reproducibility
CVMdl = fitrqnet(cars,"MPG",Quantiles=[0.25 0.5 0.75], ...
    Standardize=true,KFold=3);

CVMdl is a RegressionPartitionedQuantileModel object.

Determine if any of the predictions for the quantiles in CVMdl.Quantiles cross each other by using kfoldPredict. The crossingIndicator output argument contains a value of 1 (true) for any observation with quantile predictions that cross.

[~,crossingIndicator] = kfoldPredict(CVMdl);
sum(crossingIndicator)

ans = 
3

In this example, three of the observations in cars have quantile predictions that cross each other.

Find the test sets that contain the three observations.

idx = test(CVMdl.Partition,"all");
observations = idx(crossingIndicator,:)

observations = 3×3 logical array

   1   0   0
   1   0   0
   1   0   0

All three observations are in the first test set. Therefore, the quantile crossings in CVMdl are produced by the first compact model in the object (CVMdl.Trained{1}), because it provides the predictions for the observations in the first test set.
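When more observations cross, counting them per test fold can be easier to read than the logical array. The following is a minimal sketch that uses small hypothetical arrays in place of the example's variables: idx stands in for the output of test(CVMdl.Partition,"all"), and crossingIndicator stands in for the second output of kfoldPredict.

```matlab
% Hypothetical fold membership for six observations and three folds.
% In practice, obtain this matrix from test(CVMdl.Partition,"all").
idx = logical([1 0 0; 0 1 0; 1 0 0; 0 0 1; 0 1 0; 1 0 0]);

% Hypothetical crossing indicator, one entry per observation.
crossingIndicator = logical([1; 0; 1; 0; 1; 0]);

% Count the crossing observations in each test fold (one column per fold).
crossingsPerFold = sum(idx(crossingIndicator,:),1)
```

For these hypothetical inputs, crossingsPerFold is [2 1 0], which indicates that the first fold accounts for two crossings and the second fold for one.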

Input Arguments

`CVMdl` — Cross-validated quantile regression model
`RegressionPartitionedQuantileModel` object

Cross-validated quantile regression model, specified as a RegressionPartitionedQuantileModel object.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: kfoldPredict(CVMdl,Quantiles=0.5,PredictionForMissingValue=NaN) specifies to return the predictions for the 0.5 quantile (median) and use NaN predictions for observations that have missing predictor values.

`Quantiles` — Quantiles for which to compute predictions
`"all"` (default) | vector of values in `CVMdl.Quantiles`

Quantiles for which to compute predictions, specified as "all" or a vector of values in CVMdl.Quantiles. The software returns predictions only for the quantiles specified in Quantiles.

Example: Quantiles=[0.4 0.6]

Data Types: single | double | char | string
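As an illustration of selecting a subset of quantiles, the following sketch trains a model on three quantiles and then requests predictions for only two of them. The simulated data and the variable names predAll and predOuter are assumptions made for this example.

```matlab
rng(0,"twister") % For reproducibility
n = 300;
x = linspace(-10,10,n)';
y = 1 + 0.05*x + sin(x)./x + 0.2*randn(n,1);

% Train a 3-fold cross-validated model on three quantiles.
CVMdl = fitrqnet(x,y,Quantiles=[0.25 0.5 0.75],KFold=3);

% By default, kfoldPredict returns one column per quantile in CVMdl.Quantiles.
predAll = kfoldPredict(CVMdl);

% Request predictions for the lower and upper quartiles only.
predOuter = kfoldPredict(CVMdl,Quantiles=[0.25 0.75]);
size(predOuter)
```

The requested values must already be in CVMdl.Quantiles, so the number of columns in predOuter should match the number of quantiles you specify.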

`PredictionForMissingValue` — Predicted response value to use for observations with missing predictor values
`"quantile"` (default) | numeric scalar | numeric vector

Predicted response value to use for observations with missing predictor values, specified as "quantile", a numeric scalar, or a numeric vector.

Value	Description
`"quantile"`	`kfoldPredict` uses the specified quantile of the observed response values in the training-fold data as the predicted response value for observations with missing predictor values.
Numeric scalar or vector	If `PredictionForMissingValue` is a scalar, then `kfoldPredict` uses the value as the predicted response value for observations with missing predictor values. The function uses the same value for all quantiles. If `PredictionForMissingValue` is a vector, its length must be equal to the number of quantiles specified by the `Quantiles` name-value argument. `kfoldPredict` uses element i in the vector as the quantile i predicted response value for observations with missing predictor values.

Example: PredictionForMissingValue=NaN

Data Types: single | double | char | string
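The scalar and vector forms can be compared with a short sketch. The simulated data, the injected NaN values, and the fill values [-1 1] are assumptions made for illustration. The sketch also assumes that observations with missing predictor values remain in CVMdl.X, so that hasMissing lines up with the rows of the predictions.

```matlab
rng("default") % For reproducibility
n = 200;
X = [linspace(-10,10,n)' randn(n,1)];
y = 1 + 0.05*X(:,1) + 0.5*X(:,2) + 0.2*randn(n,1);
X(5:20:end,2) = NaN; % Introduce missing predictor values

CVMdl = fitrqnet(X,y,Quantiles=[0.25 0.75],KFold=4);
hasMissing = any(isnan(CVMdl.X),2);

% Scalar: use the same value (NaN) for both quantiles.
predScalar = kfoldPredict(CVMdl,PredictionForMissingValue=NaN);

% Vector: use one value per quantile (-1 for 0.25, 1 for 0.75).
predVector = kfoldPredict(CVMdl,PredictionForMissingValue=[-1 1]);

% Inspect the predictions for the observations with missing predictors.
predVector(hasMissing,:)
```

Using NaN makes the affected observations easy to find afterward, whereas a vector lets you supply a separate fallback for each quantile.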

Output Arguments

`predictedY` — Predicted response
numeric matrix

Predicted response, returned as an n-by-q numeric matrix, where n is the number of observations in CVMdl.X and q is the number of quantiles specified by the Quantiles name-value argument.

If you use a holdout validation technique to create CVMdl (that is, if CVMdl.KFold is 1), then predictedY has NaN values for training-fold observations.
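The holdout behavior can be checked with a sketch like the following. It assumes that fitrqnet accepts the Holdout name-value argument, as the other cross-validation fitting functions in the toolbox do. The simulated data and variable names are illustrative.

```matlab
rng("default") % For reproducibility
n = 200;
x = linspace(-10,10,n)';
y = 1 + 0.05*x + sin(x)./x + 0.2*randn(n,1);

% Hold out 20% of the observations for validation (assumed option).
CVMdl = fitrqnet(x,y,Holdout=0.2);
predictedY = kfoldPredict(CVMdl);

% Logical index of the validation observations in the holdout partition.
isValidation = test(CVMdl.Partition);

% Count the NaN predictions among the training-fold observations.
numTrainingNaN = nnz(isnan(predictedY(~isValidation)))
```

If the partition behaves as described above, numTrainingNaN equals the number of training-fold observations, and only the validation observations have usable predictions.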

`crossingIndicator` — Quantile crossing indicator
logical vector

Quantile crossing indicator, returned as a logical vector. Each entry corresponds to an observation in CVMdl.X. A value of 1 (true) indicates that the corresponding observation has predictions that cross each other. That is, two quantiles q1 and q2 exist in Quantiles such that q1 < q2 and predictedY_q1 > predictedY_q2.
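The crossing condition can be sketched directly on a matrix of predictions. This is an illustration only, not the internal implementation of kfoldPredict. It assumes that the columns of the hypothetical matrix examplePredictions are ordered by increasing quantile, in which case checking adjacent columns is enough to detect any crossing pair.

```matlab
% Hypothetical predictions for three observations (rows) at the
% 0.25, 0.5, and 0.75 quantiles (columns, in increasing order).
% Row 1 is monotone, row 2 has a 0.25 prediction above the 0.5
% prediction, and row 3 contains only ties.
examplePredictions = [1.0 2.0 3.0
                      2.0 1.5 3.0
                      1.0 1.0 1.0];

% An observation crosses if the prediction for some quantile is smaller
% than the prediction for the quantile before it.
manualIndicator = any(diff(examplePredictions,1,2) < 0,2)
```

Here manualIndicator is [0; 1; 0]: only the second observation crosses, because ties do not satisfy the strict inequality.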

Algorithms

kfoldPredict computes predictions according to the predict object function of the trained compact models in CVMdl (CVMdl.Trained). For more information, see the model-specific predict function reference pages in the following table.

Model Type	`predict` Function
Quantile linear regression model	`predict`
Quantile neural network model for regression	`predict`
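The fold-by-fold procedure can be approximated by hand, which may help clarify which compact model produces each prediction. The following is a sketch under stated assumptions rather than the function's actual implementation: it uses simulated data, the default PredictionForMissingValue behavior, and the illustrative variable names manualY and maxDifference.

```matlab
rng("default") % For reproducibility
n = 200;
x = linspace(-10,10,n)';
y = 1 + 0.05*x + sin(x)./x + 0.2*randn(n,1);
CVMdl = fitrqnet(x,y,KFold=5);

% For each fold, predict the validation-fold observations by using the
% compact model that was trained without them.
manualY = NaN(CVMdl.NumObservations,numel(CVMdl.Quantiles));
for k = 1:CVMdl.KFold
    testIdx = test(CVMdl.Partition,k);
    manualY(testIdx,:) = predict(CVMdl.Trained{k},CVMdl.X(testIdx,:));
end

% Compare the manual predictions to the kfoldPredict output.
predictedY = kfoldPredict(CVMdl);
maxDifference = max(abs(manualY - predictedY),[],"all")
```

A value of maxDifference at or near zero would suggest that the manual loop matches kfoldPredict for this model.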

Version History

Introduced in R2025a
