report

Generate slice metrics report

Since R2026a

    Description

    metricsTbl = report(sliceResults) generates a table of slice metrics metricsTbl for the data slices in sliceResults. Some metrics directly compare data slices to their complements. The complement of a data slice consists of all observations that are not in the data slice.
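As a minimal sketch of the slice and complement relationship, assuming the slice is represented by a logical index (as in the example on this page), the complement is the logical negation of that index. The variable names and toy data below are hypothetical.

```matlab
% Toy data: ten observations and a logical index that marks the slice.
x = (1:10)';
inSlice = x > 7;            % slice: observations 8, 9, and 10
inComplement = ~inSlice;    % complement: every observation not in the slice

sliceObs = x(inSlice);
complementObs = x(inComplement);

% Every observation belongs to exactly one of the two groups.
numTotal = numel(sliceObs) + numel(complementObs);
```

Because the two groups partition the data, their observation counts always sum to the total number of observations.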

    metricsTbl = report(sliceResults,Metrics=metrics) additionally specifies the metrics to include in metricsTbl.

    Examples

    Train a regression model using a mix of numeric and categorical data. Use sliceMetrics to compute metrics on a specified data slice of interest, and then use report to display the metrics in a table.

    Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s. Convert the Origin data to a categorical variable, and combine the variable with a subset of the other measurements into a table.

    load carbig
    Origin = categorical(cellstr(Origin));
    cars = table(Acceleration,Cylinders,Displacement,Horsepower, ...
        Origin,Weight,MPG);

    Remove observations with missing values from the cars table. Then, display the first eight observations in the table.

    cars = rmmissing(cars);
    head(cars)
        Acceleration    Cylinders    Displacement    Horsepower    Origin    Weight    MPG
        ____________    _________    ____________    __________    ______    ______    ___
    
              12            8            307            130         USA       3504     18 
            11.5            8            350            165         USA       3693     15 
              11            8            318            150         USA       3436     18 
              12            8            304            150         USA       3433     16 
            10.5            8            302            140         USA       3449     17 
              10            8            429            198         USA       4341     15 
               9            8            454            220         USA       4354     14 
             8.5            8            440            215         USA       4312     14 
    

    Partition the data into training data and test data. Reserve approximately 50% of the observations for computing slice metrics, and use the rest of the observations for model training.

    rng(0,"twister") % For reproducibility
    cv = cvpartition(length(cars.MPG),Holdout=0.5);
    trainingCars = cars(training(cv),:);
    testCars = cars(test(cv),:);

    Train a Gaussian process regression model using the training data. Standardize the numeric predictors before fitting the model.

    Mdl = fitrgp(trainingCars,"MPG",Standardize=true)
    Mdl = 
      RegressionGP
               PredictorNames: {'Acceleration'  'Cylinders'  'Displacement'  'Horsepower'  'Origin'  'Weight'}
                 ResponseName: 'MPG'
        CategoricalPredictors: 5
            ResponseTransform: 'none'
              NumObservations: 196
               KernelFunction: 'SquaredExponential'
            KernelInformation: [1×1 struct]
                BasisFunction: 'Constant'
                         Beta: 25.8166
                        Sigma: 3.9677
            PredictorLocation: [11×1 double]
               PredictorScale: [11×1 double]
                        Alpha: [196×1 double]
             ActiveSetVectors: [196×11 double]
                PredictMethod: 'Exact'
                ActiveSetSize: 196
                    FitMethod: 'Exact'
              ActiveSetMethod: 'Random'
            IsActiveSetVector: [196×1 logical]
                LogLikelihood: -566.1334
             ActiveSetHistory: []
               BCDInformation: []
    
    
    
    

    Mdl is a RegressionGP model object trained on a mix of numeric and categorical predictors.

    Create a data slice of the test set cars manufactured in the USA with an acceleration value of 15 or more.

    testDataSliceIndex = testCars.Origin=="USA" & testCars.Acceleration >= 15;

    Evaluate the regression model on the custom data slice using the sliceMetrics function. Use the report function to display the mean squared error (MSE), a two-sample t-statistic, and a two-sample p-value for the custom data slice (true) and its complement (false).

    sliceResults = sliceMetrics(Mdl,testCars,testDataSliceIndex);
    metricsTbl = report(sliceResults,Metrics=["mse","tstat","pvalue"])
    metricsTbl=2×5 table
        custom    NumObservations    Error     TStatistic     PValue 
        ______    _______________    ______    __________    ________
    
        true             61           9.624     -1.8661      0.063547
        false           135          16.036      1.8661      0.063547
    
    

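    The MSE is smaller for the custom data slice than for the remaining test set observations. However, the p-value for Welch's t-test (approximately 0.064) is greater than 0.05, so the difference in mean squared error between the slice and its complement is not statistically significant at the 5% significance level. The t-statistics for the slice and its complement have the same magnitude and opposite signs because each row compares one group against the other.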
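The symmetry between the two rows can be illustrated with a small sketch. It assumes that the comparison is a Welch's t-test (ttest2 with unequal variances) on per-observation squared errors, which is how the metric descriptions on this page read, but the exact internal computation of report is not shown here. The squared errors are simulated and hypothetical; they are not the values from the carbig example.

```matlab
% Hypothetical per-observation squared errors for a slice and its complement.
rng(0,"twister") % For reproducibility
sliceSqErr = (3*randn(61,1)).^2;
compSqErr = (4*randn(135,1)).^2;

% Welch's t-test of the slice against its complement, and the reverse.
[~,pSlice,~,statsSlice] = ttest2(sliceSqErr,compSqErr,Vartype="unequal");
[~,pComp,~,statsComp] = ttest2(compSqErr,sliceSqErr,Vartype="unequal");

% Welch's t-statistic computed directly from its definition.
tManual = (mean(sliceSqErr) - mean(compSqErr)) / ...
    sqrt(var(sliceSqErr)/numel(sliceSqErr) + var(compSqErr)/numel(compSqErr));
```

Swapping the order of the two samples flips the sign of the t-statistic and leaves the two-sided p-value unchanged, which matches the pattern in the TStatistic and PValue columns of the example output.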

    Input Arguments

    sliceResults: Slice metrics results, specified as a sliceMetrics object.

    Metrics: Metrics to display, specified as a character vector or string scalar containing one metric name, or a string array or cell array of character vectors containing multiple metric names. The following tables describe the supported metrics. The default is ["accuracy","oddsratio","pvalue","effect"] for classification models and ["error","tstat","pvalue","effect"] for regression models.

    Classification Metrics

    Value                                                       Description
    _____                                                       ___________

    "TruePositives" or "tp"                                     Number of true positives (TP)
    "FalseNegatives" or "fn"                                    Number of false negatives (FN)
    "FalsePositives" or "fp"                                    Number of false positives (FP)
    "TrueNegatives" or "tn"                                     Number of true negatives (TN)
    "SumOfTrueAndFalsePositives" or "tp+fp"                     Sum of TP and FP
    "RateOfPositivePredictions" or "rpp"                        Rate of positive predictions (RPP), (TP+FP)/(TP+FN+FP+TN)
    "RateOfNegativePredictions" or "rnp"                        Rate of negative predictions (RNP), (TN+FN)/(TP+FN+FP+TN)
    "FalseNegativeRate", "fnr", or "miss"                       False negative rate (FNR), or miss rate, FN/(TP+FN)
    "TrueNegativeRate", "tnr", or "spec"                        True negative rate (TNR), or specificity, TN/(TN+FP)
    "PositivePredictiveValue", "ppv", "prec", or "precision"    Positive predictive value (PPV), or precision, TP/(TP+FP)
    "NegativePredictiveValue" or "npv"                          Negative predictive value (NPV), TN/(TN+FN)
    "Accuracy", "accu", or "accuracy"                           Accuracy, (TP+TN)/(TP+FN+FP+TN)
    "F1Score" or "f1score"                                      F1 score, 2*TP/(2*TP+FP+FN)
    "OddsRatio" or "oddsratio"                                  Odds ratio, which is numSliceIncorrect/numSliceCorrect divided by numCompIncorrect/numCompCorrect
                                                                • numSliceIncorrect is the number of misclassified observations in the data slice.
                                                                • numSliceCorrect is the number of correctly classified observations in the data slice.
                                                                • numCompIncorrect is the number of misclassified observations in the complement of the data slice.
                                                                • numCompCorrect is the number of correctly classified observations in the complement of the data slice.
    "PValue" or "pvalue"                                        p-value for the test of the null hypothesis that there is no association between slice membership and error rate (odds ratio = 1), against the alternative hypothesis that there is an association (odds ratio ≠ 1). The software uses Fisher's exact test for small counts (see fishertest) and the Chi-squared test otherwise.
    "EffectSize" or "effect"                                    Mean-difference effect size for the test of the null hypothesis that there is no association between slice membership and error rate (odds ratio = 1), against the alternative hypothesis that there is an association (odds ratio ≠ 1) (see meanEffectSize)
    "TStatistic" or "tstat"                                     t-statistic for Welch's t-test of the slice error rate against the slice complement error rate (see ttest2)
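The count-based formulas in the classification table can be checked by hand. The following is a minimal sketch that uses hypothetical counts; it mirrors the formulas as written above and is not the internal implementation of report. The call to fishertest assumes a 2-by-2 contingency table with rows for the slice and its complement and columns for incorrect and correct predictions.

```matlab
% Hypothetical confusion counts for one data slice.
TP = 40; FN = 10; FP = 5; TN = 45;
total = TP + FN + FP + TN;

rpp = (TP + FP)/total;              % rate of positive predictions
rnp = (TN + FN)/total;              % rate of negative predictions
fnr = FN/(TP + FN);                 % false negative rate (miss rate)
tnr = TN/(TN + FP);                 % true negative rate (specificity)
ppv = TP/(TP + FP);                 % positive predictive value (precision)
npv = TN/(TN + FN);                 % negative predictive value
accuracy = (TP + TN)/total;
f1score = 2*TP/(2*TP + FP + FN);

% Hypothetical incorrect and correct counts for a slice and its complement.
numSliceIncorrect = 12; numSliceCorrect = 48;
numCompIncorrect = 10;  numCompCorrect = 130;
oddsRatio = (numSliceIncorrect/numSliceCorrect) / ...
    (numCompIncorrect/numCompCorrect);

% Fisher's exact test on the corresponding 2-by-2 contingency table.
counts = [numSliceIncorrect numSliceCorrect; numCompIncorrect numCompCorrect];
[~,pFisher,statsFisher] = fishertest(counts);
```

With these counts, the accuracy is 0.85 and the odds ratio is 3.25. An odds ratio greater than 1 means that the odds of a misclassification are higher inside the slice than in its complement.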

    Regression Metrics

    Value                       Description
    _____                       ___________

    "Error" or "error"          Mean squared error (MSE)
    "TStatistic" or "tstat"     t-statistic for Welch's t-test of the slice error against the slice complement error (see ttest2)
    "PValue" or "pvalue"        p-value for Welch's t-test of the slice error against the slice complement error (see ttest2)
    "EffectSize" or "effect"    Mean-difference effect size for the slice error against the slice complement error (see meanEffectSize)
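As a rough illustration of the mean-difference effect size, the sketch below calls meanEffectSize on two hypothetical samples of squared errors. The simulated data and the use of default meanEffectSize options are assumptions; this page does not state which options report passes internally.

```matlab
% Hypothetical per-observation squared errors for a slice and its complement.
rng(0,"twister") % For reproducibility
sliceSqErr = (3*randn(61,1)).^2;
compSqErr = (4*randn(135,1)).^2;

% Two-sample mean-difference effect size with a confidence interval.
effect = meanEffectSize(sliceSqErr,compSqErr);

% The mean-difference effect is the difference of the two sample means.
meanDiff = mean(sliceSqErr) - mean(compSqErr);
```

A negative effect indicates that the slice has a smaller average squared error than its complement.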

    Data Types: char | string | cell

    Output Arguments

    metricsTbl: Slice metrics, returned as a table. Each row corresponds to one slice of data. The table columns are, in order: one column for each slice variable, one column for the number of observations in each data slice, and one column for each slice metric.
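Because the output is an ordinary table, standard table indexing applies. The sketch below rebuilds the example output from this page by hand (it does not call report) and then selects rows. The variable names sliceRow and isSignificant are hypothetical.

```matlab
% Hand-built copy of the example output shown on this page.
custom = [true; false];
NumObservations = [61; 135];
Error = [9.624; 16.036];
TStatistic = [-1.8661; 1.8661];
PValue = [0.063547; 0.063547];
metricsTbl = table(custom,NumObservations,Error,TStatistic,PValue);

% Select the row for the custom slice (custom is true).
sliceRow = metricsTbl(metricsTbl.custom,:);

% Flag rows whose p-value is below the 5% significance level.
isSignificant = metricsTbl.PValue < 0.05;
```

In this copy of the example output, neither row is flagged, which is consistent with the p-value of about 0.064.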


    Version History

    Introduced in R2026a
