risk.validation.areaUnderCurve

Return area under curve

Since R2025a

    Description

    aucValue = risk.validation.areaUnderCurve(Score,BinaryResponse) returns the area under the receiver operating characteristic (ROC) curve (AUC), where Score contains numeric values that represent quantities such as rankings, predictions, probability of default (PD) estimates, or loss given default (LGD) estimates. For example, in credit scoring models, the values in Score can represent individual credit scores or other credit data. BinaryResponse specifies the target state of each value in Score.

    aucValue = risk.validation.areaUnderCurve(Score,BinaryResponse,SortDirection=sortdir) specifies the sorting direction of the unique values in Score.

    [aucValue,Output] = risk.validation.areaUnderCurve(___) also returns a structure, Output, that contains a table of metrics with the columns Threshold, TruePositiveRate, and FalsePositiveRate.
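
    To make the relationship between the thresholds, the two rates, and the AUC concrete, the following is a minimal sketch in base MATLAB. It is not the toolbox implementation. The toy data and the variable names (score, response, aucSketch) are hypothetical, and the sketch assumes descending order, so a higher score means higher risk.

```matlab
% Toy data (hypothetical): higher score = higher risk, 1 = target state.
score    = [0.9; 0.8; 0.8; 0.6; 0.4; 0.3];
response = [1;   1;   0;   1;   0;   0];

% Unique score values in descending order serve as thresholds.
thresholds = sort(unique(score), 'descend');

% Count the positives and negatives that sit at each threshold.
numPos = zeros(size(thresholds));
numNeg = zeros(size(thresholds));
for k = 1:numel(thresholds)
    atThreshold = (score == thresholds(k));
    numPos(k) = sum(response(atThreshold) == 1);
    numNeg(k) = sum(response(atThreshold) == 0);
end

% Cumulative rates, with a leading (0,0) point.
tpr = [0; cumsum(numPos) / sum(numPos)];
fpr = [0; cumsum(numNeg) / sum(numNeg)];

% Trapezoidal area under the TPR-versus-FPR curve.
aucSketch = trapz(fpr, tpr);   % 5/6 for this toy data
```

    Grouping by unique score values means that tied scores enter the curve as a single step, which is why the thresholds are the unique values rather than every observation.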

    Examples

    Compute the area under the curve of credit scores by using the areaUnderCurve function with the credit validation data set contained in CreditValidationData.mat. This data set includes a table, ScorecardValidationData, that contains credit scores and their corresponding default status.

    Load the credit validation data and display the scores.

    load CreditValidationData.mat
    head(ScorecardValidationData)
        CreditScore      PD       Default
        ___________    _______    _______
    
          579.86       0.14182       0   
          563.65       0.17143       0   
          549.52       0.20106       0   
          546.25       0.20845       0   
          485.34       0.37991       0   
          482.07       0.39065       0   
          579.86       0.14182       1   
          451.73         0.494       0   
    

    Extract the variables CreditScore and Default from the table ScorecardValidationData. Use Default as the BinaryResponse input argument.

    Scores = ScorecardValidationData.CreditScore;
    BinaryResponse = ScorecardValidationData.Default;

    Compute the area under the curve of the credit scores by using the areaUnderCurve function with the fully qualified namespace risk.validation. For credit models, you can sort the scores from lower scores to higher scores by setting the SortDirection name-value argument to "ascending". This setting ensures that the function sorts the scores from higher risk individuals to lower risk individuals.

    [aucValue,Output] = risk.validation.areaUnderCurve(Scores,BinaryResponse,SortDirection="ascending")
    aucValue = 
    0.6078
    
    Output = struct with fields:
        AreaUnderCurve: 0.6078
               Metrics: [107×3 table]
    
    

    The output argument aucValue contains the AUC value. Display the metrics Threshold, TruePositiveRate, and FalsePositiveRate contained in the table Output.Metrics.

    head(Output.Metrics)
        Threshold    TruePositiveRate    FalsePositiveRate
        _________    ________________    _________________
    
         408.99                 0                   0     
         408.99          0.071429            0.012821     
         410.12          0.079365            0.017094     
         430.66          0.087302            0.017094     
         435.52          0.087302            0.025641     
         436.65           0.10317            0.029915     
         439.33           0.11905            0.029915     
         440.45           0.13492            0.029915     
    

    Input Arguments

    Score — Score values, specified as a numeric vector. The values indicate quantities such as rankings, predictions, PD estimates, or LGD estimates.

    BinaryResponse — Binary response, specified as a numeric or logical vector containing values of 1 (true) or 0 (false). The binary response represents the target state for each value in Score. For example, you can use the binary response to represent a discretized LGD target, where ones indicate a high LGD value.
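
    As an illustration of a discretized LGD target, the following sketch converts observed LGD values to a logical vector. The values in observedLGD and the cutoff of 0.5 are hypothetical and chosen only for illustration. An appropriate cutoff depends on the portfolio.

```matlab
% Hypothetical observed LGD values and a hypothetical cutoff.
observedLGD = [0.10; 0.75; 0.40; 0.90; 0.55];
cutoff = 0.5;

% Logical vector in which true (1) marks a high LGD value.
BinaryResponse = observedLGD > cutoff;
```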

    SortDirection — Sorting direction of the unique values in Score, specified as "descending" or "ascending". If Score contains credit scores, where low values commonly correspond to higher-risk individuals, you can set the sorting direction to "ascending" so that TruePositiveRate represents the proportion of defaulters. If Score contains PD values, where higher values correspond to higher risk, sorting the values in descending order is common practice.
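
    The effect of the sorting direction can be sketched with the pairwise interpretation of AUC: the fraction of positive-negative pairs in which the positive observation is ranked first, with ties counted as one half. This is a standard equivalent of the trapezoidal area, not a description of how the toolbox computes the value. The toy data and the variable names are hypothetical.

```matlab
% Toy data (hypothetical): credit-score-like values, low score = high risk.
score    = [420; 450; 450; 510; 560; 600];
response = [1;   1;   0;   1;   0;   0];   % 1 = default

pos = score(response == 1);
neg = score(response == 0);

% All positive-minus-negative score differences (implicit expansion).
d = pos - neg.';

% Descending convention: a defaulter is ranked first when its score is higher.
aucDescending = (nnz(d > 0) + 0.5 * nnz(d == 0)) / numel(d);   % 1/6

% Ascending convention: a defaulter is ranked first when its score is lower.
aucAscending = (nnz(d < 0) + 0.5 * nnz(d == 0)) / numel(d);    % 5/6
```

    Under this interpretation the two directions always sum to 1, so choosing the direction that does not match the meaning of the scores yields an AUC below 0.5 for a model that actually discriminates well.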

    Output Arguments

    aucValue — Area under the curve, corresponding to the values in Score, returned as a numeric scalar. AUC quantifies how well a model ranks customers by risk levels.

    Output — Output metrics, returned as a structure containing the following fields:

    • AreaUnderCurve — Area under curve

    • Metrics — Table with columns:

      • Threshold — Unique score values sorted according to the value of sortdir.

      • TruePositiveRate — True positive rate values corresponding to the unique scores in the Threshold column. For credit scoring models, this corresponds to the proportion of defaulters.

      • FalsePositiveRate — False positive rate values corresponding to the unique scores in the Threshold column. For credit scoring models, this corresponds to the proportion of nondefaulters.

      Metrics contains the data you need to make a Receiver Operating Characteristic (ROC) curve, which plots TruePositiveRate as a function of FalsePositiveRate.
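
    As a sketch of how the Metrics table can be turned into an ROC plot, the following code reuses the data set and function call from the example on this page and adds only standard MATLAB plotting functions. It assumes that Risk Management Toolbox and CreditValidationData.mat are available. The dashed diagonal is an optional reference line added for illustration.

```matlab
% Requires Risk Management Toolbox and its example data file.
load CreditValidationData.mat
Scores = ScorecardValidationData.CreditScore;
BinaryResponse = ScorecardValidationData.Default;

[aucValue,Output] = risk.validation.areaUnderCurve(Scores,BinaryResponse,SortDirection="ascending");

% Plot TruePositiveRate as a function of FalsePositiveRate.
plot(Output.Metrics.FalsePositiveRate,Output.Metrics.TruePositiveRate)
hold on
plot([0 1],[0 1],"--")   % reference diagonal: a model that ranks at random
hold off
xlabel("False positive rate")
ylabel("True positive rate")
title(sprintf("ROC curve (AUC = %.4f)",aucValue))
```

    The farther the curve bows above the diagonal, the larger the AUC.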

    Version History

    Introduced in R2025a