Validation of External Models

This example uses:

Modelscape for MATLAB

This example shows how to validate models existing outside of MATLAB® using Modelscape™ software. An external model can be implemented in Python®, R, SAS®, or it can be a MATLAB model deployed on a web service.

This example covers the use of external models from the point of view of a model validator or other model end user, under the assumption that a model has been deployed to a microservice created using, for example, Flask (Python) or Plumber (R). The example also explains how to implement these microservices and alternative interfaces.

First, you call an external model from MATLAB and evaluate it with different inputs. Then, you see how to implement the API that makes any external model callable from MATLAB.

Call External Model from MATLAB

Set up an interface and call an externally deployed model. This example uses a Python model, but you can customize the example for a model in other programming languages.

Configure External Model

Use the Python code in the Flask Interface section to set up a mock credit scoring model. This model adds noise to the input data, scales it, and returns the value as the output credit score of the applicant. Run the script to make this model available in a test development server. The output shows the URL of the server, such as http://127.0.0.1:5000. Set this URL as the value of ModelURL below. In an actual model validation exercise, the model developer can provide this information as part of the validation request.

ModelURL = "http://127.0.0.1:5000/";

Set up a connection to this model.

extModel = externalModelClient("RootURL",ModelURL)

extModel = 
  ExternalModel with properties:

        InputNames: "income"
    ParameterNames: "weight"
       OutputNames: "score"

The model expects a single input called income and a single parameter called weight. The model returns a single output called score. The model developer should have explained the types and sizes of the model inputs, parameters, and outputs in the model documentation. You can also find this information in the InputDefinition, ParameterDefinition, and OutputDefinition properties of extModel. The sizes field is empty, which indicates that the model expects a scalar.

extModel.InputDefinition

ans = struct with fields:
        name: "income"
       sizes: []
    dataType: [1×1 struct]

extModel.InputDefinition.dataType

ans = struct with fields:
    name: "double"

Evaluate Model

Use the evaluate function of ExternalModel on your model. This function expects two inputs:

The first input must be a table. Each row of the table must consist of the data for a single customer, or a single 'run'. The table is then a 'batch of runs'. The variable names of the table must match the InputNames shown by ExternalModel.
The second input is a struct whose fields match the ParameterNames shown by ExternalModel. The values carried by this struct apply to all the runs in the batch. If the model has no parameters, omit this input.

The primary output is a table whose variable names match the OutputNames shown by ExternalModel. The rows correspond to the runs in the input batch. There may also be run-specific diagnostics consisting of one struct per run and a single batch diagnostic struct.

Use random numbers in the range [0, 100,000] as customer incomes for the input data. For parameters, use a weight of 1.1.

N = 1000;
income = 1e5*rand(N,1);
inputData = table(income,VariableNames="income");
parameters = struct(weight=1.1);

Call your model. The output is a table with the same size as the inputs.

[modelScores, diagnostics, batchDiagnostics] = evaluate(extModel, inputData, parameters);
head(modelScores)

              score 
             _______

    Row_1      102.2
    Row_2      131.1
    Row_3     45.658
    Row_4     24.337
    Row_5    -1.0934
    Row_6     29.777
    Row_7     55.824
    Row_8      98.42

Create a mock response variable by thresholding the income. Validate the scores of the model against this response variable.

defaultIndicators = income < 20000;
aurocMetric = mrm.data.validation.pd.AUROC(defaultIndicators, modelScores.score);
formatResult(aurocMetric)

ans = 
"Area under ROC curve is 0.8462"

visualize(aurocMetric);
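
To make the metric concrete, the following is a minimal Python sketch of the area under the ROC curve, computed as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. It is not the Modelscape implementation. The function name auroc and the sample values are hypothetical, and whether a high or a low score should indicate default is a convention that this sketch does not settle.

```python
def auroc(labels, scores):
    """Pairwise AUROC: chance that a positive outscores a negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y]
    negatives = [s for y, s in zip(labels, scores) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: two positive and two negative cases.
example_value = auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The double loop is quadratic in the number of rows, which is fine for illustration but slower than a rank-based computation on large batches.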

The model returns some diagnostics to illustrate their size and shape. The run-specific diagnostics are a single structure with a field for every run.

diagnostics.Row_125

ans = struct with fields:
    noise: -4.9172e+04

In this case, each structure contains a noise term that the model uses to predict the credit score. Batch diagnostics consist of a single structure containing information shared across all runs. In this case, the structure contains the elapsed valuation time at the server side.

batchDiagnostics

batchDiagnostics = struct with fields:
    valuationTime: 0.0020

Extra Arguments

Internally, ExternalModel talks to the model through a REST API. If necessary, modify the headers and HTTP options used in this exchange by passing extra Headers and Options arguments to externalModelClient. The Headers arguments must be matlab.net.http.HeaderField objects. The Options arguments must be matlab.net.http.HTTPOptions objects.

For example, extend the connection timeout to 20 seconds.

options = matlab.net.http.HTTPOptions(ConnectTimeout=20);
extModel = externalModelClient(RootURL=ModelURL,Options=options)

Implement External Model Interface

Implement an API for an external model to call it from MATLAB.

The externalModelClient function creates an mrm.validation.external.ExternalModel object. This object communicates with the external model through a REST API. The object works with any model that implements the API you implement in this example.

Endpoints

The API must implement two endpoints:

/signature must accept a GET request and return a JSON string carrying the information about inputs, parameters, and outputs.
/evaluate must accept a POST request with inputs and parameters in a JSON format and must return a payload containing outputs, diagnostics, and batch diagnostics as a JSON string.

The status code for a successful response must be 200 OK, which is the default in Flask, for example.

Evaluation Inputs

The /evaluate endpoint accepts a JSON payload with two top-level fields, inputs and parameters.

Within inputs, columns lists the input names, index specifies the row names, and data contains the actual input data one row at a time. The parameters field records the parameters with their values. The values are, for example, double or string data.

The inputs field is compatible with the construction of pandas DataFrames with split orientation. For more information, see the example implementation in the Flask Interface section.
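
As an illustration, this Python sketch assembles a request body with that shape using only the standard library. The layout is inferred from the Flask example in this document, which reads data['inputs'] and data['parameters']. The helper name build_evaluate_payload and the sample incomes are hypothetical, and the row names follow the Row_1, Row_2 pattern seen in the MATLAB output above.

```python
import json


def build_evaluate_payload(columns, index, data, parameters):
    """Assemble an /evaluate request body in split orientation."""
    if len(index) != len(data):
        raise ValueError("index and data must have one entry per run")
    for row in data:
        if len(row) != len(columns):
            raise ValueError("each data row needs one value per column")
    return {
        "inputs": {
            "columns": list(columns),          # input names
            "index": list(index),              # row names, one per run
            "data": [list(r) for r in data],   # one inner list per run
        },
        "parameters": dict(parameters),        # shared by all runs in the batch
    }


# Hypothetical batch of two runs.
payload = build_evaluate_payload(
    columns=["income"],
    index=["Row_1", "Row_2"],
    data=[[52000.0], [18000.0]],
    parameters={"weight": 1.1},
)
body = json.dumps(payload)
```

The keys under inputs match the data, index, and columns arguments that the Flask example passes to the pandas DataFrame constructor.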

Response Formats

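The /signature endpoint returns a JSON payload with the fields inputs, parameters, and outputs. Each field is a list of definitions, and each definition carries a name, a dataType, and sizes, as in the InputDefinition property shown earlier.

The /evaluate endpoint returns a JSON payload with the fields outputs, diagnostics, and batchDiagnostics.

The outputs data are compatible with the JSON output of pandas DataFrames with split orientation.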
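
The following Python sketch shows one way to read both responses using only the standard library. The field layout is inferred from the Flask example in this document, where outputs is produced by to_json and therefore arrives as a JSON string nested inside the response. The function names and all sample values are hypothetical.

```python
import json


def check_signature(signature):
    """Return the names declared in a /signature payload; raise KeyError if a field is missing."""
    names = {}
    for section in ("inputs", "parameters", "outputs"):
        names[section] = []
        for definition in signature[section]:
            definition["dataType"]["name"]  # must be present
            definition["sizes"]             # must be present; [] means scalar
            names[section].append(definition["name"])
    return names


def parse_evaluate_response(text):
    """Split an /evaluate response into per-run outputs, diagnostics, and batch diagnostics."""
    payload = json.loads(text)
    outputs = payload["outputs"]
    if isinstance(outputs, str):  # nested JSON string, as produced by to_json
        outputs = json.loads(outputs)
    rows = {
        name: dict(zip(outputs["columns"], values))
        for name, values in zip(outputs["index"], outputs["data"])
    }
    return rows, payload.get("diagnostics", {}), payload.get("batchDiagnostics", {})


# Hypothetical payloads with the same shape as the Flask example returns.
sample_signature = {
    "inputs": [{"name": "income", "dataType": {"name": "double"}, "sizes": []}],
    "parameters": [{"name": "weight", "dataType": {"name": "double"}, "sizes": []}],
    "outputs": [{"name": "score", "dataType": {"name": "double"}, "sizes": []}],
}
sample_response = json.dumps({
    "outputs": json.dumps({
        "columns": ["score"],
        "index": ["Row_1", "Row_2"],
        "data": [[61.5], [12.25]],
    }),
    "diagnostics": {"Row_1": {"noise": 3909.0}, "Row_2": {"noise": -6863.5}},
    "batchDiagnostics": {"valuationTime": 0.002},
})

names = check_signature(sample_signature)
rows, diagnostics, batch = parse_evaluate_response(sample_response)
```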

Work with Alternative APIs

You can make external models available to a model validator in MATLAB when the default API is impossible or inconvenient to implement, for example, when your organization already has a preferred REST API for evaluating models. In this case, implement an API class that inherits from mrm.validation.external.ExternalModelBase. Package this implementation in a +mrm/+validation/+external/ folder on the MATLAB path.

This custom API must populate the InputNames, ParameterNames, and OutputNames properties shown to the validator after an externalModelClient call. The API must also implement the evaluate function, which takes as inputs a table and a structure as in the default API ExternalModel. The custom API must serialize the inputs, manage the REST API calls, and deserialize the outputs into tables and structures, as shown in Evaluate Model.

When you implement a custom API as, say, mrm.validation.external.CustomAPI, the validator can initialize a connection to the model through this client by adding an APIType argument to the externalModelClient call. The call also passes any additional arguments on to CustomAPI.

extModelNew = externalModelClient("APIType", "CustomAPI", "RootURL", ModelURL)

Flask Interface

This Python code configures the external model you use in the first part of this example.

from flask import Flask, request, jsonify
import pandas as pd
import numpy as np
import time

toyModel = Flask(__name__)

@toyModel.route('/evaluate', methods=['POST'])
def calc():
    start = time.time()
    data = request.get_json()
    # Rebuild the input table from its split-orientation parts.
    inputData = data['inputs']
    inputDF = pd.DataFrame(inputData['data'], columns=inputData['columns'], index=inputData['index'])
    parameters = data['parameters']

    # Draw one uniform noise term per run.
    noise = np.random.uniform(low=-50000, high=50000, size=inputDF.shape)

    # score = (income + noise) * weight / 1000
    outDF = inputDF.rename(columns={'income':'score'})
    outDF = outDF.add(noise)
    outDF = outDF.mul(parameters['weight']/1000)

    # Run-specific diagnostics, keyed by row name.
    diagnostics = pd.DataFrame(noise, columns=["noise"], index=inputDF.index)
    end = time.time()
    batchDiagnostics = {'valuationTime' : end - start}
    # to_json returns a string, so outputs is sent as a JSON string in split orientation.
    output = {'outputs': outDF.to_json(orient='split'),
              'diagnostics' : diagnostics.to_dict(orient='index'),
              'batchDiagnostics' : batchDiagnostics}
    return output

@toyModel.route('/signature', methods=['GET'])
def getInputs():
    # Describe the inputs, parameters, and outputs; empty sizes means scalar.
    outData = {
        'inputs': [{"name": "income", "dataType": {"name": "double"}, "sizes": []}],
        'parameters': [{"name": "weight", "dataType": {"name": "double"}, "sizes": []}],
        'outputs': [{"name": "score", "dataType": {"name": "double"}, "sizes": []}]
        }
    return jsonify(outData)


if __name__ == '__main__':
    # Development server for local testing only: debug mode and host='0.0.0.0'
    # should not be exposed on an untrusted network.
    toyModel.run(debug=True, host='0.0.0.0')