goodnessOfFit

Goodness of fit between test and reference data for analysis and validation of identified models

Description

goodnessOfFit returns fit values that represent the error norm between test and reference data sets. If you want to compare and visualize simulated model output with measurement data, see also compare.

fit = goodnessOfFit(x,xref,cost_func) returns the goodness of fit between the test data x and the reference data xref using the cost function cost_func. fit is a quantitative representation of the closeness of x to xref. To perform multiple test-to-reference fit comparisons, you can specify x and xref as cell arrays of equal size that contain multiple test and reference data sets. With cell array inputs, fit returns an array of fit values.

Examples

Find the goodness of fit between measured output data and the simulated output of an estimated model.

Obtain the measured output.

load iddata1 z1
yref = z1.y;

z1 is an iddata object containing measured input-output data. z1.y is the measured output.

Estimate a second-order transfer function model and simulate the model output y_est.

sys = tfest(z1,2);
y_est = sim(sys,z1(:,[],:)); 

Calculate the goodness of fit, or error norm, between the measured and estimated outputs. Specify the normalized root mean squared error (NRMSE) as the cost function.

cost_func = 'NRMSE';
y = y_est.y;
fit = goodnessOfFit(y,yref,cost_func) 
fit = 
0.2943

Alternatively, you can use compare to calculate the fit. compare uses the NRMSE cost function and expresses the fit as a percentage, using the complement of the error norm. The relationship between the compare and goodnessOfFit results is therefore fit_compare = (1 - fit_gof)*100. A compare result of 100% is equivalent to a goodnessOfFit result of 0.

Specify an initial condition of zero to match the initial condition that goodnessOfFit assumes.

opt = compareOptions('InitialCondition','z');
compare(z1,sys,opt);

[Figure: compare plot of the validation data (y1) and the simulated response of sys, with a reported fit of 70.57%.]

The fit results are equivalent.
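The equivalence can be checked by hand. The following is a minimal sketch that uses only the value reported above; the variable names fit_gof, fit_compare, and fit_back are chosen here for illustration and are not part of either function's interface.

```matlab
% goodnessOfFit result reported above for the NRMSE cost function.
fit_gof = 0.2943;

% Convert the error norm to a compare-style fit percentage.
fit_compare = (1 - fit_gof)*100;    % 70.57

% Convert a compare-style percentage back to an error norm.
fit_back = 1 - fit_compare/100;
```

The conversion is a simple affine map, so it works element-wise on vectors or matrices of fit values as well.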

Find the goodness of fit between measured and estimated outputs for two models.

Obtain the input-output measurements z2 from iddata2. Copy the measured output into reference output yref.

load iddata2 z2
yref = z2.y;

Estimate second-order and fourth-order transfer function models using z2.

sys2 = tfest(z2,2);
sys4 = tfest(z2,4);

Simulate both systems to get estimated outputs.

y_sim2 = sim(sys2,z2(:,[],:));
y2 = y_sim2.y;
y_sim4 = sim(sys4,z2(:,[],:));
y4 = y_sim4.y;

Create cell arrays from the reference and estimated outputs. The reference data set is the same for both model comparisons, so create identical reference cells.

yrefc = {yref yref};
yc = {y2 y4};

Compute fit values for the three cost functions.

fit_nrmse = goodnessOfFit(yc,yrefc,'NRMSE')
fit_nrmse = 1×2

    0.1429    0.1345

fit_nmse = goodnessOfFit(yc,yrefc,'NMSE')
fit_nmse = 1×2

    0.0204    0.0181

fit_mse = goodnessOfFit(yc,yrefc,'MSE')
fit_mse = 1×2

    1.0811    0.9586

A fit value of 0 indicates a perfect fit between reference and estimated outputs. The fit value increases as the quality of the fit decreases. For all three cost functions, the fourth-order model produces a better fit than the second-order model.
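Because a smaller value means a better fit, choosing between candidate models reduces to taking a minimum. The sketch below hard-codes the NRMSE values reported above; the names orders, best_fit, and best_order are illustrative only.

```matlab
% NRMSE fit values reported above for the second- and fourth-order models.
fit_nrmse = [0.1429 0.1345];
orders = [2 4];

% The smallest error norm identifies the best-fitting model.
[best_fit, idx] = min(fit_nrmse);
best_order = orders(idx);           % 4
```

A lower error norm on the estimation data alone does not rule out overfitting, so a check like this is usually more informative on separate validation data.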

Input Arguments

x: Data to test, specified as a matrix or cell array.

  • For a single test data set, specify an Ns-by-N matrix, where Ns is the number of samples and N is the number of channels. You must specify cost_func as 'NRMSE' or 'NMSE' to use multiple-channel data.

  • For multiple test data sets, specify a cell array of length Nd, where Nd is the number of test-to-reference pairs and each cell contains one data matrix.

x must not contain any NaN or Inf values.

xref: Reference data with which to compare x, specified as a matrix or cell array.

  • For a single reference data set, specify an Ns-by-N matrix, where Ns is the number of samples and N is the number of channels. xref must be the same size as x. You must specify cost_func as 'NRMSE' or 'NMSE' to use multiple-channel data.

  • For multiple reference data sets, specify a cell array of length Nd, where Nd is the number of test-to-reference pairs and each cell contains one reference data matrix. As with the individual data matrices, the cell array sizes for x and xref must match. The ith element of fit corresponds to the pair formed by the ith cells of x and xref.

xref must not contain any NaN or Inf values.

cost_func: Cost function to determine goodness of fit, specified as one of the following values. In the equations, the value fit applies to a single pairing of test and reference data sets.

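| Value | Description | Equation | Notes |
|---|---|---|---|
| 'MSE' | Mean squared error | $\text{fit} = \dfrac{\lVert x - \text{xref} \rVert^{2}}{N_s}$, where $N_s$ is the number of samples and $\lVert \cdot \rVert$ indicates the 2-norm of a vector. | fit is a scalar. |
| 'NRMSE' | Normalized root mean squared error | $\text{fit}(i) = \dfrac{\lVert \text{xref}(:,i) - x(:,i) \rVert}{\lVert \text{xref}(:,i) - \text{mean}(\text{xref}(:,i)) \rVert}$, where $\lVert \cdot \rVert$ indicates the 2-norm of a vector and i = 1,...,N, where N is the number of channels. | fit is a row vector of length N. 'NRMSE' is the cost function used by compare. |
| 'NMSE' | Normalized mean squared error | $\text{fit}(i) = \dfrac{\lVert \text{xref}(:,i) - x(:,i) \rVert^{2}}{\lVert \text{xref}(:,i) - \text{mean}(\text{xref}(:,i)) \rVert^{2}}$ | fit is a row vector. |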
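The three cost functions can be sketched in base MATLAB as ratios of 2-norms. This is an illustrative reading of the definitions on this page, not the implementation of goodnessOfFit; the synthetic data and all variable names are assumptions made for the example.

```matlab
% Synthetic two-channel data (hypothetical values, for illustration only).
t = (0:0.1:10)';
xref = [sin(t), cos(t) + 2];            % reference data, Ns-by-N
x = xref + 0.1*[cos(3*t), sin(5*t)];    % test data with a small error

err = xref - x;                         % per-sample error, Ns-by-N
dev = xref - mean(xref,1);              % deviation from each channel mean
                                        % (implicit expansion, R2016b or later)

% Column-wise 2-norms of the error and of the deviation from the mean.
errNorm = sqrt(sum(err.^2,1));
devNorm = sqrt(sum(dev.^2,1));

fit_nrmse = errNorm ./ devNorm;         % 1-by-N row vector
fit_nmse = fit_nrmse.^2;                % 1-by-N row vector

% MSE shown for a single channel (first column), per the multichannel restriction.
Ns = size(x,1);
fit_mse = sum(err(:,1).^2) / Ns;        % scalar
```

If System Identification Toolbox is available, comparing these values with the output of goodnessOfFit(x,xref,'NRMSE') is a reasonable sanity check on the sketch.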

Output Arguments

fit: Goodness of fit between test and reference data set pairs, returned as a scalar, a row vector, or a matrix.

  • For a single test and reference data set pair, fit is returned as a scalar or row vector.

    • If cost_func is 'MSE', then fit is a scalar.

    • If cost_func is 'NRMSE' or 'NMSE', then fit is a row vector of length N, where N is the number of channels.

  • For multiple test and reference data set pairs, where x and xref are cell arrays of length Nd, fit is returned as a vector or a matrix.

    • If cost_func is 'MSE', then fit is a row vector of length Nd.

    • If cost_func is 'NRMSE' or 'NMSE', then fit is a matrix of size N-by-Nd, where N is the number of channels (data columns) and Nd is the number of test-to-reference pairs.

    Each element of fit contains the goodness of fit values for the corresponding test data and reference pair.

Possible values for individual fit elements depend on the selection of cost_func.

  • If cost_func is 'MSE', each fit value is a nonnegative scalar that grows with the error between test and reference data. A fit value of 0 indicates a perfect match between test and reference data.

  • If cost_func is 'NRMSE' or 'NMSE', fit values vary between 0 and Inf.

    • 0: Perfect fit to reference data (zero error)

    • Inf: Bad fit

    • 1: x is no better than a straight line at matching xref
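The reference points 0 and 1 can be reproduced with a small single-channel sketch. It assumes that NRMSE is the ratio of the error norm to the norm of the reference's deviation from its mean, as in the cost function definitions on this page; the anonymous function nrmse and the data are hypothetical and stand in for goodnessOfFit.

```matlab
% NRMSE as a ratio of 2-norms (single-channel sketch, assumed definition).
nrmse = @(x,xref) norm(xref - x) / norm(xref - mean(xref));

xref = [1; 2; 3; 4];

% Test data identical to the reference: zero error.
fit_perfect = nrmse(xref, xref);                        % 0

% Test data equal to the constant mean of the reference.
fit_mean = nrmse(mean(xref)*ones(size(xref)), xref);    % 1

% Test data far from the reference: error norm exceeds the deviation norm.
fit_poor = nrmse(xref + 10, xref);                      % greater than 1
```

Under this definition, the "straight line" benchmark is the constant line at the mean of xref, and any test signal that tracks the reference worse than that line scores above 1.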

Version History

Introduced in R2012a
