Time until first failure test for value-at-risk (VaR) backtesting

`TestResults = tuff(vbt)` generates the time until first failure (TUFF) test for value-at-risk (VaR) backtesting.

`TestResults = tuff(vbt,Name,Value)` adds an optional name-value pair argument for `TestLevel`.

The likelihood ratio (test statistic) of the `tuff` test is given by

$$LRatioTUFF=-2\mathrm{log}\left(\frac{pVaR{\left(1-pVaR\right)}^{n-1}}{\left(\frac{1}{n}\right){\left(1-\frac{1}{n}\right)}^{n-1}}\right)=-2(\mathrm{log}(pVaR)+(n-1)\mathrm{log}(1-pVaR)+n\mathrm{log}(n)-(n-1)\mathrm{log}(n-1))$$

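where *n* is the number of periods until the first failure and
*pVaR* = 1 − *VaRLevel*. If *n* = `1`, then log(*n*) = 0 and the terms multiplied by *n* − 1 vanish (taking (*n* − 1)log(*n* − 1) as its limit, 0), so the likelihood ratio reduces to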

$$LRatioTUFF=-2\mathrm{log}(pVaR)$$

The likelihood ratio *LRatioTUFF* is asymptotically distributed as a chi-square distribution with 1 degree of freedom.
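The expanded form of the likelihood ratio can be evaluated directly. The following is a minimal Python sketch, not the toolbox implementation; the function name `tuff_likelihood_ratio` and its argument names are hypothetical and chosen only for illustration. It handles *n* = 1 separately so that the (*n* − 1)log(*n* − 1) term is never evaluated at log(0).

```python
import math


def tuff_likelihood_ratio(n: int, p_var: float) -> float:
    """TUFF likelihood ratio for a first failure at period n (n >= 1).

    p_var is 1 - VaRLevel, for example 0.05 for a 95% VaR.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if not 0.0 < p_var < 1.0:
        raise ValueError("p_var must lie strictly between 0 and 1")

    # log(pVaR) + (n-1)log(1-pVaR) + n log(n) - (n-1)log(n-1)
    total = math.log(p_var) + (n - 1) * math.log(1.0 - p_var) + n * math.log(n)
    if n > 1:
        # For n = 1 this term is taken as its limit, 0, so it is skipped.
        total -= (n - 1) * math.log(n - 1)
    return -2.0 * total


if __name__ == "__main__":
    # First failure on the very first period of a 95% VaR backtest.
    print(tuff_likelihood_ratio(1, 0.05))
    # First failure at n = 1/pVaR, where the statistic is (numerically) zero.
    print(tuff_likelihood_ratio(20, 0.05))
```

The statistic is zero when *n* = 1/*pVaR*, because the numerator and denominator of the ratio then coincide, and it grows as the first failure arrives earlier or later than that.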

The *p*-value of the `tuff` test is the probability that a chi-square distribution with 1 degree of freedom exceeds the likelihood ratio *LRatioTUFF*:

$$PValueTUFF=1-F(LRatioTUFF)$$

where *F* is the cumulative distribution function of a chi-square variable with 1 degree of freedom.
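For 1 degree of freedom, the chi-square cumulative distribution has a closed form in terms of the error function, *F*(*x*) = erf(√(*x*/2)), so the *p*-value can be sketched with the Python standard library alone. The function names below are hypothetical, and the clamp at zero is an added safeguard against tiny negative values produced by rounding, not something the source describes.

```python
import math


def chi2_cdf_1dof(x: float) -> float:
    """CDF of a chi-square variable with 1 degree of freedom."""
    # If Z is standard normal, P(Z**2 <= x) = P(|Z| <= sqrt(x)) = erf(sqrt(x / 2)).
    return math.erf(math.sqrt(max(x, 0.0) / 2.0))


def tuff_p_value(lr: float) -> float:
    """p-value 1 - F(lr), computed with erfc to keep precision in the tail."""
    return math.erfc(math.sqrt(max(lr, 0.0) / 2.0))


if __name__ == "__main__":
    lr = -2.0 * math.log(0.05)  # likelihood ratio for n = 1 and pVaR = 0.05
    print(chi2_cdf_1dof(lr), tuff_p_value(lr))
```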

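The result of the test is to accept if

$$F(LRatioTUFF)<TestLevel$$

and reject otherwise, where *F* is the cumulative distribution function of a chi-square variable with 1 degree of freedom and *TestLevel* is the test confidence level. Equivalently, the test accepts if *PValueTUFF* > 1 − *TestLevel*.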
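The decision step can be sketched as follows. This assumes that the acceptance condition compares the chi-square CDF value of the likelihood ratio with the test confidence level, with `TestLevel` given as a probability such as 0.95. The function name `tuff_decision` and the default of 0.95 are illustrative assumptions, not taken from the toolbox.

```python
import math


def tuff_decision(lr: float, test_level: float = 0.95) -> str:
    """Return "accept" or "reject" for a given TUFF likelihood ratio.

    Assumption: test_level is a confidence level in (0, 1) that is compared
    with F(lr), the chi-square CDF with 1 degree of freedom.
    """
    if not 0.0 < test_level < 1.0:
        raise ValueError("test_level must lie strictly between 0 and 1")
    cdf = math.erf(math.sqrt(max(lr, 0.0) / 2.0))  # F(lr) for 1 degree of freedom
    return "accept" if cdf < test_level else "reject"


if __name__ == "__main__":
    lr = -2.0 * math.log(0.05)  # first failure at n = 1 with pVaR = 0.05
    print(tuff_decision(lr, 0.95))
    print(tuff_decision(lr, 0.99))
```

Raising the test confidence level makes rejection harder, so the same likelihood ratio can be rejected at one level and accepted at a higher one.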

If the sample has no failures, the test statistic is not defined. The `tuff` function distinguishes two cases:

- If the number of observations is large enough that no matter when the first failure occurred it would be too late to pass the test, then the model is rejected. Technically, this happens if the number of observations *N* is larger than `1`/*pVaR* (large enough relative to the VaR confidence level) and if the test fails when *n* = *N* + `1` (the earliest observation for the first VaR failure). In this case, the likelihood ratio is reported for *n* = *N* + `1`, and the corresponding *p*-value.
- In all other cases, it is not possible to tell with certainty whether the result of the test would eventually be to accept or reject the model. There are ranges of possible first failure values that would result in accepting or rejecting the model. In these cases, the `tuff` function accepts the model and reports undefined (`NaN`) values for the likelihood ratio and *p*-value.
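The two no-failure cases can be sketched in Python as below. This is an illustration of the rule as described here, not the toolbox code: the function name `tuff_no_failures`, its return shape, and the default test level of 0.95 are assumptions, and "the test fails" is read as the chi-square CDF value of the likelihood ratio reaching the test confidence level.

```python
import math


def tuff_no_failures(num_obs: int, p_var: float, test_level: float = 0.95):
    """Handle a backtest sample of num_obs observations with no VaR failures.

    Returns (decision, likelihood_ratio, p_value). The statistic is evaluated
    at n = num_obs + 1, the earliest period at which a first failure could
    still occur.
    """
    if num_obs < 1 or not 0.0 < p_var < 1.0:
        raise ValueError("need num_obs >= 1 and 0 < p_var < 1")

    n = num_obs + 1  # always >= 2, so log(n - 1) is defined
    lr = -2.0 * (
        math.log(p_var)
        + (n - 1) * math.log(1.0 - p_var)
        + n * math.log(n)
        - (n - 1) * math.log(n - 1)
    )
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))  # 1 - F(lr), 1 degree of freedom

    # Case 1: N > 1/pVaR and the test already fails at n = N + 1.
    if num_obs > 1.0 / p_var and (1.0 - p_value) >= test_level:
        return "reject", lr, p_value

    # Case 2: outcome cannot be determined; accept and report undefined values.
    return "accept", math.nan, math.nan


if __name__ == "__main__":
    print(tuff_no_failures(250, 0.05))  # long sample with no failures
    print(tuff_no_failures(30, 0.05))   # shorter sample, outcome undetermined
```

The check *N* > 1/*pVaR* matters because the likelihood ratio is zero at *n* = 1/*pVaR* and increases as *n* moves beyond it, so once the test fails at *n* = *N* + 1 it also fails for every later first-failure time.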

[1] Kupiec, P. "Techniques for Verifying the Accuracy of Risk Management
Models." *Journal of Derivatives.* Vol. 3, 1995, pp.
73–84.