A univariate time series *y*_{t} is *integrated* if it can be brought to stationarity through differencing. The number of differences required to achieve stationarity is called the *order of integration*; series of order *d* are denoted I(*d*), and stationary series are denoted I(0).

An *n*-dimensional time series *y*_{t} is *cointegrated* if some linear combination *β*_{1}*y*_{1t} + ... + *β*_{n}*y*_{nt} of the component variables is stationary. The combination is called a *cointegrating relation*, and the coefficients *β* = (*β*_{1}, ..., *β*_{n})′ form a *cointegrating vector*.

Cointegration is distinguished from traditional economic equilibrium, in which a balance of forces produces stable long-term levels in the variables. Cointegrated variables are generally unstable in their levels, but exhibit mean-reverting "spreads" (generalized by the cointegrating relation) that force the variables to move around common stochastic trends. Cointegration is also distinguished from the short-term synchronies of positive covariance, which only measures the tendency to move together at each time step. Modification of the VAR model to include cointegrated variables balances the short-term dynamics of the system with long-term tendencies.
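The distinction can be seen in a small simulation. The following NumPy sketch (illustrative only; the rest of this page uses MATLAB) builds two series that share a common random-walk trend: each level series wanders without bound, but the spread formed by the cointegrating combination is mean-reverting:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5000

# Common stochastic trend: a random walk shared by both series
trend = np.cumsum(rng.normal(size=T))

# Two cointegrated series: unstable levels, stationary spread
y1 = trend + rng.normal(size=T)
y2 = 0.5 * trend + rng.normal(size=T)

# The cointegrating combination beta = [1, -2] eliminates the trend
spread = y1 - 2.0 * y2

# Levels wander (large variance); the spread mean-reverts (small variance)
print(np.var(y1) > 10 * np.var(spread))
```

Note that positive covariance between `y1` and `y2` at each step would not, by itself, imply that any combination of the levels is stationary.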

The tendency of cointegrated variables to revert to common stochastic
trends is expressed in terms of *error-correction*.
If *y*_{t} is an *n*-dimensional time series and *β* is a cointegrating vector, then the combination *β*′*y*_{t−1} measures the "error" in the data (the deviation from the stationary mean) at time *t*−1. The rate at which the series correct from disequilibrium is represented by a vector *α* of adjustment speeds, which are incorporated into the VAR model at time *t* through a multiplicative error-correction term *αβ*′*y*_{t−1}.

In general, there may be multiple cointegrating relations among
the variables in *y*_{t}, in which case the vectors *α* and *β* become *n*-by-*r* matrices *A* and *B*, with each column of *B* representing a specific relation. The error-correction term becomes *AB*′*y*_{t−1} = *Cy*_{t−1}. Adding the error-correction term to a VAR model in differences produces the *vector error-correction (VEC) model*:

$$\Delta {y}_{t}=C{y}_{t-1}+{\displaystyle \sum _{i=1}^{q}{B}_{i}\Delta {y}_{t-i}}+{\epsilon}_{t}.$$

If the variables in *y*_{t} are all I(1), the terms involving differences are stationary, leaving only the error-correction term to introduce long-term stochastic trends. The rank of the impact matrix *C* determines the long-term dynamics: if *C* has full rank, the system is stationary in levels; if *C* has rank 0, the error-correction term disappears and the system is stationary in differences. Intermediate reduced ranks *r*, between 0 and *n*, correspond to *r* independent cointegrating relations, with *C* factored into (nonunique) *n*-by-*r* matrices *A* and *B* satisfying *C* = *AB*′.

By collecting differences, a VEC(*q*) model
can be converted to a VAR(*p*) model in levels, with *p* = *q*+1:

$${y}_{t}={A}_{1}{y}_{t-1}+\mathrm{...}+{A}_{p}{y}_{t-p}+{\epsilon}_{t}.$$

Conversion between VEC(*q*) and VAR(*p*) representations of an *n*-dimensional system is carried out by the functions `vectovar` and `vartovec` using the formulas:

$$\begin{array}{l}{A}_{1}=C+{I}_{n}+{B}_{1}\\ {A}_{i}={B}_{i}-{B}_{i-1},\quad i=2,\ldots,q\\ {A}_{p}=-{B}_{q}\end{array}\Bigg\}\quad \text{VEC}(q)\text{ to VAR}(p=q+1)\text{ (using }\mathtt{vectovar}\text{)}$$

$$\begin{array}{l}C={\displaystyle \sum _{i=1}^{p}{A}_{i}}-{I}_{n}\\ {B}_{i}=-{\displaystyle \sum _{j=i+1}^{p}{A}_{j}},\quad i=1,\ldots,q\end{array}\Bigg\}\quad \text{VAR}(p)\text{ to VEC}(q=p-1)\text{ (using }\mathtt{vartovec}\text{)}$$
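As a check on the formulas, here is a short NumPy sketch (illustrative only, not the Toolbox functions themselves) that applies the two conversions and confirms they are inverses of one another:

```python
import numpy as np

def vec_to_var(C, B):
    """VEC(q) -> VAR(p), p = q+1. B is the list [B1,...,Bq]."""
    n = C.shape[0]
    q = len(B)
    A = [C + np.eye(n) + B[0]]                 # A1 = C + I + B1
    A += [B[i] - B[i-1] for i in range(1, q)]  # Ai = Bi - B(i-1)
    A.append(-B[q-1])                          # Ap = -Bq
    return A

def var_to_vec(A):
    """VAR(p) -> VEC(q), q = p-1. A is the list [A1,...,Ap]."""
    n = A[0].shape[0]
    p = len(A)
    C = sum(A) - np.eye(n)                     # C = sum(Ai) - I
    B = [-sum(A[j] for j in range(i+1, p))     # Bi = -sum_{j>i} Aj
         for i in range(p-1)]
    return C, B

# Round trip: VAR(3) -> VEC(2) -> VAR(3) recovers the original coefficients
rng = np.random.default_rng(1)
A = [rng.normal(size=(2, 2)) for _ in range(3)]
C, B = var_to_vec(A)
A2 = vec_to_var(C, B)
print(all(np.allclose(a, a2) for a, a2 in zip(A, A2)))  # True
```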

Because of the equivalence of the
two representations, a VEC model with a reduced-rank error-correction
coefficient is often called a *cointegrated VAR model*.
In particular, cointegrated VAR models can be simulated and forecast
using standard VAR techniques.

The cointegrated VAR model is often augmented with exogenous
terms *Dx*:

$$\Delta {y}_{t}=A{B}^{\prime}{y}_{t-1}+{\displaystyle \sum _{i=1}^{q}{B}_{i}\Delta {y}_{t-i}}+Dx+{\epsilon}_{t}.$$

Variables in *x* may include
seasonal or interventional dummies, or deterministic terms representing
trends in the data. Since the model is expressed in differences Δ*y*_{t}, constant and linear terms in *x* represent linear and quadratic trends, respectively, in the levels of the data. The following cases for *AB*′*y*_{t−1} + *Dx* distinguish the presence of intercepts and trends inside and outside of the cointegrating relations:

| Case | Form of *AB*′*y*_{t−1} + *Dx* | Model Interpretation |
| --- | --- | --- |
| H2 | *AB*′*y*_{t−1} | There are no intercepts or trends in the cointegrating relations and there are no trends in the data. This model is only appropriate if all series have zero mean. |
| H1* | *A*(*B*′*y*_{t−1} + *c*_{0}) | There are intercepts in the cointegrating relations and there are no trends in the data. This model is appropriate for nontrending data with nonzero mean. |
| H1 | *A*(*B*′*y*_{t−1} + *c*_{0}) + *c*_{1} | There are intercepts in the cointegrating relations and there are linear trends in the data. This is a model of *deterministic cointegration*, where the cointegrating relations eliminate both stochastic and deterministic trends in the data. |
| H* | *A*(*B*′*y*_{t−1} + *c*_{0} + *d*_{0}*t*) + *c*_{1} | There are intercepts and linear trends in the cointegrating relations and there are linear trends in the data. This is a model of *stochastic cointegration*, where the cointegrating relations eliminate stochastic but not deterministic trends in the data. |
| H | *A*(*B*′*y*_{t−1} + *c*_{0} + *d*_{0}*t*) + *c*_{1} + *d*_{1}*t* | There are intercepts and linear trends in the cointegrating relations and there are quadratic trends in the data. Unless quadratic trends are actually present in the data, this model may produce good in-sample fits but poor out-of-sample forecasts. |

In Econometrics Toolbox™, deterministic terms outside of
the cointegrating relations, *c*_{1} and *d*_{1},
are identified by projecting constant and linear regression coefficients,
respectively, onto the orthogonal complement of *A*.

Integration and cointegration both present opportunities for transforming variables to stationarity. Integrated variables, identified by unit root and stationarity tests, can be differenced to stationarity. Cointegrated variables, identified by cointegration tests, can be combined to form new, stationary variables. In practice, it must be determined if such transformations lead to more reliable models, with variables that retain an economic interpretation.

Generalizing from the univariate case can be misleading. In the standard Box-Jenkins [15] approach to univariate ARMA modeling, stationarity is an essential assumption. Without it, the underlying distribution theory and estimation techniques become invalid. In the corresponding multivariate case, where the VAR model is unrestricted and there is no cointegration, choices are less straightforward. If the goal of a VAR analysis is to determine relationships among the original variables, differencing loses information. In this context, Sims, Stock, and Watson [89] advise against differencing, even in the presence of unit roots. If, however, the goal is to simulate an underlying data-generating process, integrated levels data can cause a number of problems. Model specification tests lose power due to an increase in the number of estimated parameters. Other tests, such as those for Granger causality, no longer have standard distributions, and become invalid. Finally, forecasts over long time horizons suffer from inconsistent estimates, due to impulse responses that do not decay. Enders [32] discusses modeling strategies.

In the presence of cointegration, simple differencing is a model misspecification, since long-term information appears in the levels. Fortunately, the cointegrated VAR model provides intermediate options, between differences and levels, by mixing them together with the cointegrating relations. Since all terms of the cointegrated VAR model are stationary, problems with unit roots are eliminated.

Cointegration modeling is often suggested, independently, by economic theory. Examples of variables that are commonly described with a cointegrated VAR model include:

- Money stock, interest rates, income, and prices (common models of money demand)
- Investment, income, and consumption (common models of productivity)
- Consumption and long-term income expectation (Permanent Income Hypothesis)
- Exchange rates and prices in foreign and domestic markets (Purchasing Power Parity)
- Spot and forward currency exchange rates and interest rates (Covered Interest Rate Parity)
- Interest rates of different maturities (Term Structure Expectations Hypothesis)
- Interest rates and inflation (Fisher Equation)

Since these theories describe long-term equilibria among the variables, accurate estimation of cointegrated models may require large amounts of low-frequency (annual, quarterly, monthly) macroeconomic data. As a result, these models must consider the possibility of structural changes in the underlying data-generating process during the sample period.

Financial data, by contrast, is often available at high frequencies (hours, minutes, microseconds). The mean-reverting spreads of cointegrated financial series can be modeled and examined for arbitrage opportunities. For example, the Law of One Price suggests cointegration among the following groups of variables:

- Prices of assets with identical cash flows
- Prices of assets and dividends
- Spot, future, and forward prices
- Bid and ask prices

Modern approaches to cointegration testing originated with Engle
and Granger [34].
Their method is simple to describe: regress the first component *y*_{1t} of *y*_{t} on
the remaining components of *y*_{t} and
test the residuals for a unit root. The null hypothesis is that the
series in *y*_{t} are *not* cointegrated,
so if the residual test fails to find evidence against the null of
a unit root, the Engle-Granger test fails to find evidence that the
estimated regression relation is cointegrating. Note that you can
write the regression equation as $${y}_{1t}-{b}_{1}{y}_{2t}-\mathrm{...}-{b}_{d}{y}_{dt}-{c}_{0}={\beta}^{\prime}{y}_{t}-{c}_{0}={\epsilon}_{t}$$,
where $$\beta ={[\begin{array}{cc}1& -{b}^{\prime}\end{array}]}^{\prime}$$ is the cointegrating vector
and *c*_{0} is the intercept.
A complication of the Engle-Granger approach is that the residual
series is estimated rather than observed, so the standard asymptotic
distributions of conventional unit root statistics do not apply. Augmented
Dickey-Fuller tests (`adftest`) and Phillips-Perron tests (`pptest`) cannot be used directly. For accurate testing, distributions of the
test statistics must be computed specifically for the Engle-Granger
test.
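The two Engle-Granger steps can be sketched outside the Toolbox. The following NumPy example (illustrative only, with simulated data) performs the first-step regression and forms the residual series; a real application would then apply a unit root test with Engle-Granger critical values:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 2000

# Simulated bivariate cointegrated system: y1 = 2*y2 + 3 + stationary noise
y2 = np.cumsum(rng.normal(size=T))        # I(1) regressor
y1 = 2.0 * y2 + 3.0 + rng.normal(size=T)  # cointegrated with y2

# Step 1: OLS regression of y1 on y2 and an intercept c0
X = np.column_stack([y2, np.ones(T)])
b, c0 = np.linalg.lstsq(X, y1, rcond=None)[0]

# Residual series beta'y - c0, with cointegrating vector beta = [1; -b]
resid = y1 - b * y2 - c0

# Step 2 would test resid for a unit root; here we simply observe that
# the residuals are far less variable than the trending levels
print(np.var(resid) < 0.05 * np.var(y1))
```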

The Engle-Granger test is implemented in Econometrics Toolbox by
the function `egcitest`. To demonstrate
its use, load MacKinnon's data [70] on
the term structure of Canadian interest rates:

```
load Data_Canada
Y = Data(:,3:end); % Interest rate data
figure
plot(dates,Y,'LineWidth',2)
xlabel 'Year';
ylabel 'Percent';
names = series(3:end);
legend(names,'location','NW')
title '{\bf Canadian Interest Rates, 1954-1994}';
axis tight
grid on
```

The plot shows evidence of cointegration among the three series,
which move together with a mean-reverting spread. To test for cointegration,
we compute both the *τ* (`t1`) and *z* (`t2`) Dickey-Fuller statistics,
which `egcitest` compares to tabulated Engle-Granger critical values:

```
[h,pValue,stat,cValue] = egcitest(Y,'test',{'t1','t2'})
```

```
h =
     0     1

pValue =
    0.0526    0.0202

stat =
   -3.9321  -25.4538

cValue =
   -3.9563  -22.1153
```

The *τ* test fails to reject the null
of no cointegration, but just barely, with a *p*-value
only slightly above the default 5% significance level, and a statistic
only slightly above the left-tail critical value. The *z* test
does reject the null of no cointegration.

The test regresses `y1 = Y(:,1)` on `Y2 = Y(:,2:end)` and (by default) an intercept `c0`.
The residual series is `[y1 Y2]*beta - c0` = `y1 - Y2*b - c0`. Regression
coefficients `c0` and `b` are returned
in a fifth output argument (together with other regression statistics).
You can use the regression coefficients to examine the hypothesized
cointegrating vector `beta = [1; -b]`:

```
[~,~,~,~,reg] = egcitest(Y,'test','t2');
c0 = reg.coeff(1);
b = reg.coeff(2:3);
beta = [1;-b];

h = gca;
COrd = h.ColorOrder;
h.NextPlot = 'ReplaceChildren';
h.ColorOrder = circshift(COrd,3);
plot(dates,Y*beta-c0,'LineWidth',2);
title '{\bf Cointegrating Relation}';
axis tight;
legend off;
grid on;
```

The combination appears relatively stationary, as the test confirms.

Once a cointegrating relation has been determined, remaining VEC model coefficients can be estimated by ordinary least-squares. Suppose, for example, that a model selection procedure has indicated the adequacy of *q* = 2 lags in a VEC(*q*) model, and we wish to estimate:

$$\Delta {y}_{t}=\alpha ({\beta}^{\prime}{y}_{t-1}-{c}_{0})+{B}_{1}\Delta {y}_{t-1}+{B}_{2}\Delta {y}_{t-2}+{c}_{1}+{\epsilon}_{t}.$$

Since `c0` and `beta = [1; -b]` have already been determined, we conditionally estimate `alpha`, `B1`, `B2`, and `c1` by first forming the required lagged differences and then performing the regression:

```
load Data_Canada
Y = Data(:,3:end); % Interest rate data
[~,~,~,~,reg] = egcitest(Y,'test','t2');
c0 = reg.coeff(1);
b = reg.coeff(2:3);
beta = [1;-b];

q = 2;
[numObs,numDims] = size(Y);
tBase = (q+2):numObs; % Commensurate time base, all lags
T = length(tBase);    % Effective sample size
YLags = lagmatrix(Y,0:(q+1)); % Y(t-k) on observed time base
LY = YLags(tBase,(numDims+1):2*numDims); % Y(t-1) on commensurate time base

% Form multidimensional differences so that the kth numDims-wide block
% of columns in DeltaYLags contains (1-L)Y(t-k+1):
DeltaYLags = zeros(T,(q+1)*numDims);
for k = 1:(q+1)
    DeltaYLags(:,((k-1)*numDims+1):k*numDims) = ...
        YLags(tBase,((k-1)*numDims+1):k*numDims) ...
      - YLags(tBase,(k*numDims+1):(k+1)*numDims);
end
DY = DeltaYLags(:,1:numDims);        % (1-L)Y(t)
DLY = DeltaYLags(:,(numDims+1):end); % [(1-L)Y(t-1),...,(1-L)Y(t-q)]

% Perform the regression:
X = [(LY*beta-c0),DLY,ones(T,1)];
P = (X\DY)'; % [alpha,B1,...,Bq,c1]
alpha = P(:,1);
B1 = P(:,2:4);
B2 = P(:,5:7);
c1 = P(:,end);

% Display model coefficients
alpha,b,c0,B1,B2,c1
```

```
alpha =
   -0.6336
    0.0595
    0.0269

b =
    2.2209
   -1.0718

c0 =
   -1.2393

B1 =
    0.1649   -0.1465   -0.0416
   -0.0024    0.3816   -0.3716
    0.0815    0.1790   -0.1528

B2 =
   -0.3205    0.9506   -0.9514
   -0.1996    0.5169   -0.5211
   -0.1751    0.6061   -0.5419

c1 =
    0.1516
    0.1508
    0.1503
```

We also estimate the residual covariance matrix for purposes of simulation and forecasting:

```
res = DY-X*P';
EstCov = cov(res);
```

Once model coefficients have been estimated, the underlying data-generating process can be simulated. For example, the following code generates a single Monte Carlo forecast path for a horizon of 10 years beyond the data:

```
load Data_Canada
Y = Data(:,3:end); % Interest rate data
[~,~,~,~,reg] = egcitest(Y,'test','t2');
c0 = reg.coeff(1);
b = reg.coeff(2:3);
beta = [1; -b];

q = 2;
[numObs,numDims] = size(Y);
tBase = (q+2):numObs; % Commensurate time base, all lags
T = length(tBase);    % Effective sample size
DeltaYLags = zeros(T,(q+1)*numDims);
YLags = lagmatrix(Y,0:(q+1)); % Y(t-k) on observed time base
LY = YLags(tBase,(numDims+1):2*numDims);
for k = 1:(q+1)
    DeltaYLags(:,((k-1)*numDims+1):k*numDims) = ...
        YLags(tBase,((k-1)*numDims+1):k*numDims) ...
      - YLags(tBase,(k*numDims+1):(k+1)*numDims);
end
DY = DeltaYLags(:,1:numDims);        % (1-L)Y(t)
DLY = DeltaYLags(:,(numDims+1):end); % [(1-L)Y(t-1),...,(1-L)Y(t-q)]
X = [(LY*beta-c0),DLY,ones(T,1)];
P = (X\DY)'; % [alpha,B1,...,Bq,c1]
alpha = P(:,1);
B1 = P(:,2:4);
B2 = P(:,5:7);
c1 = P(:,end);
res = DY-X*P';
EstCov = cov(res);

numSteps = 10;
% Preallocate (q+1 presample rows plus numSteps forecast rows):
YSim = zeros(numSteps+q+1,numDims);
eps = zeros(numSteps+q+1,numDims);
% Specify q+1 presample values:
YSim(1,:) = Y(end-2,:);
YSim(2,:) = Y(end-1,:);
YSim(3,:) = Y(end,:);
% Simulate numSteps postsample values:
rng('default'); % For reproducibility
for t = 4:numSteps+3
    eps(t,:) = mvnrnd([0 0 0],EstCov,1); % Normal innovations
    YSim(t,:) = YSim(t-1,:) ...
              + (YSim(t-1,:)*beta - c0)*alpha' ... % Error correction
              + (YSim(t-1,:)-YSim(t-2,:))*B1' ...
              + (YSim(t-2,:)-YSim(t-3,:))*B2' ...
              + c1' ...
              + eps(t,:);
end

% Plot sample and forecast path:
plot(dates,Y,'LineWidth',2)
xlabel('Year')
ylabel('Percent')
title('{\bf Forecast Path}')
hold on
D = dates(end);
plot(D:(D+numSteps),YSim(3:end,:),'-.','LineWidth',2)
Ym = min([Y(:);YSim(:)]);
YM = max([Y(:);YSim(:)]);
fill([D D D+numSteps D+numSteps],[Ym YM YM Ym],'b','FaceAlpha',0.1)
axis tight
grid on
hold off
```

As described in VAR Model Forecasting, Simulation, and Analysis, the mean and standard deviation of multiple
realizations of the forecast path can be used to generate mean forecasts
with confidence intervals. Alternatively, the VEC model can be converted
to a VAR representation using `vectovar`, and `vgxpred` and `vgxsim` can
be used to generate forecasts.

The Engle-Granger method has several limitations. First of all, it identifies only a single cointegrating relation, among what might be many such relations. This requires one of the variables, *y*_{1t}, to be identified as "first" among the variables in *y*_{t}. This choice, which is usually arbitrary, affects both test results and model estimation. To see this, permute the three interest rates in the Canadian data and estimate the cointegrating relation for each choice of a "first" variable:

```
load Data_Canada
Y = Data(:,3:end); % Interest rate data

P0 = perms([1 2 3]);
[~,idx] = unique(P0(:,1)); % Rows of P0 with unique regressand y1
P = P0(idx,:);             % Unique regressions
numPerms = size(P,1);

% Preallocate:
T0 = size(Y,1);
H = zeros(1,numPerms);
PVal = zeros(1,numPerms);
CIR = zeros(T0,numPerms);

% Run all tests:
for i = 1:numPerms
    YPerm = Y(:,P(i,:));
    [h,pValue,~,~,reg] = egcitest(YPerm,'test','t2');
    H(i) = h;
    PVal(i) = pValue;
    c0i = reg.coeff(1);
    bi = reg.coeff(2:3);
    betai = [1;-bi] % No terminating semicolon, to display each vector
    CIR(:,i) = YPerm*betai-c0i;
end

% Display the test results:
H,PVal
```

```
betai =
    1.0000
   -2.2209
    1.0718

betai =
    1.0000
   -0.6029
   -0.3472

betai =
    1.0000
   -1.4394
    0.4001

H =
     1     1     0

PVal =
    0.0202    0.0290    0.0625
```

For this data, two regressands identify cointegration while the third regressand fails to do so. Asymptotic theory indicates that the test results will be identical in large samples, but the finite-sample properties of the test make it cumbersome to draw reliable inferences.

A plot of the identified cointegrating relations shows the previous estimate (Cointegrating relation 1), plus two others. There is no guarantee, in the context of Engle-Granger estimation, that the relations are independent. Plot the cointegrating relations:

```
h = gca;
COrd = h.ColorOrder;
h.NextPlot = 'ReplaceChildren';
h.ColorOrder = circshift(COrd,3);
plot(dates,CIR,'LineWidth',2)
title('{\bf Multiple Cointegrating Relations}')
legend(strcat({'Cointegrating relation '}, ...
    num2str((1:numPerms)')),'location','NW');
axis tight
grid on
```

Another limitation of the Engle-Granger method is that it is a two-step procedure, with one regression to estimate the residual series, and another regression to test for a unit root. Errors in the first estimation are necessarily carried into the second estimation. The estimated, rather than observed, residual series requires entirely new tables of critical values for standard unit root tests.

Finally, the Engle-Granger method estimates cointegrating relations independently of the VEC model in which they play a role. As a result, model estimation also becomes a two-step procedure. In particular, deterministic terms in the VEC model must be estimated conditionally, based on a predetermined estimate of the cointegrating vector.

The Johansen test for cointegration addresses many of the limitations
of the Engle-Granger method. It avoids two-step estimators and provides
comprehensive testing in the presence of multiple cointegrating relations.
Its maximum likelihood approach incorporates the testing procedure
into the process of model estimation, avoiding conditional estimates.
Moreover, the test provides a framework for testing restrictions on
the cointegrating relations *B* and the adjustment
speeds *A* in the VEC model.

At the core of the Johansen method is the relationship between
the rank of the impact matrix *C* = *AB*′
and the size of its eigenvalues. The eigenvalues depend on the form
of the VEC model, and in particular on the composition of its deterministic
terms (see The Role of Deterministic Terms). The method infers the cointegration
rank by testing the number of eigenvalues that are statistically different
from 0, then conducts model estimation under the rank constraints.
Although the method appears to be very different from the Engle-Granger
method, it is essentially a multivariate generalization of the augmented
Dickey-Fuller test for unit roots. See, e.g., [32].
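The rank logic at the heart of the method can be illustrated with plain linear algebra. In this hypothetical NumPy sketch, a reduced-rank impact matrix *C* = *AB*′ is built from *n*-by-*r* factors, and counting its nonzero singular values recovers the number of cointegrating relations (the Johansen test itself counts statistically significant eigenvalues of related moment matrices, but the rank reasoning is the same):

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 3, 1

# Reduced-rank impact matrix C = A*B' built from n-by-r factors
A = rng.normal(size=(n, r))
B = rng.normal(size=(n, r))
C = A @ B.T

# The cointegration rank equals the number of nonzero singular values
sv = np.linalg.svd(C, compute_uv=False)
rank = int(np.sum(sv > 1e-10 * sv[0]))
print(rank)  # 1
```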

The Johansen test is implemented in Econometrics Toolbox by
the function `jcitest`. To demonstrate
its use, we return to the data on the term structure of Canadian interest
rates. The function's calling syntax, and the structure of
its output arguments, are best illustrated by running multiple tests
in a single function call. Here, for example, we test for the cointegration
rank using the default H1 model with two different lag structures:

```
load Data_Canada
Y = Data(:,3:end); % Interest rate data
[h,pValue,stat,cValue] = jcitest(Y,'model','H1','lags',1:2);
```

```
************************
Results Summary (Test 1)

Data: Y
Effective sample size: 39
Model: H1
Lags: 1
Statistic: trace
Significance level: 0.05

r  h  stat      cValue    pValue   eigVal
----------------------------------------
0  1  35.3442   29.7976   0.0104   0.3979
1  1  15.5568   15.4948   0.0490   0.2757
2  0   2.9796    3.8415   0.0843   0.0736

************************
Results Summary (Test 2)

Data: Y
Effective sample size: 38
Model: H1
Lags: 2
Statistic: trace
Significance level: 0.05

r  h  stat      cValue    pValue   eigVal
----------------------------------------
0  0  25.8188   29.7976   0.1346   0.2839
1  0  13.1267   15.4948   0.1109   0.2377
2  0   2.8108    3.8415   0.0937   0.0713
```

The default "trace" test assesses null hypotheses *H*(*r*)
of cointegration rank less than or equal to *r* against
the alternative *H*(*n*), where *n* is
the dimension of the data. The summaries show that the first test
rejects a cointegration rank of 0 (no cointegration) and just barely
rejects a cointegration rank of 1, but fails to reject a cointegration
rank of 2. The inference is that the data exhibit 1 or 2 cointegrating
relationships. With an additional lag in the model, the second test
fails to reject any of the cointegration ranks, providing little by
way of inference. This example illustrates the importance of determining
a reasonable lag length for the VEC model (as well as the general
form of the model) before testing for cointegration.

Because the Johansen method, by its nature, tests multiple rank
specifications for each specification of the remaining model parameters,
results from `jcitest` are returned in the form
of tabular arrays, indexed by null rank and test number. For example,
the output `h` has the form:

```
h
```

```
h =

            r0       r1       r2
           _____    _____    _____
    t1     true     true     false
    t2     false    false    false
```

Column headers indicate tests `r0`, `r1`,
and `r2`, respectively, of *H*(0), *H*(1),
and *H*(2) against *H*(3). Row headers `t1` and `t2` indicate
the two separate tests (two separate lag structures) specified by
the input parameters. To access, for example, the result for the second
test at null rank *r* = 0,
use tabular array indexing:

```
h20 = h.r0(2)
```

```
h20 =

     0
```

In addition to testing for multiple cointegrating relations, `jcitest` produces maximum likelihood estimates of VEC model coefficients under the rank restrictions on *B*. Estimation information is returned in an optional fifth output argument, and can be displayed by setting an optional input parameter. For example, the following estimates a VEC(2) model of the data, and displays the results under each of the rank restrictions *r* = 0, *r* = 1, and *r* = 2:

```
load Data_Canada
Y = Data(:,3:end); % Interest rate data
[~,~,~,~,mles] = jcitest(Y,'model','H1','lags',2,...
    'display','params');
```

```
****************************
Parameter Estimates (Test 1)

r = 0
------
B1 =
   -0.1848    0.5704   -0.3273
    0.0305    0.3143   -0.3448
    0.0964    0.1485   -0.1406
B2 =
   -0.6046    1.6615   -1.3922
   -0.1729    0.4501   -0.4796
   -0.1631    0.5759   -0.5231
c1 =
    0.1420
    0.1517
    0.1508

r = 1
------
A =
   -0.6259
   -0.2261
   -0.0222
B =
    0.7081
    1.6282
   -2.4581
B1 =
    0.0579    1.0824   -0.8718
    0.1182    0.4993   -0.5415
    0.1050    0.1667   -0.1600
B2 =
   -0.5462    2.2436   -1.7723
   -0.1518    0.6605   -0.6169
   -0.1610    0.5966   -0.5366
c0 =
    2.2351
c1 =
   -0.0366
    0.0872
    0.1444

r = 2
------
A =
   -0.6259    0.1379
   -0.2261   -0.0480
   -0.0222    0.0137
B =
    0.7081   -2.4407
    1.6282    6.2883
   -2.4581   -3.5321
B1 =
    0.2438    0.6395   -0.6729
    0.0535    0.6533   -0.6107
    0.1234    0.1228   -0.1403
B2 =
   -0.3857    1.7970   -1.4915
   -0.2076    0.8158   -0.7146
   -0.1451    0.5524   -0.5089
c0 =
    2.0901
   -3.0289
c1 =
   -0.0104
    0.0137
    0.1528
```

`mles` is a tabular array of structure arrays, with each structure containing information for a particular test under a particular rank restriction. Since both tabular arrays and structure arrays use similar indexing, you can access the tabular array and then the structure using dot notation. For example, to access the rank 2 matrix of cointegrating relations:

```
B = mles.r2.paramVals.B
```

```
B =

    0.7081   -2.4407
    1.6282    6.2883
   -2.4581   -3.5321
```

Comparing inferences and estimates from the Johansen and Engle-Granger approaches can be challenging, for a variety of reasons. First of all, the two methods are essentially different, and may disagree on inferences from the same data. The Engle-Granger two-step method for estimating the VEC model, first estimating the cointegrating relation and then estimating the remaining model coefficients, differs from Johansen's maximum likelihood approach. Secondly, the cointegrating relations estimated by the Engle-Granger approach may not correspond to the cointegrating relations estimated by the Johansen approach, especially in the presence of multiple cointegrating relations. It is important, in this context, to remember that cointegrating relations are not uniquely defined, but depend on the decomposition of the impact matrix.

Nevertheless, the two approaches should provide generally comparable results, if both begin with the same data and seek out the same underlying relationships. Properly normalized, cointegrating relations discovered by either method should reflect the mechanics of the data-generating process, and VEC models built from the relations should have comparable forecasting abilities.

As the following shows in the case of the Canadian interest rate data, Johansen's H1* model, which is the closest to the default settings of `egcitest`, discovers the same cointegrating relation as the Engle-Granger test, assuming a cointegration rank of 2:

```
load Data_Canada
Y = Data(:,3:end); % Interest rate data
[~,~,~,~,reg] = egcitest(Y,'test','t2');
c0 = reg.coeff(1);
b = reg.coeff(2:3);
beta = [1; -b];

[~,~,~,~,mles] = jcitest(Y,'model','H1*');
BJ2 = mles.r2.paramVals.B;
c0J2 = mles.r2.paramVals.c0;

% Normalize the 2nd cointegrating relation with respect to
% the 1st variable, to make it comparable to Engle-Granger:
BJ2n = BJ2(:,2)/BJ2(1,2);
c0J2n = c0J2(2)/BJ2(1,2);

% Plot the normalized Johansen cointegrating relation together
% with the original Engle-Granger cointegrating relation:
h = gca;
COrd = h.ColorOrder;
plot(dates,Y*beta-c0,'LineWidth',2,'Color',COrd(4,:))
hold on
plot(dates,Y*BJ2n+c0J2n,'--','LineWidth',2,'Color',COrd(5,:))
legend('Engle-Granger OLS','Johansen MLE','Location','NW')
title('{\bf Cointegrating Relation}')
axis tight
grid on
hold off
```

```
************************
Results Summary (Test 1)

Data: Y
Effective sample size: 40
Model: H1*
Lags: 0
Statistic: trace
Significance level: 0.05

r  h  stat      cValue    pValue   eigVal
----------------------------------------
0  1  38.8360   35.1929   0.0194   0.4159
1  0  17.3256   20.2619   0.1211   0.2881
2  0   3.7325    9.1644   0.5229   0.0891
```

A separate Econometrics Toolbox function, `jcontest`, uses the Johansen framework to
test linear constraints on cointegrating relations *B* and
adjustment speeds *A*, and estimates VEC model parameters
under the additional constraints. Constraint testing allows you to
assess the validity of relationships suggested by economic theory.

Constraints imposed by `jcontest` take one
of two forms. Constraints of the form *R*′*A* = 0 or *R*′*B* = 0 specify particular combinations of
the variables to be held fixed during testing and estimation. These
constraints are equivalent to parameterizations *A* = *Hφ* or *B* = *Hφ*,
where *H* is the orthogonal complement of *R* (in MATLAB^{®}, `null(R')`)
and *φ* is a vector of free parameters. The
second constraint type specifies particular vectors in the column
space of *A* or *B*. The number
of constraints that `jcontest` can impose is restricted
by the rank of the matrix being tested, which can be inferred by first
running `jcitest`.

Tests on *B* answer questions about the space of cointegrating relations. The column vectors in *B*, estimated by `jcitest`, do not uniquely define the cointegrating relations. Rather, they estimate a space of cointegrating relations, given by the span of the vectors. Tests on *B* allow you to determine if other potentially interesting relations lie in that space. When constructing constraints, interpret the rows and columns of the *n*-by-*r* matrix *B* as follows:

- Row *i* of *B* contains the coefficients of variable *y*_{it} in each of the *r* cointegrating relations.
- Column *j* of *B* contains the coefficients of each of the *n* variables in cointegrating relation *j*.

One application of `jcontest` is to pretest variables for their order of integration. At the start of any cointegration analysis, trending variables are typically tested for the presence of a unit root. These pretests can be carried out with combinations of standard unit root and stationarity tests such as `adftest`, `pptest`, `kpsstest`, or `lmctest`. Alternatively, `jcontest` lets you carry out stationarity testing within the Johansen framework. To do so, specify a cointegrating vector that is 1 at the variable of interest and 0 elsewhere, and then test to see if that vector is in the space of cointegrating relations. The following tests all of the variables in `Y` in a single call:

```
load Data_Canada
Y = Data(:,3:end); % Interest rate data
[h0,pValue0] = jcontest(Y,1,'BVec',{[1 0 0]',[0 1 0]',[0 0 1]'})
```

```
h0 =
     1     1     1

pValue0 =
   1.0e-03 *
    0.3368    0.1758    0.1310
```

The second input argument specifies a cointegration rank of 1, and the third and fourth input arguments are a parameter/value pair specifying tests of specific vectors in the space of cointegrating relations. The results strongly reject the null of stationarity for each of the variables, returning very small *p*-values.

Another common test of the space of cointegrating vectors is to see if certain combinations of variables suggested by economic theory are stationary. For example, it may be of interest to see if interest rates are cointegrated with various measures of inflation (and, via the Fisher equation, if real interest rates are stationary). In addition to the interest rates already examined, `Data_Canada.mat` contains two measures of inflation, based on the CPI and the GDP deflator, respectively. To demonstrate the test procedure (without any presumption of having identified an adequate model), we first run `jcitest` to determine the rank of *B*, then test the stationarity of a simple spread between the CPI inflation rate and the short-term interest rate:

```
load Data_Canada
Y = Data(:,3:end); % Interest rate data
y1 = Data(:,1);    % CPI-based inflation rate
YI = [y1,Y];

% Test if inflation is cointegrated with interest rates:
[h,pValue] = jcitest(YI);

% Test if y1 - y2 is stationary:
[hB,pValueB] = jcontest(YI,1,'BCon',[1 -1 0 0]')
```

```
************************
Results Summary (Test 1)

Data: YI
Effective sample size: 40
Model: H1
Lags: 0
Statistic: trace
Significance level: 0.05

r  h  stat      cValue    pValue   eigVal
----------------------------------------
0  1  58.0038   47.8564   0.0045   0.5532
1  0  25.7783   29.7976   0.1359   0.3218
2  0  10.2434   15.4948   0.2932   0.1375
3  1   4.3263    3.8415   0.0376   0.1025

hB =
     1

pValueB =
    0.0242
```

The first test provides evidence of cointegration, and fails to reject a cointegration rank *r* = 1. The second test, assuming *r* = 1, rejects the hypothesized cointegrating relation. Of course, reliable economic inferences would need to include proper model selection, with corresponding settings for the `'model'` and other default parameters.

Tests on *A* answer questions about common driving forces in the system. When constructing constraints, interpret the rows and columns of the *n*-by-*r* matrix *A* as follows:

- Row *i* of *A* contains the adjustment speeds of variable *y*_{it} to disequilibrium in each of the *r* cointegrating relations.
- Column *j* of *A* contains the adjustment speeds of each of the *n* variables to disequilibrium in cointegrating relation *j*.

For example, an all-zero row in *A* indicates a variable that is weakly exogenous with respect to the coefficients in *B*. Such a variable may affect other variables, but does not adjust to disequilibrium in the cointegrating relations. Similarly, a standard unit vector column in *A* indicates a variable that is exclusively adjusting to disequilibrium in a particular cointegrating relation.
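The algebra behind the weak-exogeneity interpretation is straightforward. In this hypothetical NumPy sketch (the coefficients are made up for illustration), an all-zero row in *A* forces the corresponding row of the impact matrix *C* = *AB*′ to zero, so the error-correction term drops out of that variable's equation:

```python
import numpy as np

# Adjustment matrix with an all-zero first row: variable 1 is weakly
# exogenous (it does not respond to disequilibrium in the relation)
A = np.array([[0.0], [0.14], [0.09], [0.29]])  # hypothetical 4-by-1 A
B = np.array([[1.0], [-1.0], [0.0], [0.0]])    # hypothetical relation

C = A @ B.T  # impact matrix

# The first row of C is zero, so the error-correction term C*y(t-1)
# contributes nothing to the first variable's equation
print(np.allclose(C[0, :], 0))  # True
```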

To demonstrate, we test for weak exogeneity of the inflation rate with respect to interest rates:

```
load Data_Canada
Y = Data(:,3:end); % Interest rate data
y1 = Data(:,1);    % CPI-based inflation rate
YI = [y1,Y];
[hA,pValueA] = jcontest(YI,1,'ACon',[1 0 0 0]')
```

```
hA =
     0

pValueA =
    0.3206
```

The test fails to reject the null hypothesis. Again, the test is conducted with default settings. Proper economic inference would require a more careful analysis of model and rank specifications.

Constrained parameter estimates are accessed via a fifth output argument from `jcontest`. For example, the constrained, rank 1 estimate of *A* is obtained by referencing the fifth output with dot (`.`) indexing:

```
[~,~,~,~,mles] = jcontest(YI,1,'ACon',[1 0 0 0]');
Acon = mles.paramVals.A
```

```
Acon =
         0
    0.1423
    0.0865
    0.2862
```

The first row of *A* is 0, as specified by the constraint.
