
Regression Models with Time Series Errors

What Are Regression Models with Time Series Errors?

Regression models with time series errors attempt to explain the mean behavior of a response series (yt, t = 1,...,T) by accounting for linear effects of predictors (Xt) using multiple linear regression (MLR). However, the errors (ut), called unconditional disturbances, form a time series rather than white noise, which is a departure from the standard linear model assumptions. Unlike an ARIMA model that includes exogenous predictors (an ARIMAX model), a regression model with time series errors preserves the sensitivity interpretation of the regression coefficients (β) [2].

These models are particularly useful for econometric data. Use these models to:

  • Analyze the effects of a new policy on a market indicator (an intervention model).

  • Forecast population size adjusting for predictor effects, such as expected prevalence of a disease.

  • Study the behavior of a process adjusting for calendar effects. For example, you can analyze traffic volume by adjusting for the effects of major holidays. For details, see [3].

  • Estimate the trend by including time (t) in the model.

  • Forecast total energy consumption accounting for current and past prices of oil and electricity (distributed lag model).

Use these tools in Econometrics Toolbox™ to:

  • Specify a regression model with ARIMA errors (see regARIMA).

  • Estimate parameters using a specified model, and response and predictor data (see estimate).

  • Simulate responses using a model and predictor data (see simulate).

  • Forecast responses using a model and future predictor data (see forecast).

  • Infer residuals and estimated unconditional disturbances from a model using the model and predictor data (see infer).

  • Filter innovations through a model using the model and predictor data (see filter).

  • Generate impulse responses (see impulse).

  • Compare a regression model with ARIMA errors to an ARIMAX model (see arima).

Conventions

A regression model with time series errors has the following form (in lag operator notation):

yt = c + Xtβ + ut
a(L)A(L)(1 − L)^D(1 − L^s)ut = b(L)B(L)εt,   (1)
where

  • t = 1,...,T.

  • yt is the response series.

  • Xt is row t of X, which is the matrix of concatenated predictor data vectors. That is, Xt is observation t of each predictor series.

  • c is the regression model intercept.

  • β is the regression coefficient.

  • ut is the disturbance series.

  • εt is the innovations series.

  • L^j yt = yt−j (L is the lag operator).

  • a(L) = (1 − a1L − ... − apL^p), which is the degree p, nonseasonal autoregressive polynomial.

  • A(L) = (1 − A1L − ... − ApsL^ps), which is the degree ps, seasonal autoregressive polynomial.

  • (1 − L)^D, which is the degree D, nonseasonal integration polynomial.

  • (1 − L^s), which is the degree s, seasonal integration polynomial.

  • b(L) = (1 + b1L + ... + bqL^q), which is the degree q, nonseasonal moving average polynomial.

  • B(L) = (1 + B1L + ... + BqsL^qs), which is the degree qs, seasonal moving average polynomial.
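The model form above can be made concrete with a small simulation. The sketch below (plain Python, not the Econometrics Toolbox API; all variable names are local to this example) generates a response from one predictor with a stationary AR(1) unconditional disturbance, the simplest case of Equation 1:

```python
import random

# Illustrative sketch: simulate y_t = c + X_t*beta + u_t, where the
# unconditional disturbance follows a stationary AR(1),
# u_t = a1*u_{t-1} + eps_t, with Gaussian innovations eps_t.
random.seed(1)

T = 200
c, beta, a1 = 1.5, 2.0, 0.6   # intercept, regression coefficient, AR coefficient

x = [random.gauss(0.0, 1.0) for _ in range(T)]    # one predictor series
eps = [random.gauss(0.0, 0.5) for _ in range(T)]  # innovations series

u = [0.0] * T   # unconditional disturbances (no constant term, mean zero)
y = [0.0] * T
for t in range(T):
    u[t] = (a1 * u[t - 1] if t > 0 else 0.0) + eps[t]
    y[t] = c + beta * x[t] + u[t]

# u_t is autocorrelated, so although OLS on (x, y) would still estimate
# beta consistently, its standard errors would be wrong.
print(len(y), sum(u) / T)
```

Because β multiplies Xt directly, a unit change in the predictor still shifts the expected response by β, which is the sensitivity interpretation the text refers to.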

Following Box and Jenkins methodology, ut is a stationary or unit root nonstationary, regular, linear time series. However, if ut is unit root nonstationary, then you do not have to explicitly difference the series as they recommend in [1]. You can simply specify the seasonal and nonseasonal integration degree using the software. For details, see Create Regression Models with ARIMA Errors.

Another deviation from the Box and Jenkins methodology is that ut does not have a constant term (conditional mean), and therefore its unconditional mean is 0. However, the regression model contains an intercept term, c.

Note

If the unconditional disturbance process is nonstationary (i.e., the nonseasonal or seasonal integration degree is greater than 0), then the regression intercept, c, is not identifiable. For details, see Intercept Identifiability in Regression Models with ARIMA Errors.
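To see why, consider the simplest nonstationary case (D = 1, no seasonal terms, shown here only for illustration). Applying the differencing operator to both sides of the regression equation annihilates the constant:

```latex
y_t = c + X_t\beta + u_t
\quad\Rightarrow\quad
(1-L)y_t = (1-L)X_t\beta + (1-L)u_t ,
```

since (1 − L)c = c − c = 0. The differenced data, which is what the stationary ARMA representation describes, therefore carry no information about c.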

The software enforces stability and invertibility of the ARMA process. That is,

ψ(L) = b(L)B(L) / (a(L)A(L)) = 1 + ψ1L + ψ2L^2 + ...,

where the series {ψt} must be absolutely summable. The conditions for {ψt} to be absolutely summable are:

  • a(L) and A(L) are stable (i.e., all roots of a(L) = 0 and A(L) = 0 lie outside the unit circle).

  • b(L) and B(L) are invertible (i.e., all roots of b(L) = 0 and B(L) = 0 lie outside the unit circle).
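The ψ weights can be computed by the standard ARMA expansion recursion. The sketch below (plain Python, nonseasonal case only; seasonal polynomials would multiply in the same way) expands ψ(L) = b(L)/a(L) and shows the geometric decay that stability guarantees:

```python
# Expand psi(L) = b(L)/a(L) for a nonseasonal ARMA(p, q) using the
# recursion psi_0 = 1, psi_j = b_j + sum_{i=1}^{min(j,p)} a_i*psi_{j-i},
# where b_j = 0 for j > q. Stability of a(L) makes the weights decay
# geometrically, so the sequence {psi_j} is absolutely summable.

def psi_weights(a, b, n):
    """a = [a1, ..., ap], b = [b1, ..., bq]; returns [psi_0, ..., psi_{n-1}]."""
    psi = [1.0]
    for j in range(1, n):
        val = b[j - 1] if j <= len(b) else 0.0
        for i, ai in enumerate(a, start=1):
            if j - i >= 0:
                val += ai * psi[j - i]
        psi.append(val)
    return psi

# ARMA(1,1) with a1 = 0.5, b1 = 0.4: psi_j = (a1 + b1)*a1**(j-1) for j >= 1,
# a geometric sequence, so sum |psi_j| is finite.
w = psi_weights([0.5], [0.4], 6)
print(w)
```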

The software uses maximum likelihood for parameter estimation. You can choose either a Gaussian or Student’s t distribution for the innovations, εt.

The software treats predictors as nonstochastic variables for estimation and inference.
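While the Toolbox fits these models by maximum likelihood, the simpler Cochrane–Orcutt feasible-GLS idea illustrates why modeling the error structure matters. The sketch below (plain Python, one predictor, AR(1) errors; all data are simulated and every name is local to this example) estimates β by OLS, estimates the AR coefficient from the residuals, then quasi-differences and re-estimates:

```python
import random

# Cochrane-Orcutt-style sketch for y_t = c + x_t*beta + u_t with
# AR(1) disturbances u_t = a1*u_{t-1} + eps_t. Not the Toolbox's ML
# estimator -- just an illustration of feasible GLS.
random.seed(7)
T, c, beta, a1 = 500, 1.5, 2.0, 0.6

x, y, u_prev = [], [], 0.0
for _ in range(T):
    xt = random.gauss(0.0, 1.0)
    ut = a1 * u_prev + random.gauss(0.0, 0.5)
    x.append(xt)
    y.append(c + beta * xt + ut)
    u_prev = ut

def ols(xs, ys):
    """Simple regression: returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

# Step 1: OLS ignoring the error structure (consistent but inefficient,
# and its standard errors are invalid).
c0, b0 = ols(x, y)

# Step 2: estimate the AR(1) coefficient by regressing residuals on
# their own first lag.
r = [yi - c0 - b0 * xi for xi, yi in zip(x, y)]
_, rho = ols(r[:-1], r[1:])

# Step 3: quasi-difference both series and re-estimate. The intercept
# of the transformed regression equals c*(1 - rho).
ystar = [y[t] - rho * y[t - 1] for t in range(1, T)]
xstar = [x[t] - rho * x[t - 1] for t in range(1, T)]
cstar, beta_hat = ols(xstar, ystar)
c_hat = cstar / (1.0 - rho)

print(rho, c_hat, beta_hat)   # should be near 0.6, 1.5, 2.0
```

Maximum likelihood, as used by the software, estimates the regression and ARMA parameters jointly rather than in stages, but the quasi-differencing step above is the intuition behind treating ut as a time series instead of white noise.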

References

[1] Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

[2] Hyndman, R. J. (2010, October). “The ARIMAX Model Muddle.” Rob J. Hyndman. Retrieved May 4, 2017 from https://robjhyndman.com/hyndsight/arimax/.

[3] Tsay, R. S. “Regression Models with Time Series Errors.” Journal of the American Statistical Association. Vol. 79, Number 385, March 1984, pp. 118–124.
