Represent Univariate Dynamic Conditional Mean Models in MATLAB

This topic gives an overview on how to represent a univariate, dynamic, linear conditional mean model, such as an autoregressive, integrated, moving average (ARIMA) model, in MATLAB® by using the Econometrics Toolbox™ arima object framework. Specifically, the topic shows how to create nonseasonal and seasonal ARIMA models by using programmatic and interactive workflows; the former workflow uses the arima function and the latter workflow uses the Econometric Modeler app.

Before creating an ARIMA model for time series data, you must determine a model or set of models for your data. For details, see Programmatically Select ARIMA Model for Time Series Using Box-Jenkins Methodology.

Model Creation Overview

You create a univariate, dynamic, linear conditional mean model for a response series yt at the command line by using the arima function. The arima function creates an arima model object, which is a variable that encapsulates the functional form of the conditional mean model of interest. An arima object created this way exists independently of data; in other words, you do not need to fit a model to data to create one. In contrast, the Econometric Modeler app requires that you fit a model to data when you create one.

Regardless of how you create an arima object, it has these characteristics:

  • It stores your specifications of the model, such as the model structure and parameter values, in properties of the object.

  • It facilitates model operations. For example, at the command line, you can pass an arima object that specifies the model structure and contains unknown parameters, together with data, to the estimate function to estimate the unknown parameters.

To illustrate the contents of an arima object, create a default ARIMA model by calling arima without inputs.

Mdl = arima
Mdl = 

  arima with properties:

     Description: "ARIMA(0,0,0) Model (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
               P: 0
               D: 0
               Q: 0
        Constant: NaN
              AR: {}
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN

MATLAB returns a standard object display, with properties listed to the left and their corresponding values to the right. By default, arima sets the required structural properties, which correspond to the degrees of the lag operator polynomials in the model, to 0. It sets the sole coefficient (the model constant) and the innovations variance to NaN. The default ARIMA model has no dynamic terms (the lag operator polynomials have degree 0); it is the simple mean model yt = c + εt, where c (Constant property) and the variance of the innovations (Variance property) are present in the model but their values are unknown.

The default arima object is simplistic; you can create richer ARIMA models by using one of these two syntaxes or Econometric Modeler (see Specify Model Using Econometric Modeler App):

  • arima(p,D,q)—This shorthand syntax quickly creates a nonseasonal ARIMA(p,D,q) model, prepared for estimation, with a degree p nonseasonal AR lag polynomial, D degrees of nonseasonal integration, and a degree q nonseasonal MA lag polynomial. For example, arima(4,0,1) creates an ARMA(4,1) model containing unknown coefficients and innovations variance. For more details, see Create ARIMA(p,D,q) Model Using Shorthand Syntax.

  • arima(Property1=Value1,Property2=Value2,...)—This longhand syntax enables you to create more complex models, such as seasonal ARIMA models (SARIMA), and specify values for parameters, by using name-value argument syntax. For example, arima(ARLags=1:4,MALags=1) also creates an ARMA(4,1) model containing unknown coefficients and innovations variance. For more details, see Create ARIMA(p,D,q) Model Using Longhand Syntax and Create Seasonal ARIMA (SARIMA) Models Using Name-Value Arguments.
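As a quick check of the two syntaxes described above, the following sketch builds the ARMA(4,1) model both ways and compares the stored structure. The variable names MdlShort, MdlLong, sameDegrees, and allUnknown are illustrative only.

```matlab
% Shorthand: ARMA(4,1) with unknown coefficients and innovations variance
MdlShort = arima(4,0,1);

% Longhand: the same lag structure via name-value arguments
MdlLong = arima(ARLags=1:4,MALags=1);

% Both objects store the same polynomial degrees
sameDegrees = isequal([MdlShort.P MdlShort.D MdlShort.Q], ...
                      [MdlLong.P MdlLong.D MdlLong.Q])

% All AR coefficients are NaN placeholders in both models
allUnknown = all(isnan([MdlShort.AR{:} MdlLong.AR{:}]))
```

The shorthand form is convenient when all lags from 1 through p and 1 through q are present; the longhand form is needed as soon as you want to skip lags or fix parameter values.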

ARIMA Model Parameters and Corresponding Object Properties

Regardless of how you plan to create a model, observe that model parameters correspond to object properties. Therefore, before you represent a model in MATLAB, it is important to understand the model equations, their formats, and their parameters. This table contains the general equations of a univariate, dynamic, linear conditional mean model, in lag operator polynomial and difference-equation notation. In the equations, yt is the response series, xt is a collection of exogenous series, and εt is a random series of innovations.

Notation | Equation

Lag operator polynomial

The general equation is

$a(L) y_t = c + x_t\beta + b(L)\varepsilon_t.$

The compound autoregressive (AR) polynomial a(L) and the compound moving average (MA) polynomial b(L) are often expressed in their expanded form, as polynomial factors for nonseasonal and seasonal effects and integration:

$\phi(L)(1 - L)^D \Phi(L)(1 - L^s)^{D_s} y_t = c + x_t\beta + \theta(L)\Theta(L)\varepsilon_t.$

Refer to this equation when building a model in MATLAB.

Difference equation

$y_t = c + x_t\beta + a_1 y_{t-1} + \ldots + a_w y_{t-w} + \varepsilon_t + b_1\varepsilon_{t-1} + \ldots + b_v\varepsilon_{t-v}.$

This equation results from expanding the lag operator polynomial equation, and then solving for yt. The equation demonstrates the dynamic nature of the system more clearly than the lag operator polynomial equation.

Model parameters have these characteristics:

  • Most model parameters map exactly to object properties, for example, arima stores the autoregressive (AR) coefficient of an AR(1) model in the AR property and it stores the degree of nonseasonal integration D in the D property.

  • Parameters are non-estimable or estimable:

    • Non-estimable parameters cannot be fit to data; they typically control the structure of the model. Although the software sets default values to the corresponding properties in some cases, typically, you should set them appropriately for your problem, either explicitly or indirectly such that the software can infer the appropriate values. For example, in the default model, arima sets properties P, D, and Q to 0; the corresponding parameters are not estimable.

    • Estimable parameters, which include model coefficients and the innovations variance, can be fit to data. Parameters configured for estimation have corresponding property values set to NaN. For example, in the default model, arima sets Constant and Variance to NaN; this setting prepares the corresponding parameters, c and σ2, respectively, for estimation.
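The distinction between structural values and NaN placeholders can be inspected directly with dot notation. This is a minimal sketch; the variable names structure and needsEstimation are illustrative.

```matlab
Mdl = arima;   % default ARIMA(0,0,0) model

% Structural (non-estimable) properties hold numeric values, not NaN
structure = [Mdl.P Mdl.D Mdl.Q]

% Estimable parameters that are configured for estimation hold NaN
needsEstimation = [isnan(Mdl.Constant) isnan(Mdl.Variance)]

% Assigning a value removes the parameter from the set to be estimated
Mdl.Constant = 0;
constantStillUnknown = isnan(Mdl.Constant)
```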

This table describes the model parameters and corresponding properties.

Model Component/Parameter | Description | arima Property | Estimable?
ϕ(L) | $\phi(L) = 1 - \phi_1 L - \phi_2 L^2 - \ldots - \phi_p L^p$, a p-degree stable nonseasonal AR polynomial. | AR stores the coefficients {ϕ1,ϕ2,…,ϕp}; indices correspond to lag exponents, which are customizable by setting the ARLags name-value argument. | {ϕ1,ϕ2,…,ϕp} are estimable.
p | Nonseasonal AR polynomial degree | p does not map to a property, but it contributes to P (see notes below). You can specify p using shorthand syntax or arima infers it from other specifications (ARLags or AR). | No
D | Degree of nonseasonal integration | D | No
Φ(L) | $\Phi(L) = 1 - \Phi_{p_1} L^{p_1} - \Phi_{p_2} L^{p_2} - \ldots - \Phi_{p_s} L^{p_s}$, a ps-degree stable, multiplicative seasonal AR polynomial. | SAR stores the coefficients {Φp1,Φp2,…,Φps}; indices correspond to lag exponents, which are customizable by setting the SARLags name-value argument. | {Φp1,Φp2,…,Φps} are estimable.
ps | Seasonal AR polynomial degree | ps does not map to a property, but it contributes to P (see notes below). arima infers ps from other specifications (SARLags or SAR). | No
s | Seasonality, or the degree of the seasonal differencing polynomial | Seasonality | No
Ds | Degree of seasonal integration | No corresponding property, but Ds = 1 if Seasonality > 0, and Ds = 0 otherwise. | Not applicable
c | Model constant | Constant | Yes
β | Regression coefficient of exogenous covariates | Beta | Yes
θ(L) | $\theta(L) = 1 + \theta_1 L + \theta_2 L^2 + \ldots + \theta_q L^q$, a q-degree invertible nonseasonal MA polynomial. | MA stores the coefficients {θ1,θ2,…,θq}; indices correspond to lag exponents, which are customizable by setting the MALags name-value argument. | {θ1,θ2,…,θq} are estimable.
q | Nonseasonal MA polynomial degree | q does not map to a property, but it contributes to Q (see notes below). You can specify q using shorthand syntax or arima infers it from other specifications (MALags or MA). | No
Θ(L) | $\Theta(L) = 1 + \Theta_{q_1} L^{q_1} + \Theta_{q_2} L^{q_2} + \ldots + \Theta_{q_s} L^{q_s}$, a qs-degree invertible, multiplicative seasonal MA polynomial. | SMA stores the coefficients {Θq1,Θq2,…,Θqs}; indices correspond to lag exponents, which are customizable by setting the SMALags name-value argument. | {Θq1,Θq2,…,Θqs} are estimable.
qs | Seasonal MA polynomial degree | qs does not map to a property, but it contributes to Q (see notes below). arima infers qs from other specifications (SMALags or SMA). | No
εt | Series of random iid Gaussian or Student's t innovations, with mean 0 and variance σ2 | Distribution stores the distribution name and any distribution-specific parameters (e.g., degrees of freedom ν). Variance stores the value of σ2. | σ2 is estimable. The distribution is not estimable, but ν is estimable when the distribution is Student's t.

The model parameters p, q, ps, and qs do not map one-to-one to properties. arima combines the polynomial degree parameters and stores them in properties as follows:

  • The model property P = p + D + ps + s; it is the degree w of the compound AR polynomial a(L). P specifies the number of past (presample) observations required to initialize the model.

  • The model property Q = q + qs; it is the degree v of the compound MA polynomial b(L). Q specifies the number of past innovations or conditional variances required to initialize the model.
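To make the bookkeeping for P and Q concrete, the following sketch creates a seasonally integrated model for monthly data with nonseasonal and seasonal MA terms, and compares the stored properties with the sums of the component degrees. The scalar variables p, D, ps, s, q, and qs are illustrative helpers, not arima inputs.

```matlab
% Components: p = 0, D = 1, ps = 0, s = 12 on the AR side;
%             q = 1, qs = 12 on the MA side
Mdl = arima(Constant=0,D=1,Seasonality=12,MALags=1,SMALags=12);

p = 0; D = 1; ps = 0; s = 12;   % AR-side degrees
q = 1; qs = 12;                 % MA-side degrees

expectedP = p + D + ps + s;     % degree of the compound AR polynomial
expectedQ = q + qs;             % degree of the compound MA polynomial

% Each row pairs the stored property with the hand-computed value
[Mdl.P expectedP; Mdl.Q expectedQ]
```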

Model Creation for Calibration Versus Estimation

Programmatic workflows that use the arima function are flexible because they give you full control over the model structure and parameter values. Specifically, you can create the following types of models:

  • Fully specified model — You assign numeric values to all parameters, regardless of whether they are estimable, by using name-value argument syntax when you call arima or dot notation after you create a model. A fully specified model is useful when you want to calibrate an entire model, that is, fix all parameters at specific values. You can execute any operation on fully specified models except for estimation (for example, forecast responses from a calibrated model by passing the model to the forecast function).

  • Partially specified model — At least one estimable parameter is unknown and you plan to fit the model to data. You can think of a partially specified model as a template for estimation; NaN-valued parameters indicate to the estimate function which parameters to fit to data (estimate accepts a partially specified model and data, and then returns a fully specified model). The only operation you can perform on a partially specified model is estimation.

Interactive workflows in Econometric Modeler create only partially specified models for estimation. The app directs you through model creation by using dialogs for model types, for example, the ARIMA(1,1,1) Model Parameters dialog.

[Figure: ARIMA(1,1,1) Model Parameters dialog in Econometric Modeler]

To illustrate the differences between partially and fully specified models, at the command line, create the partially specified AR(1) model yt = 1 + ϕyt–1 + εt.

Mdl = arima(Constant=1,ARLags=1) % Partially specified model
Mdl = 

  arima with properties:

     Description: "ARIMA(1,0,0) Model (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
               P: 1
               D: 0
               Q: 0
        Constant: 1
              AR: {NaN} at lag [1]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN

Because AR contains a NaN and Variance is NaN, the corresponding parameters ϕ and σ2 are unknown and estimable.

Set the unknown parameters ϕ to 0.5 and σ2 to 0.25 by using dot notation. Specify the AR property as a cell vector of numeric scalars; each cell corresponds to an AR coefficient in the model.

Mdl.AR = {0.5};
Mdl.Variance = 0.25
Mdl = 

  arima with properties:

     Description: "ARIMA(1,0,0) Model (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
               P: 1
               D: 0
               Q: 0
        Constant: 1
              AR: {0.5} at lag [1]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: 0.25

No properties contain NaN values. Therefore, Mdl is fully specified.
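Because a fully specified model needs no data to define it, you can operate on it immediately. The sketch below forecasts from the calibrated AR(1) model and checks the one-step forecast by hand. The presample value y0 is a hypothetical last observation chosen only for illustration.

```matlab
% Fully specified AR(1): y_t = 1 + 0.5*y_{t-1} + e_t, Var(e_t) = 0.25
Mdl = arima(Constant=1,AR={0.5},Variance=0.25);

y0 = 2.1;                         % hypothetical last observed response
[yF,yMSE] = forecast(Mdl,3,y0);   % 3-step-ahead forecasts and their MSEs

% One-step forecast by hand: c + phi*y0
manual = 1 + 0.5*y0;
[yF(1) manual]

% The one-step forecast MSE equals the innovations variance
yMSE(1)
```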

Create ARIMA(p,D,q) Model Using Shorthand Syntax

The ARIMA(p,D,q) model is a nonseasonal model of the form:

$\phi(L)(1 - L)^D y_t = c + \theta(L)\varepsilon_t$

At the command line, you can specify a model of this form using the shorthand syntax arima(p,D,q). For the input arguments p, D, and q, enter the number of consecutive nonseasonal AR terms from lag 1 (p), the order of nonseasonal integration (D), and the number of consecutive nonseasonal MA terms from lag 1 (q), respectively.

When you use this shorthand syntax, arima creates an arima model with these default settings.

Property Name | Value | Description
P | Sum of the input numeric scalars p and D | The number of presample responses required to initialize the model (the order of the nonseasonal AR polynomial p plus the level of nonseasonal integration D)
D | Input numeric scalar D | Degree of nonseasonal integration, D
Q | Input numeric scalar q | The number of presample innovations required to initialize the model (the order of nonseasonal MA polynomial q)
Constant | NaN | Model constant c is present and estimable
AR | Length p cell vector of NaNs | All nonseasonal AR coefficients ϕj, j = 1 through p, are present and estimable
SAR | Empty cell vector {} | Seasonal AR polynomial Φ(L) is not present
MA | Length q cell vector of NaNs | All MA coefficients θk, k = 1 through q, are present and estimable
SMA | Empty cell vector {} | Seasonal MA polynomial Θ(L) is not present
Seasonality | 0 | No seasonal integration (s = 0)
Beta | Empty vector [] | No exogenous regression component xtβ is present in the model
Distribution | "Gaussian" | Innovations εt are Gaussian
Variance | NaN | Model variance σ2 is present and estimable

For nonseasonal models, the inputs D and q are the values arima assigns to properties D and Q. However, the input argument p is not necessarily the value arima assigns to the model property P. P stores the number of presample observations needed to initialize the AR component of the model. The required number of presample observations is p + D.

Consider specifying the ARIMA(2,1,1) model

$(1 - \phi_1 L - \phi_2 L^2)(1 - L)^1 y_t = c + (1 + \theta_1 L)\varepsilon_t,$

where the innovation process is Gaussian with (unknown) constant variance.

Mdl = arima(2,1,1)
Mdl = 

  arima with properties:

     Description: "ARIMA(2,1,1) Model (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
               P: 3
               D: 1
               Q: 1
        Constant: NaN
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {NaN} at lag [1]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN

The model property P does not have value 2 (the AR degree). With the integration, a total of p + D (here, 2 + 1 = 3) presample observations are needed to initialize the AR component of the model.

The partially specified model Mdl has NaNs for all estimable parameters (coefficients and innovations variance). A NaN value is a placeholder signaling that the corresponding parameter is prepared for estimation.

To set different values for any properties, you can modify the created model object using dot notation. For example, set the variance of Mdl to 0.5.

Mdl.Variance = 0.5
Mdl = 

  arima with properties:

     Description: "ARIMA(2,1,1) Model (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
               P: 3
               D: 1
               Q: 1
        Constant: NaN
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {NaN} at lag [1]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: 0.5

To estimate parameters, pass the model and data to estimate. The function returns a fitted, fully specified arima object with the same structure as the input model. The fitted model contains parameter estimates for each NaN-valued property.
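The estimation step can be sketched end to end with simulated data. The data-generating model DGP and its parameter values are hypothetical, used here only to produce a series to fit; with real data, you would pass your observed response vector instead.

```matlab
rng(0)   % reproducible simulation

% Hypothetical data-generating process, used only to create example data
DGP = arima(Constant=0.1,AR={0.5 -0.3},D=1,MA={0.4},Variance=0.5);
y = simulate(DGP,300);

% Template with NaN placeholders for every estimable parameter
Mdl = arima(2,1,1);

% Fit the template to the data; suppress the estimation display
EstMdl = estimate(Mdl,y,Display="off");

% The fitted model keeps the template structure and has no NaN parameters
fittedDegrees = [EstMdl.P EstMdl.D EstMdl.Q]
anyNaN = any(isnan([EstMdl.Constant EstMdl.AR{:} EstMdl.MA{:} EstMdl.Variance]))
```

Any parameter that you set to a numeric value in the template before calling estimate is held fixed at that value during estimation rather than fitted.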

Create ARIMA(p,D,q) Model Using Longhand Syntax

The most flexible way to create a conditional mean model is by using the longhand syntax, that is, by setting property values using name-value arguments. You do not need, nor are you able, to specify a value for every model object property. arima assigns default values to any properties you do not (or cannot) specify.

In condensed, lag operator notation, nonseasonal ARIMA(p,D,q) models are of the form

$\phi(L)(1 - L)^D y_t = c + \theta(L)\varepsilon_t. \quad (1)$

The innovations εt can take one of the following forms, which you set using the Distribution name-value argument:

  • Independent and identically distributed Gaussian distribution with mean 0 and constant variance σ2 (the default)

  • Independent and identically distributed central Student’s t with ν degrees of freedom

  • Dependent Gaussian or Student’s t with a conditional variance process $\sigma_t^2$ (for example, a GARCH model). Specify the conditional variance model by setting the Variance name-value argument to a garch, egarch, or gjr model object.

You can specify the following name-value arguments to create nonseasonal arima models.

Name-Value Arguments for Nonseasonal ARIMA Models

Name | Corresponding Model Term(s) in Equation 1 | When to Specify
AR | Nonseasonal AR coefficients, ϕ1,…,ϕp

Calibrate, or specify equality constraints during estimation for, the AR coefficients. For example, to specify the AR coefficients in the model

$y_t = 0.8 y_{t-1} - 0.2 y_{t-2} + \varepsilon_t,$

specify AR={0.8 -0.2}.

You need only specify the nonzero elements of AR. If the nonzero coefficients are at nonconsecutive lags, specify the corresponding lags using ARLags.

Any coefficients you specify must correspond to a stable AR operator polynomial.

ARLags | Lags corresponding to nonzero, nonseasonal AR coefficients

ARLags is not a model property.

Use this argument to customize the nonzero lags of the AR coefficients. For example, to specify nonzero AR coefficients at lags 1 and 12, as in $y_t = \phi_1 y_{t-1} + \phi_{12} y_{t-12} + \varepsilon_t$, specify ARLags=[1 12].

Set AR and ARLags together to specify known nonzero AR coefficients at nonconsecutive lags. For example, if in the given AR(12) model ϕ1 = 0.6 and ϕ12 = -0.3, specify AR={0.6 -0.3},ARLags=[1 12].

Beta | Values of the coefficients of the exogenous covariates

Calibrate, or specify equality constraints during estimation for, the exogenous regression coefficients. For example, use Beta=[0.5 7 -2] to specify $\beta = [0.5 \;\; 7 \;\; {-2}]'$.

By default, Beta is an empty vector (model has no regression component).

Constant | Constant term, c | Calibrate, or specify equality constraints during estimation for, the model constant. For example, for a model with no constant term, specify Constant=0.
By default, Constant has value NaN.
D | Degree of nonseasonal differencing, D | Specify a degree of nonseasonal differencing greater than zero. For example, to specify one degree of differencing, specify D=1.
By default, D has value 0 (meaning no nonseasonal integration).
Distribution | Distribution of the innovation process | Specify a Student’s t innovation distribution. By default, the innovation distribution is Gaussian.
For example, to specify a t distribution with unknown degrees of freedom, specify Distribution="t".
To specify a t innovation distribution with known degrees of freedom, assign Distribution a structure array with fields Name and DoF. For example, for a t distribution with nine degrees of freedom, specify Distribution=struct("Name","t","DoF",9).
MA | Nonseasonal MA coefficients, θ1,…,θq

Calibrate, or specify equality constraints during estimation for, the MA coefficients. For example, to specify the MA coefficients in the model

$y_t = \varepsilon_t + 0.5\varepsilon_{t-1} + 0.2\varepsilon_{t-2},$

specify MA={0.5 0.2}.

You need only specify the nonzero elements of MA. If the nonzero coefficients are at nonconsecutive lags, specify the corresponding lags using MALags.

Any coefficients you specify must correspond to an invertible MA polynomial.

MALags | Lags corresponding to nonzero, nonseasonal MA coefficients

MALags is not a model property.

Use this argument to customize the nonzero lags of the MA coefficients. For example, to specify nonzero MA coefficients at lags 1 and 4, as in

$y_t = \varepsilon_t + \theta_1\varepsilon_{t-1} + \theta_4\varepsilon_{t-4},$

specify MALags=[1 4].

Set MA and MALags together to specify known nonzero MA coefficients at nonconsecutive lags. For example, if in the given MA(4) model θ1 = 0.5 and θ4 = 0.2, specify MA={0.5 0.2},MALags=[1 4].

Variance | Scalar variance of the innovation process, $\sigma_\varepsilon^2$, or conditional variance process, $\sigma_t^2$

  • Calibrate, or specify equality constraints during estimation for, the innovations variance. For example, for a model with known variance 0.1, specify Variance=0.1. By default, Variance has value NaN.

  • To specify a conditional variance model for $\sigma_t^2$, set Variance to a conditional variance model object, e.g., Variance=garch(1,1). For more details, see garch.
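Several of the name-value arguments in this table can be combined in one call. The following sketch shows three independent specifications; the variable names Mdl1, Mdl2, and Mdl3 and the coefficient values are illustrative.

```matlab
% Known AR coefficients at nonconsecutive lags 1 and 12
Mdl1 = arima(AR={0.6 -0.3},ARLags=[1 12],Constant=0,Variance=1);

% MA terms at lags 1 and 4, Student's t innovations with unknown DoF
Mdl2 = arima(MALags=[1 4],Constant=0,Distribution="t");

% AR(1) conditional mean with a GARCH(1,1) conditional variance
Mdl3 = arima(ARLags=1,Variance=garch(1,1));

% The highest specified lag sets the polynomial degree
degrees = [Mdl1.P Mdl2.Q]

% Distribution is a structure; DoF is NaN until estimated or set
Mdl2.Distribution

% Variance holds a conditional variance model object
varianceClass = class(Mdl3.Variance)
```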

This table contains examples of nonseasonal models and how to specify them by using arima.

Model | Specification

AR(1), all coefficients are unknown

  • $y_t = c + \phi_1 y_{t-1} + \varepsilon_t$

  • εt=σzt

  • zt is an iid standard Gaussian series

arima(AR={NaN}), arima(ARLags=1), or arima(1,0,0)

ARMA(8,4), all coefficients are unknown

  • $y_t = c + \phi_4 y_{t-4} + \phi_8 y_{t-8} + \varepsilon_t + \theta_4\varepsilon_{t-4}$

  • εt=σzt

  • zt is an iid standard Gaussian series

arima(ARLags=[4 8],MALags=4)

MA(2), all coefficients are unknown

  • $y_t = \varepsilon_t + \theta_1\varepsilon_{t-1} + \theta_2\varepsilon_{t-2}$

  • εt=σzt

  • zt is an iid Student’s t series with unknown degrees of freedom

arima(Constant=0,MA={NaN NaN},Distribution="t") or arima(Constant=0,MALags=[1 2],Distribution="t")

Fully specified ARIMA(1,1,1)

  • $(1 - 0.8L)(1 - L) y_t = 0.2 + (1 + 0.6L)\varepsilon_t$

  • εt=0.1zt

  • zt is an iid Student’s t series with eight degrees of freedom

arima(Constant=0.2,AR={0.8},MA={0.6},D=1,Variance=0.1^2, ...
Distribution=struct("Name","t","DoF",8))

Create Seasonal ARIMA (SARIMA) Models Using Name-Value Arguments

For a time series with periodicity s, ps is the degree of the seasonal AR lag operator polynomial $\Phi(L) = (1 - \Phi_1 L^{p_1} - \ldots - \Phi_{p_s} L^{p_s})$ and qs is the degree of the seasonal MA lag operator polynomial $\Theta(L) = (1 + \Theta_1 L^{q_1} + \ldots + \Theta_{q_s} L^{q_s})$. A multiplicative seasonal ARIMA model (SARIMA) with degree D nonseasonal integration, degree s seasonality, and one degree of seasonal integration, is

$\phi(L)\Phi(L)(1 - L)^D(1 - L^s) y_t = c + \theta(L)\Theta(L)\varepsilon_t. \quad (2)$

In addition to the arguments for specifying nonseasonal models (described in Name-Value Arguments for Nonseasonal ARIMA Models), you can specify these name-value arguments to create a SARIMA model.

Name-Value Arguments for Seasonal ARIMA Models

Argument | Corresponding Model Term(s) in Equation 2 | When to Specify
SAR | Seasonal AR coefficients, Φ1,…,Φps

Calibrate, or specify equality constraints during estimation for, the seasonal AR coefficients. When specifying AR coefficients, use the sign opposite to what appears in Equation 2 (that is, use the sign of the coefficient as it would appear on the right side of the equation).

Use SARLags to specify the lags of the nonzero seasonal AR coefficients. Specify the lags associated with the seasonal polynomials in the periodicity of the observed data (e.g., 4, 8,... for quarterly data, or 12, 24,... for monthly data), and not as multiples of the seasonality (e.g., 1, 2,...).

For example, to specify the model

$(1 - 0.8L)(1 - 0.2L^{12}) y_t = \varepsilon_t,$

specify AR=0.8,SAR=0.2,SARLags=12.

Any coefficient values you enter must correspond to a stable seasonal AR polynomial.

SARLags | Lags corresponding to nonzero seasonal AR coefficients, in the periodicity of the observed series

SARLags is not a model property.

Use this argument to customize the nonzero lags of the SAR coefficients.

For example, to specify the model

$(1 - \phi_1 L)(1 - \Phi_{12} L^{12}) y_t = \varepsilon_t,$

specify ARLags=1,SARLags=12.

SMA | Seasonal MA coefficients, Θ1,…,Θqs

Calibrate, or specify equality constraints during estimation for, the seasonal MA coefficients.

Use SMALags to specify the lags of the nonzero seasonal MA coefficients. Specify the lags associated with the seasonal polynomials in the periodicity of the observed data (e.g., 4, 8,... for quarterly data, or 12, 24,... for monthly data), and not as multiples of the seasonality (e.g., 1, 2,...).

For example, to specify the model

$y_t = (1 + 0.6L)(1 + 0.2L^{12})\varepsilon_t,$

specify MA=0.6,SMA=0.2,SMALags=12.

Any coefficient values you enter must correspond to an invertible seasonal MA polynomial.

SMALags | Lags corresponding to the nonzero seasonal MA coefficients, in the periodicity of the observed series

SMALags is not a model property.

Use this argument to customize the nonzero lags of the SMA coefficients.

For example, to specify the model

$y_t = (1 + \theta_1 L)(1 + \Theta_4 L^4)\varepsilon_t,$

specify MALags=1,SMALags=4.

Seasonality | Seasonal periodicity, s | Specify the degree s of the seasonal differencing polynomial $\Delta_s = 1 - L^s$. For example, to specify the periodicity for seasonal integration of monthly data, specify Seasonality=12.
If you specify nonzero Seasonality, then the degree of seasonal integration is one. By default, Seasonality has value 0 (meaning no periodicity and no seasonal integration).
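The seasonal arguments can be sketched as follows. The first model uses the known monthly SAR coefficients from the table; the second is a quarterly multiplicative MA template with unknown coefficients. The names MdlSAR and MdlSMA, and the added Constant and Variance values, are illustrative assumptions.

```matlab
% (1 - 0.8L)(1 - 0.2L^12) y_t = e_t for monthly data; the seasonal lag
% is given in the periodicity of the data (12), not as a multiple (1)
MdlSAR = arima(Constant=0,AR={0.8},SAR={0.2},SARLags=12,Variance=1);

% y_t = (1 + theta_1 L)(1 + Theta_4 L^4) e_t for quarterly data,
% with unknown MA and SMA coefficients
MdlSMA = arima(Constant=0,MALags=1,SMALags=4);

% Degrees of the compound polynomials: p + ps and q + qs
compoundAR = MdlSAR.P
compoundMA = MdlSMA.Q
```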

Create Conditional Mean Model Containing Exogenous Linear Regression Term

You can include an exogenous linear regression term in any univariate conditional mean model. For simplicity, consider including an exogenous linear regression term in an ARIMA model. The result is an ARIMAX(p,D,q) model, which has the form

$\phi(L) y_t = c^* + x_t\beta + \theta^*(L)\varepsilon_t, \quad (3)$
where $c^* = c/(1 - L)^D$ and $\theta^*(L) = \theta(L)/(1 - L)^D$. Like the ARIMA model, the ARIMAX model can be extended to include seasonal terms (SARIMAX).

In addition to the properties associated with nonseasonal and seasonal parameters, ARIMAX models contain the property Beta.

Name-Value Arguments for ARIMAX Models

Argument | Corresponding Model Term(s) in Equation 3 | When to Specify
Beta | Exogenous linear regression coefficient, β

Calibrate, or specify equality constraints during estimation for, β. For example, use Beta=[0.5 7 -2] to specify $\beta = [0.5 \;\; 7 \;\; {-2}]'$.

By default, Beta is an empty vector (model has no regression component).

The following conditions apply to models with an exogenous regression component:

  • Econometrics Toolbox model objects are agnostic of data; a model becomes aware of the associated series (response and exogenous predictors) only after you create it, when you operate on it by using functions that require input data.

  • If you plan to estimate β, do not specify Beta when you call arima. arima infers the number of regression coefficients from the number of variables in the input exogenous predictor data.

  • If you specify a nonzero D, the software differences the response series yt before the predictors enter the model. You should preprocess the exogenous covariates xt by testing for stationarity and differencing if any are unit root nonstationary.

  • If any nonstationary exogenous predictor enters the model, the false negative rate for significance tests of β can increase.
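The conditions above can be sketched with simulated data. The predictor matrix X, the data-generating model DGP, and all parameter values are hypothetical and serve only to produce a response series; the template deliberately omits Beta so that the number of regression coefficients comes from the predictor data.

```matlab
rng(2)            % reproducible example data
T = 200;
X = randn(T,2);   % two hypothetical stationary predictors

% Hypothetical data-generating ARX(1) model with known Beta
DGP = arima(Constant=0.2,AR={0.5},Beta=[5 4],Variance=0.01);
y = simulate(DGP,T,X=X);

% Template: Beta is not specified, so estimate sizes it from X
Mdl = arima(ARLags=1);
EstMdl = estimate(Mdl,y,X=X,Display="off");

% One regression coefficient per column of X
numBeta = numel(EstMdl.Beta)
```

If a predictor is unit root nonstationary, difference it (for example, with diff) before passing it as predictor data, and trim the response series so the rows stay aligned.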

This table contains examples of models containing an exogenous regression component.

Model | Specification

ARX(1), all coefficients are unknown, one exogenous predictor xt

  • $y_t = c + \beta x_t + \phi_1 y_{t-1} + \varepsilon_t$

  • εt=σzt

  • zt is an iid standard Gaussian series

All the following lines specify this model:

  • arima(AR={NaN})

  • arima(ARLags=1)

  • arima(1,0,0)

  • arima(ARLags=1,Beta=NaN)

Fully specified ARIMAX(1,1,1), 2 exogenous predictors x1,t and x2,t

  • $(1 - 0.8L)(1 - L) y_t = 0.2 + [x_{1,t} \;\; x_{2,t}]\begin{bmatrix} 5 \\ 4 \end{bmatrix} + (1 + 0.6L)\varepsilon_t$

  • εt=0.1zt

  • zt is an iid Student’s t series with eight degrees of freedom

arima(Constant=0.2,AR={0.8},MA={0.6},D=1,Variance=0.1^2, ...
Distribution=struct("Name","t","DoF",8),Beta=[5; 4])

Specify Model Using Econometric Modeler App

You can specify the lag structure and innovation distribution of seasonal and nonseasonal conditional mean models using the Econometric Modeler app. The app treats all coefficients as unknown and estimable, including the degrees of freedom parameter for a t innovation distribution.

At the command line, open the Econometric Modeler app.

econometricModeler

Alternatively, open the app from the apps gallery (see Econometric Modeler).

In the app, you can see all supported models by selecting a time series variable for the response in the Time Series pane. Then, on the Modeler tab, in the Models section, click the arrow to display the models gallery.

Models gallery with subsections for ARIMA MODELS, GARCH MODELS, and REGRESSION MODELS.

The ARIMA Models section contains supported conditional mean models.

For conditional mean model estimation, SARIMA and SARIMAX are the most flexible models. You can create any conditional mean model that excludes exogenous predictors by clicking SARIMA, or you can create any conditional mean model that includes at least one exogenous predictor by clicking SARIMAX.

After you select a model, the app displays the Type Model Parameters dialog box, where Type is the model type. This figure shows the SARIMAX Model Parameters dialog box.

SARIMAX Model Parameters window is open with Lag Order tab selected and the Include Constant Term check box selected.

Adjustable parameters in the dialog box depend on Type. In general, adjustable parameters include:

  • A model constant and linear regression coefficients corresponding to predictor variables

  • Time series component parameters, which include seasonal and nonseasonal lags and degrees of integration

  • The innovation distribution

As you adjust parameter values, the equation in the Model Equation section changes to match your specifications. Adjustable parameters correspond to input and name-value arguments as described in Create ARIMA(p,D,q) Model Using Longhand Syntax, Create Seasonal ARIMA (SARIMA) Models Using Name-Value Arguments, and arima.

Econometric Modeler always creates estimated models. After you specify the model structure, select Estimate to fit the model. You can then inspect the resulting arima model in the Preview pane.

[Figure: Example of an estimated ARIMA model object display in the Preview pane]

For more details on specifying models using the app, see Fit Models to Data and Specifying Univariate Lag Operator Polynomials Interactively.

What Are Conditional Mean Models?

Consider the univariate random variable yt. The unconditional mean of yt is the expected value of yt, E(yt). In contrast, the conditional mean of yt is the expected value of yt given a conditioning set of variables, Ωt.

A conditional mean model specifies a functional form for E(yt| Ωt).

For a static conditional mean model, the conditioning set of variables is measured contemporaneously with the dependent variable yt. An example of a static conditional mean model is the ordinary linear regression model. Suppose the conditioning set is Ωt = xt, a row vector of exogenous covariates measured at time t, and let β be a column vector of coefficients. The conditional mean of yt is the linear combination

$E(y_t \mid x_t) = x_t\beta.$

In time series econometrics, often the dynamic behavior of a variable over time is of interest. A dynamic conditional mean model is a stochastic process that specifies the expected value of yt as a function of historical information. Let Ωt = Ht–1 denote the history of the process available at time t. A dynamic conditional mean model specifies the evolution of the conditional mean, E(yt| Ht–1). Examples of historical information are:

  • Past observations, y1, y2,...,yt–1

  • Vectors of past exogenous variables, x1,x2,…,xt–1.

  • Past innovations, ε1,ε2,…,εt–1.

Stationary processes play an important role in time series analysis. A stochastic process is covariance stationary when its expected value, variance, and covariance between elements of the series are independent of time.

For example, the MA(q) model, with c = 0, is stationary for any nonnegative integer q < ∞ because each of the following are free of t for all time points [1].

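  • $E(y_t) = \theta(L)\,0 = 0.$

  • $\operatorname{Var}(y_t) = \sigma^2\left(1 + \sum_{i=1}^{q}\theta_i^2\right).$

  • $\operatorname{Cov}(y_t, y_{t-s}) = \sigma^2(\theta_s + \theta_1\theta_{s+1} + \theta_2\theta_{s+2} + \ldots + \theta_{q-s}\theta_q)$ if $1 \le s \le q$, and 0 if $s > q$.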

A time series is a unit root process when its expected value, variance, or covariance grows with time. Consequently, the time series is nonstationary.
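A standard illustration, not taken from the text above, is the random walk started at $y_0 = 0$ with iid innovations of variance $\sigma^2$:

```latex
y_t = y_{t-1} + \varepsilon_t = \sum_{i=1}^{t} \varepsilon_i,
\qquad
E(y_t) = 0,
\qquad
\operatorname{Var}(y_t) = t\,\sigma^2 .
```

The variance grows linearly with t, so the process is not covariance stationary, whereas its first difference $y_t - y_{t-1} = \varepsilon_t$ is.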

The constant mean property of stationarity does not preclude the possibility of a dynamic conditional expectation process. The serial autocorrelation between lagged observations exhibited by many time series suggests the expected value of yt depends on historical information. By Wold’s decomposition [2], the conditional mean of any stationary process yt is

$E(y_t \mid H_{t-1}) = \mu + \sum_{i=1}^{\infty} \psi_i \varepsilon_{t-i}, \quad (4)$
where {εt} is an uncorrelated innovation process with mean zero and the coefficients ψi are absolutely summable. E(yt) = μ is the constant unconditional mean of the stationary process.

Any model of the general linear form given by Equation 4 is a valid specification for the dynamic behavior of a stationary stochastic process. Special cases of stationary stochastic processes are the autoregressive (AR) model, moving average (MA) model, and the autoregressive moving average (ARMA) model.
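As a worked instance of this linear form, consider a stable AR(1) model. Back-substituting the lagged response repeatedly expresses the process in terms of past innovations; the symbols c and ϕ are the AR(1) constant and coefficient, and the result assumes $|\phi| < 1$.

```latex
\begin{aligned}
y_t &= c + \phi y_{t-1} + \varepsilon_t, \qquad |\phi| < 1 \\
    &= \frac{c}{1-\phi} + \sum_{i=0}^{\infty} \phi^{i}\,\varepsilon_{t-i}, \\
E(y_t \mid H_{t-1}) &= \mu + \sum_{i=1}^{\infty} \psi_i\,\varepsilon_{t-i},
\qquad \mu = \frac{c}{1-\phi}, \quad \psi_i = \phi^{i}.
\end{aligned}
```

The coefficients are absolutely summable because $\sum_{i} |\phi|^{i}$ is a convergent geometric series, and the contemporaneous innovation drops out of the conditional mean because it has mean zero given the history.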

References

[1] Box, George E. P., Gwilym M. Jenkins, and Gregory C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

[2] Wold, Herman. "A Study in the Analysis of Stationary Time Series." Journal of the Institute of Actuaries 70 (March 1939): 113–115. https://doi.org/10.1017/S0020268100011574.
