Monday, May 8, 2017

AutoRegressive Distributed Lag (ARDL) Estimation. Part 2 - Inference

This is the second part of our AutoRegressive Distributed Lag (ARDL) post. For Part 1, please go here, and for Part 3, please visit here.

In this post we outline the correct theoretical underpinning of the inference behind the Bounds test for cointegration in an ARDL model. Whilst the discussion is by its nature quite technical, it is important that practitioners of the Bounds test have a grasp of the background behind its inferences.

Overview

While the ARDL approach to cointegration is typically considered synonymous with the Pesaran, Shin, and Smith (2001) Bounds test for cointegration, in this post we emphasize that correct inference is in fact rooted in cointegration theory. In Part 1 of this series, we mentioned that the ARDL framework is a one-to-one reparameterization of the conditional error correction model (ECM) representation of the underlying vector auto-regression (VAR).

Recall that a VAR is a natural extension of the univariate autoregressive model to multivariate series, and is often interpreted as an autoregressive system-of-equations regression model with multiple endogenous variables. As such, it lends itself to the analysis of simultaneous interactions between variables -- namely, their short-run dynamics, but more importantly, their long-run (equilibrating) or cointegrating behaviour. In this regard, the vector error correction model (VECM), which is a reparameterization of the VAR to isolate the equilibrating relationships, if they exist, is of central importance. Nevertheless, like the VAR, the VECM models simultaneous interactions among several endogenous variables. However, applications in Economics typically ask:

How does one variable in the VAR behave conditional on all the others, which are themselves endogenously determined, and is there any cointegrating relationship among them?

In other words, we hope to derive a conditional ECM (CECM), which formalizes an ECM model for some variable conditional on all the others, but at the same time, isolates the cointegrating relationship among them. In this regard, we will demonstrate that the ARDL model is in fact a special case of the CECM. However, recall from Part 1 of this series that one of the major advantages of the ARDL model is due to its ability to estimate the long-run or cointegrating relationship. What we expound on here, is that this estimate may not always be defined or sensible, and even if it is, it may be degenerate; that is, seemingly stable in the short-run, but dissipates in the long-run. It is here where the Bounds test comes into the limelight: it is a way of statistically detecting the presence of cointegration. The advantage of the procedure is that it uses the CECM (ARDL) as a platform. Thus, in estimating the CECM (ARDL), one can simultaneously test for cointegration and estimate the equilibrating relationship. Lastly, if cointegration does exist, one can estimate and conduct inference on the speed of convergence to equilibrium. The following flow-chart summarizes the steps:

Vector Auto-regression (VAR) and the Vector Error Correction Model (VECM)

The VAR was introduced to econometrics by Sims (1980). Below, we formalize a VAR model with $p$ lags, namely VAR$(p)$, augmented with the usual deterministic dynamics (intercept and trend). \begin{align} \pmb{\Phi}(L)(\pmb{z}_t - \pmb{\mu} - \pmb{\gamma}t) &= \pmb{\epsilon}_t \notag \\ \pmb{\Phi}(L)\pmb{z}_t &= \pmb{\Phi}(L)\pmb{\mu} + \pmb{\Phi}(L)\pmb{\gamma}t + \pmb{\epsilon}_t \label{eq.ardl.11} \end{align} where $\pmb{z}_t$ is a $(k+1)$-vector $(y_t,x_{1,t},\ldots, x_{k,t})^\top = (y_t,\pmb{x}^\top_t)^\top$ with $\pmb{x}_t = (x_{1,t},\ldots, x_{k,t})^\top$, $\pmb{\mu}$ and $\pmb{\gamma}$ are respectively the $(k+1)$-vectors of intercept and trend coefficients, $\pmb{\Phi}(L) = \pmb{I}_{k+1} - \sum_{i=1}^{p}\pmb{\Phi}_iL^i$ is the $(k+1)$ square matrix lag polynomial, $\pmb{I}_{k+1}$ is the identity matrix of dimension $(k+1)$, and $\pmb{\epsilon}_t = (\epsilon_{yt}, \pmb{\epsilon}_{xt}^\top)^\top$ is the vector of innovations. We complete the setup with the following assumptions:

Assumption 1:

Individual Variables can be I$(0)$ or I$(1)$: The roots of $\det\left(\pmb{\Phi}(z)\right) = \det\left(\pmb{I}_{k+1} - \sum_{i=1}^{p}\pmb{\Phi}_iz^i\right) = 0$ satisfy either $|z|>1$ or $z=1$.

Assumption 2:

Variables are Correlated: The $(k+1)$-vector error process $\pmb{\epsilon}_t \sim N(\pmb{0}, \pmb{\Omega})$ with $\pmb{\Omega}$ positive definite.

Notice that Assumption 1 is the multivariate analogue of assumptions typically made for univariate AR$(p)$ processes. The assumption simply restricts $\pmb{z}_t$ to have at most one unit root in each of the series, and prevents the occurrence of seasonal and explosive roots. This allows $\pmb{z}_t$ to contain any combination of purely I$(1)$, purely I$(0)$, or mutually cointegrated variables. On the other hand, Assumption 2 restricts the errors to zero mean Gaussian processes with a covariance matrix $\pmb{\Omega}$ that allows variables in $\pmb{z}_t$ to be arbitrarily correlated. Under these assumptions, the VAR is in reduced form. This means that not only are all variables treated as endogenous, but any contemporaneous effects are exhibited through contemporaneous correlations in $\pmb{\Omega}$. While useful in its own right, a far more revelatory representation exists in the form of a vector error correction model (VECM).
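Assumption 1 can be checked numerically for a given set of VAR coefficient matrices. The following is a minimal sketch, assuming NumPy is available; the helper names `companion` and `satisfies_assumption_1` and the example matrices are our own illustrations and not part of any package. It relies on the fact that the roots $z$ of $\det\left(\pmb{\Phi}(z)\right) = 0$ are the reciprocals of the non-zero eigenvalues of the VAR companion matrix, so the condition "$|z|>1$ or $z=1$" becomes "eigenvalue strictly inside the unit circle or equal to one".

```python
import numpy as np

def companion(phis):
    """Stack [Phi_1, ..., Phi_p] into the (n*p) x (n*p) companion matrix."""
    p = len(phis)
    n = phis[0].shape[0]
    top = np.hstack(phis)
    if p == 1:
        return top
    # The identity block shifts lagged values down by one period.
    bottom = np.hstack([np.eye(n * (p - 1)), np.zeros((n * (p - 1), n))])
    return np.vstack([top, bottom])

def satisfies_assumption_1(phis, tol=1e-6):
    """True if each companion eigenvalue is inside the unit circle or equal to 1."""
    eig = np.linalg.eigvals(companion(phis))
    inside = np.abs(eig) < 1 - tol      # corresponds to roots with |z| > 1
    at_one = np.abs(eig - 1) < tol      # corresponds to the unit root z = 1
    return bool(np.all(inside | at_one))

# Illustrative (hypothetical) coefficient matrices.
random_walks = [np.eye(2)]                  # two independent I(1) series
stationary = [0.5 * np.eye(2)]              # two I(0) series
explosive = [np.diag([1.1, 0.5])]           # one explosive root
seasonal = [np.diag([-1.0, 0.5])]           # a root at z = -1
two_lags = [0.5 * np.eye(2), 0.5 * np.eye(2)]  # roots at z = 1 and z = -2

for name, phis in [("random walks", random_walks), ("stationary", stationary),
                   ("explosive", explosive), ("seasonal", seasonal),
                   ("two lags", two_lags)]:
    print(name, satisfies_assumption_1(phis))
```

The tolerance `tol` is an arbitrary numerical choice; with estimated coefficients, a root near one would call for a formal unit root test rather than an eigenvalue comparison.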

Relying on the Beveridge-Nelson (BN) decomposition and some clever rearrangement, it is readily shown that the VECM representation of the VAR in (\ref{eq.ardl.11}) is: \begin{align} \Delta\pmb{z}_t &= \left(\pmb{\Phi}(1)\pmb{\mu} + \left(\sum_{i=1}^{p}i\pmb{\Phi}_i\right)\pmb{\gamma}\right) + \pmb{\Phi}(1)\pmb{\gamma}t - \pmb{\Phi}(1)\pmb{z}_{t-1} + \widetilde{\pmb{\Phi}}^{\star}(L)\Delta\pmb{z}_t + \pmb{\epsilon}_t \notag \\ &= \pmb{a}_0 + \pmb{a}_1t - \pmb{\Phi}(1)\pmb{z}_{t-1} + \widetilde{\pmb{\Phi}}^{\star}(L)\Delta\pmb{z}_t + \pmb{\epsilon}_t \label{eq.ardl.12} \end{align} such that \begin{align} \pmb{a}_0 = \pmb{\Phi}(1)\pmb{\mu} + \left(\sum_{i=1}^{p}i\pmb{\Phi}_i\right)\pmb{\gamma} \quad \text{and} \quad \pmb{a}_1 = \pmb{\Phi}(1)\pmb{\gamma} \label{eq.ardl.13} \end{align} In fact, several important remarks emerge from this construction.
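To make the reparameterization concrete, consider the special case $p=2$ without deterministic terms. Subtracting $\pmb{z}_{t-1}$ from both sides of the VAR gives $\Delta\pmb{z}_t = -\pmb{\Phi}(1)\pmb{z}_{t-1} - \pmb{\Phi}_2\Delta\pmb{z}_{t-1} + \pmb{\epsilon}_t$, so that $\widetilde{\pmb{\Phi}}^{\star}(L)\Delta\pmb{z}_t$ is simply $-\pmb{\Phi}_2\Delta\pmb{z}_{t-1}$. The sketch below, which assumes NumPy and uses arbitrary simulated coefficients of our own choosing, verifies that the two forms generate identical changes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 3, 40

# Arbitrary (hypothetical) VAR(2) coefficients, scaled to keep the path well behaved.
phi1 = 0.2 * rng.standard_normal((n, n))
phi2 = 0.1 * rng.standard_normal((n, n))
eps = rng.standard_normal((T, n))

# Simulate the VAR(2) in levels: z_t = Phi_1 z_{t-1} + Phi_2 z_{t-2} + eps_t.
z = np.zeros((T, n))
for t in range(2, T):
    z[t] = phi1 @ z[t - 1] + phi2 @ z[t - 2] + eps[t]

phi_at_1 = np.eye(n) - phi1 - phi2   # Phi(1)
dz = np.diff(z, axis=0)              # dz[t - 1] holds z_t - z_{t-1}

def vecm_change(t):
    """Change in z_t implied by the VECM form for p = 2."""
    return -phi_at_1 @ z[t - 1] - phi2 @ dz[t - 2] + eps[t]

# Largest discrepancy between the simulated changes and the VECM-implied changes.
max_gap = max(np.max(np.abs(dz[t - 1] - vecm_change(t))) for t in range(2, T))
print(max_gap)
```

The discrepancy should be zero up to floating-point rounding, which is just another way of saying that the VECM is a reparameterization and not a new model.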

• The Cointegrating Matrix is $\pmb{\Phi}(1)$: If the original VAR variables in (\ref{eq.ardl.11}), namely $\pmb{z}_t$, are I$(1)$, all variables in the VECM are I$(0)$, except possibly for $\pmb{z}_{t-1}$. Since orders of integration must balance, $\pmb{\Phi}(1)\pmb{z}_{t-1}$ must be I$(0)$. Since a set of I$(1)$ variables is said to be cointegrated if there exists a linear combination of said variables which is I$(0)$, it is clear that $\pmb{\Phi}(1)\pmb{z}_t$ is the matrix of cointegrating relationships and $\pmb{\Phi}(1)$ is the cointegrating matrix. In Economics, the concept is often referred to as a long-run relationship, motivating the example that while prices -- which are frequently I$(1)$ variables -- can drift apart in the short-run, economic forces will eventually force them to equilibrium.

• No Cointegration when $\pmb{\Phi}(1) = \pmb{0}$. Every variable in $\pmb{z}_t$ is I$(1)$: Recall that the rank of a matrix is the number of its linearly independent columns (or rows). The concept is frequently used in ordinary least squares (OLS) regression, and is typically exemplified using the dummy variable trap. In this regard, since $\pmb{\Phi}(1)$ is a $(k+1)$-square matrix, assume $\DeclareMathOperator{\rank}{\textbf{rk}}\rank\left(\pmb{\Phi}(1)\right) = r_z$, where $0 \leq r_z \leq (k+1)$, and $\rank(\cdot)$ denotes the rank operator. In other words, among the $(k+1)$ columns in $\pmb{\Phi}(1)$, only $r_z$ are linearly independent, and the ones which are not, are linear combinations of those $r_z$. Moreover, $r_z = 0$ if and only if $\pmb{\Phi}(1) = \pmb{0}_{(1+k)^2}$, where $\pmb{0}_{(1+k)^2}$ denotes the $(1+k)$-square matrix of zeros. When this is the case, the VECM reduces to: $$\Delta\pmb{z}_t = \pmb{a}_0 + \pmb{a}_1t + \widetilde{\pmb{\Phi}}^{\star}(L)\Delta\pmb{z}_t + \pmb{\epsilon}_t$$ Since all variables on the right-hand side (RHS) are I$(0)$, it follows that $\Delta\pmb{z}_t \sim \text{I}(0)$, and therefore $\pmb{z}_t \sim \text{I}(1)$. In other words, when $r_z = 0$, every variable in $\pmb{z}_t$ is I$(1)$, and since $\pmb{\Phi}(1) = \pmb{0}_{(1+k)^2}$, there are no cointegrating relationships.

• No Cointegration when $\pmb{\Phi}(1)$ has full rank. Every variable in $\pmb{z}_t$ is I$(0)$: When $r_z = (k+1)$, $\pmb{\Phi}(1)$ has full column rank (i.e. all columns (rows) are linearly independent). In this particular case, $\DeclareMathOperator{\spann}{\textbf{sp}} \pmb{\Phi}(1)\pmb{z}_{t-1} = \spann{\left(\pmb{z}_{t-1}\right)}$, where $\spann(\cdot)$ denotes the span -- the space of all unique linear combinations of $\pmb{z}_t$. This implies $\Delta \pmb{z}_t$ can be uniquely written as a linear combination of all variables in $\pmb{z}_t$, namely $\pmb{\Phi}(1)\pmb{z}_{t-1}$, plus the remaining deterministic and stationary ones. Since $\Delta \pmb{z}_t \sim \text{I}(0)$, this is only sensible when every variable in $\pmb{z}_t \sim \text{I}(0)$, and cointegration is not possible.

• VECM Estimates Speed of Convergence to Equilibrium: A classical result in linear algebra is that for any $m\times m$ matrix $\pmb{M}$ with rank $r$, there exist $m \times r$ matrices $\pmb{A}$ and $\pmb{B}$, each of full column rank $r$, such that $\pmb{M} = \pmb{AB}^\top$. Thus, we can always write $\pmb{\Phi}(1) = \pmb{AB}^\top$, where $m=(1+k)$ and $r = r_z$. More importantly, it implies that $\pmb{A}$ measures the rate of convergence to equilibrium. To see this, recall that if $\pmb{z}_t$ is cointegrated, then $\pmb{\Phi}(1)\pmb{z}_{t-1} \sim \text{I}(0)$. We can therefore factorize the cointegrated relationships as $\pmb{\Phi}(1)\pmb{z}_{t-1} = \pmb{A}\pmb{B}^\top \pmb{z}_{t-1} = \pmb{A}\pmb{\zeta}_{t-1}$ where $\pmb{\zeta}_{t-1}$ is a mean zero I$(0)$ process. This is because the cointegrating relationships are now captured by $\pmb{B}^\top \pmb{z}_{t-1}$. Observe further that when the system is in actual equilibrium, $\pmb{B}^\top \pmb{z}_{t-1} = \pmb{0}_{r_z}$, where $\pmb{0}_{r_z}$ is the $r_z$-vector of zeros. This is because equilibrium requires not only stability, which follows from the stationarity of $\pmb{B}^\top \pmb{z}_{t-1}$, but also constancy, which manifests only when accumulated short-run dynamics $\widetilde{\pmb{\Phi}}^{\star}(L)\Delta\pmb{z}_t$, and the shocks to $\pmb{B}^\top \pmb{z}_{t-1}$, namely, $\pmb{\zeta}_{t-1}$, are zero as well. Accordingly, if the system was in equilibrium in the previous period, any current deviations from this state, namely $\Delta \pmb{z}_t$, must arise from systematic shocks $\pmb{\epsilon}_t$, where we assume $\pmb{a}_0 = \pmb{a}_1 = \pmb{0}_{1+k}$ for simplicity. Alternatively, when the system is in disequilibrium, $\pmb{B}^\top \pmb{z}_{t-1} = \pmb{\zeta}_{t-1} \neq \pmb{0}_{r_z}$. Thus, when $\pmb{B}^\top \pmb{z}_{t-1} < \pmb{0}_{r_z}$ $(\pmb{B}^\top \pmb{z}_{t-1} > \pmb{0}_{r_z})$, the impact on $\Delta\pmb{z}_t$ is of magnitude $\pmb{A}$ and positive (negative), since $\pmb{\Phi}(1)$ enters the VECM with a negative sign. In other words, $\Delta\pmb{z}_t$ adjusts toward equilibrium in the opposite direction to disequilibrium by a proportion equal to $\pmb{A}$.

• Cointegrating Relationships Include Constants and Trends: We have outlined this in Part 1 of this series. The restrictions in (\ref{eq.ardl.13}) indicate that $\pmb{a}_0$ and $\pmb{a}_1$ are linear functions of $\pmb{\Phi}(1)$. As such, they span the $r_z$ linearly independent columns of the cointegrating matrix $\pmb{\Phi}(1)$, and by extension, the cointegrating equation. This distinguishes the 5 data generating processes (DGPs) considered in Pesaran, Shin, and Smith (2001) and outlined in Part 1 of this series.

• Case I: $\pmb{\mu} = \pmb{\gamma} = \pmb{0}$ which implies $\pmb{a}_0 = \pmb{a}_1 = 0$. Accordingly, the VECM (\ref{eq.ardl.12}) reduces to: $$\Delta \pmb{z}_t = -\pmb{\Phi}(1)\pmb{z}_{t-1} + \widetilde{\pmb{\Phi}}^{\star}(L)\Delta\pmb{z}_t + \pmb{\epsilon}_t$$
• Case II: $\pmb{\mu} \neq \pmb{0}$, $\pmb{\gamma} = \pmb{0}$, and the restriction in (\ref{eq.ardl.13}) is imposed. This implies that $\pmb{a}_0 = \pmb{\Phi}(1)\pmb{\mu}$ and $\pmb{a}_1 = 0$. Accordingly, the VECM is just: $$\Delta \pmb{z}_t = -\pmb{\Phi}(1)\left(\pmb{z}_{t-1} - \pmb{\mu}\right) + \widetilde{\pmb{\Phi}}^{\star}(L)\Delta\pmb{z}_t + \pmb{\epsilon}_t$$
• Case III: $\pmb{\mu} \neq \pmb{0}$, $\pmb{\gamma} = \pmb{0}$, and the restrictions in (\ref{eq.ardl.13}) not imposed. This implies that $\pmb{a}_0 \neq 0$, $\pmb{a}_1 = 0$, while the VECM becomes: $$\Delta \pmb{z}_t = \pmb{a}_0 -\pmb{\Phi}(1)\pmb{z}_{t-1} + \widetilde{\pmb{\Phi}}^{\star}(L)\Delta\pmb{z}_t + \pmb{\epsilon}_t$$
• Case IV: $\pmb{\mu},\pmb{\gamma} \neq \pmb{0}$, and the restrictions in (\ref{eq.ardl.13}) are imposed only on $\pmb{a}_1$. This implies that $\pmb{a}_0 \neq 0$ and $\pmb{a}_1 = \pmb{\Phi}(1)\pmb{\gamma}$. The VECM is now: $$\Delta \pmb{z}_t = \pmb{a}_0 -\pmb{\Phi}(1)\left(\pmb{z}_{t-1} - \pmb{\gamma}t\right) + \widetilde{\pmb{\Phi}}^{\star}(L)\Delta\pmb{z}_t + \pmb{\epsilon}_t$$
• Case V: $\pmb{\mu},\pmb{\gamma} \neq \pmb{0}$, and the restrictions in (\ref{eq.ardl.13}) are not imposed. This implies that $\pmb{a}_0,\pmb{a}_1 \neq 0$ and the VECM is represented in (\ref{eq.ardl.12}).
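The rank arguments in the remarks above are easy to explore numerically. The sketch below assumes NumPy; the function name `rank_factorize` and the example vectors are hypothetical and chosen only for illustration. It computes the rank of a given $\pmb{\Phi}(1)$ and one factorization $\pmb{\Phi}(1) = \pmb{AB}^\top$ using the singular value decomposition. Bear in mind that such a factorization is not unique: for any invertible $r \times r$ matrix $\pmb{Q}$, the pair $\pmb{AQ}$ and $\pmb{B}\pmb{Q}^{-\top}$ reproduces the same product.

```python
import numpy as np

def rank_factorize(m, tol=1e-10):
    """Return (r, A, B) with m = A @ B.T and A, B of shape (n, r), via the SVD."""
    u, s, vt = np.linalg.svd(m)
    r = int(np.sum(s > tol))   # numerical rank
    a = u[:, :r] * s[:r]       # leading left singular vectors, scaled
    b = vt[:r].T               # leading right singular vectors
    return r, a, b

# Hypothetical three-variable system with a single cointegrating relationship.
a_true = np.array([[0.5], [-0.2], [0.0]])    # adjustment coefficients
b_true = np.array([[1.0], [-1.0], [-0.5]])   # cointegrating vector
phi_at_1 = a_true @ b_true.T                 # Phi(1) of rank 1

r_z, a_hat, b_hat = rank_factorize(phi_at_1)
print(r_z)                                   # rank of Phi(1)
print(np.allclose(a_hat @ b_hat.T, phi_at_1))

# The two boundary cases discussed above.
r_none, _, _ = rank_factorize(np.zeros((3, 3)))   # Phi(1) = 0: every variable I(1)
r_full, _, _ = rank_factorize(np.eye(3))          # full rank: every variable I(0)
```

The recovered `a_hat` and `b_hat` generally differ from `a_true` and `b_true` by a scaling, which is why applied work imposes a normalization on the cointegrating vector.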

Remember that the VECM is a reparameterization of a VAR. Accordingly, the VECM quantifies adjustments to equilibrium for all variables simultaneously. Nevertheless, economists, and other practitioners, are generally only interested in one particular variable as it relates to all others. For instance, in the present context, one could be interested in studying adjustments to equilibrium of $y_t$ in response to (conditioning on) the equilibrating paths of the remaining variables $\pmb{x}_t$. Moreover, the objective is only meaningful if, after conditioning on $\pmb{x}_t$, any implications on $y_t$ that would have emerged from the original VAR model, remain unchanged under the conditional one. The concept has a very important name in cointegration theory and is known as exogeneity; see Engle, Hendry, and Richard (1983) for a technical exposition. A natural way of ensuring the concept is to restrict the total number of cointegrating relationships between $y_t$ and $\pmb{x}_t$ to be one, and exactly one, irrespective of any cointegrating paths among the $\pmb{x}_t$ themselves. Should this be the case, $\pmb{x}_t$ are said to be weakly exogenous for any parameters in the equation for $y_t$.

Accordingly, deriving a model for $y_t$ conditional on $\pmb{x}_t$ requires two steps:

• Derive an ECM for $y_t$, explicitly conditioning on all effects originating from $\pmb{x}_t$. Such a model must include not only explicit effects of $\pmb{x}_t$ on $y_t$ stemming from the VAR matrix polynomial $\pmb{\Phi}(L)$, but also any and all contemporaneous relationships between $y_t$ and $\pmb{x}_t$ implicit within the covariance matrix $\pmb{\Omega}$ of the error vector $\pmb{\epsilon}_t$.

• Ensure that $\pmb{x}_t$ are weakly exogenous.

We turn to both of these tasks next.

Conditional ECM (CECM)

To derive the conditional model, we first identify the conditional and marginal variables -- namely $y_t$ and $\pmb{x}_t$, respectively. Next, the DGP of $y_t$ is conditioned on the DGPs of the marginal variables $\pmb{x}_t$. Since any explicit relationships between $y_t$ and $\pmb{x}_t$ are clearly accounted for through $\pmb{\Phi}(L)$, any remaining conditioning proceeds on the covariance matrix $\pmb{\Omega}$. Naturally, making these relationships explicit requires a solution where the VAR is driven by a vector of innovations $\pmb{u}_t = \left(u_{yt},\pmb{\epsilon}^\top_{xt}\right)^\top$, where $\pmb{u}_t \sim N(\pmb{0},\pmb{\Sigma})$, and $\pmb{\Sigma}$ is diagonal. In other words, by virtue of Gaussianity, innovations are independent across $y_t$ and $\pmb{x}_t$. Notice that the cointegrating structure of $\pmb{x}_t$ remains unchanged here. Since to each VAR we associate a bijection into its VECM form, all operations can proceed directly on the VECM. In this regard, express (\ref{eq.ardl.12}) as follows: \begin{align} \begin{bmatrix} \Delta y_t\\ \Delta \pmb{x}_t \end{bmatrix} &= \begin{bmatrix} a_{y0}\\ \pmb{a}_{x0} \end{bmatrix} + \begin{bmatrix} a_{y1}\\ \pmb{a}_{x1} \end{bmatrix}t - \begin{bmatrix} \phi_{yy}(1) & \pmb{\phi}_{yx}(1)\\ \pmb{\phi}_{xy}(1) & \pmb{\Phi}_{xx}(1) \end{bmatrix} \begin{bmatrix} y_{t-1}\\ \pmb{x}_{t-1} \end{bmatrix} + \begin{bmatrix} \widetilde{\phi}^\star_{yy}(L) & \widetilde{\pmb{\phi}}^\star_{yx}(L)\\ \widetilde{\pmb{\phi}}^\star_{xy}(L) & \widetilde{\pmb{\Phi}}^\star_{xx}(L) \end{bmatrix} \begin{bmatrix} \Delta y_t\\ \Delta \pmb{x}_t \end{bmatrix} + \begin{bmatrix} \epsilon_{yt}\\ \pmb{\epsilon}_{xt} \end{bmatrix} \label{eq.ardl.14} \end{align} where $\pmb{a}_i = (a_{yi},\pmb{a}^\top_{xi})^\top$ for $i=0,1$, $\widetilde{\pmb{\Phi}}^\star(L) = \left(\widetilde{\pmb{\phi}}^{\star\top}_{y}(L), \widetilde{\pmb{\phi}}^{\star\top}_{x}(L)\right)^\top$, and $\pmb{\Phi}(1)$ assumes the form: \begin{align*} \pmb{\Phi}(1) = 
\begin{bmatrix} \phi_{yy}(1) & \pmb{\phi}_{yx}(1)\\ \pmb{\phi}_{xy}(1) & \pmb{\Phi}_{xx}(1) \end{bmatrix} \end{align*} Moreover, express the covariance matrix $\pmb{\Omega}$ as follows: \begin{align*} E\left( \begin{bmatrix} \epsilon_{yt}\\ \pmb{\epsilon}_{xt} \end{bmatrix} \begin{bmatrix} \epsilon_{yt} & \pmb{\epsilon}^\top_{xt} \end{bmatrix} \right) = \begin{bmatrix} \omega_{yy} & \pmb{\omega}_{yx}\\ \pmb{\omega}_{xy} & \pmb{\Omega}_{xx} \end{bmatrix} = \pmb{\Omega} \end{align*} It is not difficult to demonstrate that $$\epsilon_{yt} = \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\epsilon}_{xt} + u_{yt}$$ where $u_{yt} \sim N\left(0,\omega_{yy} - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\omega}_{xy}\right)$ is independent of $\pmb{\epsilon}_{xt}$. Moreover, it can then be shown that \begin{align} \Delta \pmb{z}_t &=(\pmb{I}_{k+1} - \pmb{\Psi})\left(\pmb{a}_{0} + \pmb{a}_{1}t - \pmb{\Phi}(1)\pmb{z}_{t-1} + \widetilde{\pmb{\Phi}}^\star(L) \Delta\pmb{z}_{t}\right) + \pmb{\Psi}\Delta\pmb{z}_t + \pmb{u}_{t} \label{eq.ardl.16} \end{align} where $\pmb{u}_t = \left(u_{yt}, \pmb{\epsilon}^\top_{xt}\right)^\top$, and $\pmb{\Psi}$ is the matrix: \begin{align*} \pmb{\Psi} = \begin{bmatrix} 0 & \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\\ \pmb{0}_k & \pmb{0}_{k \times k} \end{bmatrix} \end{align*} Defining $\pmb{\alpha}_i = (\alpha_{yi}, \pmb{\alpha}_{xi}^\top)^\top = (\pmb{I}_{k+1} - \pmb{\Psi})\pmb{a}_i$ for $i=0,1$, and making equation (\ref{eq.ardl.16}) explicit, we arrive at: \begin{align} \begin{bmatrix} \Delta y_t\\ \Delta \pmb{x}_t \end{bmatrix} &= \begin{bmatrix} \alpha_{y0} \\ \pmb{\alpha}_{x0} \end{bmatrix} + \begin{bmatrix} \alpha_{y1} \\ \pmb{\alpha}_{x1} \end{bmatrix}t - \begin{bmatrix} \phi_{yy}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\phi}_{xy}(1) & \pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1)\\ \pmb{\phi}_{xy}(1) & \pmb{\Phi}_{xx}(1) \end{bmatrix} \begin{bmatrix} y_{t-1}\\ \pmb{x}_{t-1} \end{bmatrix}\notag\\ &+ \left((\pmb{I}_{k+1} - \pmb{\Psi})\widetilde{\pmb{\Phi}}^\star(L) + \pmb{\Psi}\right)\Delta\pmb{z}_t + \begin{bmatrix} u_{yt}\\ \pmb{\epsilon}_{xt} \end{bmatrix}\label{eq.ardl.17} \end{align} It now follows that the CECM is given by the equation: \begin{align} \Delta y_t &=\alpha_{y0} + \alpha_{y1}t - \left(\phi_{yy}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\phi}_{xy}(1)\right)y_{t-1} - \left(\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1)\right)\pmb{x}_{t-1}\notag\\ &+ \pmb{e}_1^\top\left((\pmb{I}_{k+1} - \pmb{\Psi})\widetilde{\pmb{\Phi}}^\star(L) + \pmb{\Psi}\right)\Delta\pmb{z}_t + u_{yt} \label{eq.ardl.18} \end{align} The cointegrating relationship between $y_t$ and $\pmb{x}_t$, if it exists, is of the form: $$\left(\phi_{yy}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\phi}_{xy}(1)\right)y_{t-1} + \left(\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1)\right)\pmb{x}_{t-1}$$ and the marginal ECM is summarized as: $$\Delta \pmb{x}_t = \pmb{\alpha}_{x0} + \pmb{\alpha}_{x1}t - \pmb{\phi}_{xy}(1)y_{t-1} - \pmb{\Phi}_{xx}(1)\pmb{x}_{t-1} + \pmb{e}_2^\top\left((\pmb{I}_{k+1} - \pmb{\Psi})\widetilde{\pmb{\Phi}}^\star(L) + \pmb{\Psi}\right)\Delta\pmb{z}_t + \pmb{\epsilon}_{xt}$$ where $\pmb{e}_1 = \left(1,\pmb{0}_k^\top\right)^\top$ and $\pmb{e}_2 = \left(0,\pmb{I}_k^\top\right)^\top$.
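The conditioning step can be verified with a few lines of linear algebra. The following is a sketch assuming NumPy, with a covariance matrix `omega` and a long-run matrix `phi_at_1` that are purely illustrative numbers of our own. It builds $\pmb{\Psi}$, confirms that the transformed innovations $\pmb{u}_t = (\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\epsilon}_t$ have no covariance between $u_{yt}$ and $\pmb{\epsilon}_{xt}$, and forms the conditional long-run matrix $(\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1)$.

```python
import numpy as np

# Illustrative (hypothetical) innovation covariance, ordered as (y, x1, x2).
omega = np.array([[1.0, 0.5, 0.2],
                  [0.5, 2.0, 0.3],
                  [0.2, 0.3, 1.5]])

# Illustrative (hypothetical) long-run matrix Phi(1).
phi_at_1 = np.array([[0.4, -0.8, 0.1],
                     [0.1, 0.3, 0.0],
                     [0.0, 0.2, 0.5]])

k = omega.shape[0] - 1
w_yx = omega[0, 1:]          # omega_yx
omega_xx = omega[1:, 1:]     # Omega_xx

# omega_yx Omega_xx^{-1}; solving is equivalent because Omega_xx is symmetric.
proj = np.linalg.solve(omega_xx, w_yx)

psi = np.zeros((k + 1, k + 1))
psi[0, 1:] = proj
transform = np.eye(k + 1) - psi      # I - Psi

sigma = transform @ omega @ transform.T   # covariance of u_t
cond_matrix = transform @ phi_at_1        # (I - Psi) Phi(1)

print(np.round(sigma, 6))
print(np.round(cond_matrix, 6))
```

Only the first row of $\pmb{\Phi}(1)$ is altered by the transformation; the rows belonging to the marginal model for $\pmb{x}_t$ pass through unchanged.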

It is also clear that the new cointegrating matrix is specified by $(\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1)$. Furthermore, notice that while the system wide shocks are independent across variables, by virtue of $\pmb{\phi}_{xy}(1)y_{t-1}$, there is a feedback channel from $y_{t-1}$ into $\Delta \pmb{x}_t$. Thus, while $u_{yt}$ drives $y_t$ directly, it also indirectly drives $\pmb{x}_t$. In this regard, inference on the CECM in isolation from the marginal ECM will lead to incorrect inference; see Ericsson (1992) for an excellent overview. A natural resolution, therefore, requires $\pmb{\phi}_{xy}(1) = \pmb{0}_k$. This is a critical assumption, and one we impose now.

Assumption 3:

No feedback from $y_t$ into $\pmb{x}_t$: The $k$-vector $\pmb{\phi}_{xy}(1) = \pmb{0}_k$.

Under Assumption 3, if a cointegrating relationship between $y_t$ and $\pmb{x}_t$ exists, it can only enter through the CECM equation. Since $y_t$ is a scalar, the cointegrating relationship, should it exist, is the only one under consideration, and the cointegrating matrix reduces to: \begin{align} (\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1) &= \begin{bmatrix} \phi_{yy}(1) & \pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1)\\ \pmb{0}_k & \pmb{\Phi}_{xx}(1) \end{bmatrix} \label{eq.ardl.19} \end{align} while the cointegrating relationship between $y_t$ and $\pmb{x}_t$, if it exists, becomes: $$\phi_{yy}(1)y_{t-1} + \left(\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1)\right)\pmb{x}_{t-1}$$

Relationship to ARDL

While the CECM in (\ref{eq.ardl.18}) derives from a VAR structure, the observant reader will recognize that it is in effect an ARDL model. In fact, as argued in Boswijk (2004), CECMs are special cases of their structural ECM counterparts; as such, an ARDL model can be thought of as a special case of a structural ECM. Thus, when one speaks of ARDL models in the context of cointegration, what is actually being referred to is the CECM. The relationship is made more stark by referring back to the VAR in (\ref{eq.ardl.11}). In this regard, let the lag polynomial matrix $\pmb{\eta}(L)$ satisfy $\pmb{\eta}(L)\pmb{\Phi}(L) = \pmb{\Phi}(L)\pmb{\eta}(L) = (1-L)\pmb{I}_{k+1}$, and consider the following derivations: \begin{align*} \Delta(\pmb{z}_t - \pmb{\mu} - \pmb{\gamma}t) = \pmb{\eta}(L)\pmb{\Phi}(L)(\pmb{z}_t - \pmb{\mu} - \pmb{\gamma}t) &=\pmb{\eta}(L)\pmb{\epsilon}_t\\ &=\pmb{\eta}(1)\pmb{\epsilon}_t + \widetilde{\pmb{\eta}}(L)\Delta\pmb{\epsilon}_t \end{align*} where the second line above follows from the BN decomposition of $\pmb{\eta}(L)$. Next, assuming without loss of generality that $\pmb{z}_0 = \pmb{\epsilon}_0 = \pmb{0}_{k+1}$, we can sum both sides of the equation above to derive: $$(\pmb{z}_t - \pmb{\mu} - \pmb{\gamma}t) = \pmb{\eta}(1)\sum_{i=0}^{t}\epsilon_i + \widetilde{\pmb{\eta}}(L)\pmb{\epsilon}_t$$ where the term $\sum_{i=0}^{t}\epsilon_i$ converges, after appropriate scaling, to a Brownian motion. On the other hand, recall that the CECM cointegrating matrix can be expressed as $(\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1)$. 
Thus, multiplying the expression above with this cointegrating matrix, we derive: \begin{align*} (\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1)(\pmb{z}_t - \pmb{\mu} - \pmb{\gamma}t) &= (\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1)\pmb{\eta}(1)\sum_{i=0}^{t}\epsilon_i + (\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1)\widetilde{\pmb{\eta}}(L)\pmb{\epsilon}_t\\ &= (\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1)\widetilde{\pmb{\eta}}(L)\pmb{\epsilon}_t \end{align*} where we have used the fact that $(\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1)\pmb{\eta}(1) = (\pmb{I}_{k+1} - \pmb{\Psi})(1-1)\pmb{I}_{k+1} = 0$. Assumptions 1 through 3 now guarantee that, if a cointegrating relationship exists, it must be of the form $(\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1)(\pmb{z}_t - \pmb{\mu} - \pmb{\gamma}t)$. In fact, a slightly more expressive relation emerges by rewriting the CECM as: \begin{align*} \Delta y_t &= -\phi_{yy}(1)\left(y_{t-1} - \frac{\alpha_{y0}}{\phi_{yy}(1)} - \frac{\alpha_{y1}}{\phi_{yy}(1)}t + \left(\frac{\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1)}{\phi_{yy}(1)}\right)\pmb{x}_{t-1}\right)\\ &+ \pmb{e}_1^\top\left((\pmb{I}_{k+1} - \pmb{\Psi})\widetilde{\pmb{\Phi}}^\star(L) + \pmb{\Psi}\right)\Delta\pmb{z}_t + u_{yt} \end{align*} Since the long-run equation is known to be stationary, it now readily follows that the equilibrating (cointegrating) relationship between $y_t$ and $\pmb{x}_t$ satisfies: \begin{align} y_{t} = \frac{\alpha_{y0}}{\phi_{yy}(1)} + \frac{\alpha_{y1}}{\phi_{yy}(1)}t - \left(\frac{\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1)}{\phi_{yy}(1)}\right)\pmb{x}_{t} + v_t\label{eq.ardl.20} \end{align} However, observe that the expression $(\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1)(\pmb{z}_t - \pmb{\mu} - \pmb{\gamma}t)$ is precisely the RHS of (\ref{eq.ardl.20}), whereas $(\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1)\widetilde{\pmb{\eta}}(L)\pmb{\epsilon}_t = v_t$. 
Moreover, observe that equation (\ref{eq.ardl.20}) is precisely the long-run equation one derives from the ARDL models in Pesaran and Shin (1998). More importantly, the equation is easily estimated by running OLS on the CECM (\ref{eq.ardl.18}), and deriving the long-run equation post estimation. We've outlined the procedure in Part 1 of this series.
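As a rough illustration of how the long-run equation follows from OLS on the CECM, the sketch below simulates a single regressor system and recovers the long-run intercept and slope post estimation. It assumes NumPy, and every number in it is a made-up simulation parameter: `phi_yy` plays the role of $\phi_{yy}(1)$, `pi_yx` is our shorthand for $\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1)$, and there is no trend, so the setup is in the spirit of Case III.

```python
import numpy as np

rng = np.random.default_rng(42)
T = 2000

# Hypothetical CECM parameters used only to generate data.
alpha0, phi_yy, pi_yx, psi_dx = 0.5, 0.5, -1.0, 0.3

x = np.cumsum(rng.standard_normal(T))   # x_t is a random walk, hence I(1)
u = 0.1 * rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    dy_t = (alpha0 - phi_yy * y[t - 1] - pi_yx * x[t - 1]
            + psi_dx * (x[t] - x[t - 1]) + u[t])
    y[t] = y[t - 1] + dy_t

# OLS on the CECM: regress dy_t on a constant, y_{t-1}, x_{t-1} and dx_t.
dy, dx = np.diff(y), np.diff(x)
X = np.column_stack([np.ones(T - 1), y[:-1], x[:-1], dx])
b, *_ = np.linalg.lstsq(X, dy, rcond=None)

# The level coefficients enter the CECM with a negative sign.
phi_yy_hat = -b[1]
pi_yx_hat = -b[2]

# Long-run equation: y = alpha0 / phi_yy - (pi_yx / phi_yy) x + v.
lr_intercept = b[0] / phi_yy_hat
lr_slope = -pi_yx_hat / phi_yy_hat
print(lr_intercept, lr_slope)
```

With the parameters above, the population long-run intercept is $0.5/0.5 = 1$ and the long-run slope is $-(-1)/0.5 = 2$, so the printed estimates should land near those values. Note that the ratio is only defined when the estimated coefficient on $y_{t-1}$ is non-zero, which is exactly the issue the Bounds test is designed to address.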

Inference

We also pause here to impose a fourth assumption which governs the cointegrating properties of the marginal vectors $\pmb{x}_t$, irrespective of a potential cointegrating relationship with $y_t$ in the CECM. In particular:

Assumption 4:

Marginal variables may be mutually cointegrated: The matrix $\pmb{\Phi}_{xx}(1)$ has rank $0\leq r_{x} \leq k$.

The importance of Assumption 4 lies in the flexibility of allowing $\pmb{x}_t$ to be I$(0)$ when $r_x = k$, I$(1)$ when $r_x = 0$, or mutually cointegrated whenever $0 < r_x < k$. Again, recall that the assumption is made without regard as to whether $y_t$ and $\pmb{x}_t$ are themselves cointegrated. Accordingly, we must allow for the possibility of the system cointegrating matrix $(\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1)$ to have rank $r_x$ at the very minimum. To ensure this, we note the following result from Abadir and Magnus (2005): \begin{align*} \rank\left((\pmb{I}_{k+1} - \pmb{\Psi})\pmb{\Phi}(1)\right) &= \rank\left( \begin{bmatrix} \phi_{yy}(1) & \pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1)\\ \pmb{0}_k & \pmb{\Phi}_{xx}(1) \end{bmatrix} \right)\\ &= \begin{cases} r_x \quad &\text{if} \quad \phi_{yy}(1) = 0 \quad \text{and} \quad \pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) = \pmb{0}_k^\top\\ 1 + r_x \quad &\text{otherwise} \end{cases} \end{align*} In other words:

While $\pmb{x}_t$ may or may not be cointegrated among itself, there is no cointegrating relationship between $y_t$ and $\pmb{x}_t$ if and only if $\phi_{yy}(1) = 0$ and $\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) = \pmb{0}_k^\top$.
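A small numerical example may help fix ideas. The sketch below assumes NumPy; the helper `conditional_coint_matrix`, the shorthand `pi_yx` for $\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1)$, and all numbers are hypothetical. It builds the block triangular matrix in (\ref{eq.ardl.19}) with $k=2$ and $r_x = 1$, and compares its rank under the null restriction with its rank under one generic alternative.

```python
import numpy as np

def conditional_coint_matrix(phi_yy, pi_yx, phi_xx):
    """Assemble [[phi_yy, pi_yx], [0, Phi_xx]] as a (k+1) x (k+1) array."""
    k = phi_xx.shape[0]
    top = np.concatenate([[phi_yy], pi_yx])
    bottom = np.hstack([np.zeros((k, 1)), phi_xx])
    return np.vstack([top, bottom])

# The marginal variables share one cointegrating relationship, so r_x = 1.
phi_xx = np.array([[0.5, -0.5],
                   [0.0, 0.0]])
r_x = np.linalg.matrix_rank(phi_xx)

# Null: both the y_{t-1} coefficient and the x_{t-1} coefficients are zero.
null_case = conditional_coint_matrix(0.0, np.zeros(2), phi_xx)

# One generic alternative: both sets of coefficients are non-zero.
alt_case = conditional_coint_matrix(0.4, np.array([0.3, 0.1]), phi_xx)

rank_null = np.linalg.matrix_rank(null_case)
rank_alt = np.linalg.matrix_rank(alt_case)
print(r_x, rank_null, rank_alt)
```

The alternative used here is a generic one in which the first row is linearly independent of the rows of $\pmb{\Phi}_{xx}(1)$; it is meant as an illustration of the rank statement, not as a proof of it.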

However, if this is indeed the case, the CECM reduces to: \begin{align*} \Delta y_t &= \alpha_{y0} + \alpha_{y1}t + \pmb{e}_1^\top\left((\pmb{I}_{k+1} - \pmb{\Psi})\widetilde{\pmb{\Phi}}^\star(L) + \pmb{\Psi}\right)\Delta\pmb{z}_t + u_{yt} \end{align*} Since $\Delta y_t$ is evidently a stationary process, and in the above formulation a function of stationary processes, it stands to reason that $y_t$ itself must be I$(1)$ -- in other words, while $y_t$ and $\pmb{x}_t$ are predisposed to cointegration, no cointegrating relationship exists, regardless of the cointegrating rank $r_x$ among $\pmb{x}_t$.

Thus, the null hypothesis that no cointegrating relationship between $y_t$ and $\pmb{x}_t$ exists, is: $$H_{0,F}: \quad \phi_{yy}(1) = 0 \quad \text{and} \quad \pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) = \pmb{0}_k^\top$$

Analysis of the Null Hypotheses

The test for $H_{0,F}$ proceeds by estimating the CECM coefficients using OLS and computing the usual $F$-statistic, $\tau_F$, associated with $H_{0,F}$ for the five cases governed by the deterministic assumptions in (\ref{eq.ardl.13}). Again, we've discussed the specifics in Part 1 of this series. Next, $\tau_F$ is compared to two sets of critical values: the lower bound $\xi_{L,F}$ associated with the case $\pmb{x}_t \sim \text{I}(0)$, or $r_x = k$, and the upper bound $\xi_{U,F}$, associated with the case $\pmb{x}_t \sim \text{I}(1)$, or $r_x = 0$, where $\xi_{L,F} < \xi_{U,F}$; hence the name, bounds test. Moreover, from Pesaran, Shin, and Smith (2001), critical values for $H_{0,F}$ derive from non-standard limiting distributions. Accordingly, it bears repeating that such tests reject $H_{0,F}$ whenever $\tau_F$ is greater than some critical value. In this regard we have three outcomes:
• $\tau_F < \xi_{L,F} < \xi_{U,F}$: Here we fail to reject $H_{0,F}$ when $\pmb{x}_t$ is either I$(0)$ or I$(1)$. We therefore find no evidence of a cointegrating relationship between $y_t$ and $\pmb{x}_t$.

• $\xi_{L,F} < \tau_F < \xi_{U,F}$: Here, $\xi_{L,F} < \tau_F$. Accordingly, we reject $H_{0,F}$ when $\pmb{x}_t \sim \text{I}(0)$. Nevertheless, since $\tau_F < \xi_{U,F}$, we fail to reject $H_{0,F}$ when $\pmb{x}_t \sim \text{I}(1)$. This indicates that cointegrating relationships between $y_t$ and $\pmb{x}_t$ may or may not exist for cases where $0 < r_x < k$. Accordingly, we cannot make any specific conclusions unless we know the rank of the system-wide cointegrating matrix (\ref{eq.ardl.19}).

• $\xi_{L,F} < \xi_{U,F} < \tau_F$: Here we reject $H_{0,F}$ when $\pmb{x}_t$ is either I$(0)$ or I$(1)$. Since $r_x = 0$ in this case, we know $\pmb{\Phi}_{xx}(1) = 0$. Moreover, since the maximal rank of the cointegrating matrix (\ref{eq.ardl.19}) is $r_z = 1 + r_x$, from the Abadir and Magnus (2005) result above, the remaining unity rank can arise from one of three possibilities:
• (a) $\phi_{yy}(1) = 0$ and $\pmb{\phi}_{yx}(1) \neq \pmb{0}_k^\top$, in which case the equilibrating relationship between $y_t$ and $\pmb{x}_t$ is entirely nonsensical. In fact, looking at (\ref{eq.ardl.20}), it is undefined.

• (b) $\phi_{yy}(1) \neq 0$ and $\pmb{\phi}_{yx}(1) = \pmb{0}_k^\top$, in which case the equilibrating relationship is defined but degenerate.

• (c) $\phi_{yy}(1) \neq 0$ and $\pmb{\phi}_{yx}(1) \neq \pmb{0}_k^\top$, in which case the equilibrating relationship is well defined.

This suggests an additional test for $\phi_{yy}(1) = 0$ to exclude possibility (a) above. We discuss this in greater detail in the analysis of the alternative hypothesis below.
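The mechanics of the test can be sketched in a few lines. The code below assumes NumPy and simulated data; the function names `f_statistic` and `bounds_decision` are our own, and the bounds passed to `bounds_decision` are placeholders that must be replaced with the appropriate critical values tabulated in Pesaran, Shin, and Smith (2001) for the chosen case, number of regressors, and significance level. The $F$-statistic is computed in the usual way from restricted and unrestricted OLS regressions, here with an intercept and no trend.

```python
import numpy as np

def ssr(y, X):
    """Sum of squared OLS residuals from regressing y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return float(resid @ resid)

def f_statistic(y, X_unrestricted, X_restricted):
    """Standard F-statistic for the exclusion restrictions nesting the two models."""
    q = X_unrestricted.shape[1] - X_restricted.shape[1]
    dof = len(y) - X_unrestricted.shape[1]
    ssr_u, ssr_r = ssr(y, X_unrestricted), ssr(y, X_restricted)
    return ((ssr_r - ssr_u) / q) / (ssr_u / dof)

def bounds_decision(tau_f, lower, upper):
    """Map the statistic and user-supplied bounds to one of the three outcomes."""
    if tau_f < lower:
        return "fail to reject H0"
    if tau_f > upper:
        return "reject H0"
    return "inconclusive"

# Simulated (hypothetical) cointegrated pair: y adjusts toward 1 + 2x.
rng = np.random.default_rng(7)
T = 400
x = np.cumsum(rng.standard_normal(T))
u = 0.5 * rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = (y[t - 1] + 0.5 - 0.5 * y[t - 1] + 1.0 * x[t - 1]
            + 0.3 * (x[t] - x[t - 1]) + u[t])

dy, dx = np.diff(y), np.diff(x)
const = np.ones(T - 1)
X_u = np.column_stack([const, y[:-1], x[:-1], dx])  # CECM with level terms
X_r = np.column_stack([const, dx])                  # level terms excluded under H0

tau_f = f_statistic(dy, X_u, X_r)
print(tau_f)
```

A call such as `bounds_decision(tau_f, lower, upper)` then returns one of the three outcomes listed above. The statistic itself is entirely standard; what makes the test non-standard is the distribution it is compared against, which is why the bounds cannot be taken from ordinary $F$ tables.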

Analysis of the Alternative Hypotheses

Given the discussion above, if an equilibrating relationship between $y_t$ and $\pmb{x}_t$ exists, it must reside in $H_{A,F}$, where: $$H_{A,F}: \quad \phi_{yy}(1) \neq 0 \quad \text{or} \quad \pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) \neq \pmb{0}_k^\top \quad \text{or both.}$$ In fact, $H_{A,F}$ consists of three alternative specifications, as we will show below, and only one results in a non-degenerate relationship between $y_t$ and $\pmb{x}_t$. In this regard, a non-degenerate relationship must guarantee the existence and validity of the equilibrating equation in (\ref{eq.ardl.20}). In other words, it must ensure $\phi_{yy}(1) \neq 0$, otherwise the relationship is undefined, and $\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) \neq \pmb{0}_k^\top$, otherwise the relationship between $y_t$ and $\pmb{x}_t$ in the CECM is only through $\Delta\pmb{x}_t$, and hence degenerate. We analyze the implications of these conclusions below.
• $H_{A_1,F}: \quad \phi_{yy}(1) = 0$ and $\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) \neq \pmb{0}_k^\top$.

Here, the result from Abadir and Magnus (2005) assures us that the cointegrating matrix (\ref{eq.ardl.19}) has rank $r_z = 1 + r_x$, and the CECM reduces to: $$\Delta y_t = \alpha_{y0} + \alpha_{y1}t - \left(\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1)\right)\pmb{x}_{t-1} + \pmb{e}_1^\top\left((\pmb{I}_{k+1} - \pmb{\Psi})\widetilde{\pmb{\Phi}}^\star(L) + \pmb{\Psi}\right)\Delta\pmb{z}_t + u_{yt}$$ The cointegrating relationship (\ref{eq.ardl.20}) is undefined here since $\phi_{yy}(1) = 0$. Moreover, since $\pmb{\Phi}_{xx}(1)$ is the only cointegrating matrix for $\pmb{x}_t$, it holds that $\pmb{\Phi}_{xx}(1)\pmb{x}_{t-1} \sim \text{I}(0)$, and therefore all RHS variables are I$(0)$ except possibly $\pmb{\phi}_{yx}(1)\pmb{x}_{t-1}$. However, since $\phi_{yy}(1) = 0$, $\pmb{\phi}_{yx}(1)$ is not a cointegrating matrix for $\pmb{x}_t$ and therefore $\pmb{\phi}_{yx}(1)\pmb{x}_{t-1}$ may be I$(0)$ or I$(1)$. Either way, $y_t \sim \text{I}(1)$ regardless of the cointegrating rank $r_x$.

• $H_{A_2,F}: \quad \phi_{yy}(1) \neq 0$ and $\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) = \pmb{0}_k^\top$

In this case, the CECM assumes the form: $$\Delta y_t = \alpha_{y0} + \alpha_{y1}t - \phi_{yy}(1)y_{t-1} + \left(\widetilde{\phi}^\star_{yy}(L) -\pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\widetilde{\pmb{\phi}}^\star_{xy}(L)\right)\Delta y_t +\pmb{e}_1^\top\left((\pmb{I}_{k+1} - \pmb{\Psi})\widetilde{\pmb{\Phi}}^\star(L) + \pmb{\Psi}\right)\Delta\pmb{z}_t + u_{yt}$$ In fact, the equation is a special case of the Augmented Dickey-Fuller (ADF) regression. By Assumption 1, when $\pmb{\Phi}(1) = 0$, and therefore $\phi_{yy}(1) = 0$, the vector $\pmb{z}_t$, and therefore $y_t$, has a unit root. Under the alternative, however, $\phi_{yy}(1) \neq 0$ and we know that either $y_t \sim \text{I}(0)$ whenever $\alpha_{y1} = 0$, or $y_t$ is trend stationary should $\alpha_{y1} \neq 0$. Again, this holds regardless of the cointegrating rank $r_x$. Moreover, the result from Abadir and Magnus (2005) ensures that the cointegrating matrix (\ref{eq.ardl.19}) has rank $r_z = 1 + r_x$. It is important to note here that while the idea of a cointegrating relationship between $y_t$ and $\pmb{x}_t$ is not possible, there exists a relationship between $y_t$ and $\pmb{x}_t$ originating from the short-run dynamics manifesting through $\Delta \pmb{x}_t$. Since this is not an equilibrating relationship originating from $\pmb{x}_{t-1}$, the relationship is degenerate in equilibrium.

• $H_{A_3,F}: \quad \phi_{yy}(1) \neq 0$ and $\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) \neq \pmb{0}_k^\top$

Here, the result from Abadir and Magnus (2005) guarantees that $r_z = 1 + r_x$. Moreover, Abadir and Magnus (2005) assures us that there exist $((k+1)\times r)$ matrices $\pmb{A}$ and $\pmb{B}$ such that one can write $\pmb{\Phi}(1)$ in rank factorization as follows: \begin{align*} \begin{bmatrix} \phi_{yy}(1) & \pmb{\phi}_{yx}(1)\\ \pmb{0}_k & \pmb{\Phi}_{xx}(1) \end{bmatrix} &= \begin{bmatrix} A_{yy}\\ \pmb{0}_k \end{bmatrix} \begin{bmatrix} B_{yy} & \pmb{B}^\top_{yx} \end{bmatrix} + \begin{bmatrix} \pmb{A}_{yx}\\ \pmb{A}_{xx} \end{bmatrix} \begin{bmatrix} \pmb{0}_k & \pmb{B}^\top_{xx} \end{bmatrix}\\ &= \begin{bmatrix} A_{yy}B_{yy} & A_{yy}\pmb{B}^\top_{yx} + \pmb{A}_{yx}\pmb{B}^\top_{xx}\\ \pmb{0}_k & \pmb{A}_{xx}\pmb{B}^\top_{xx} \end{bmatrix} \end{align*} Thus, $\pmb{\phi}_{yx}(1) = A_{yy}\pmb{B}^\top_{yx} + \pmb{A}_{yx}\pmb{B}^\top_{xx}$, where $\pmb{B}_{xx}^\top$ comprises the cointegrating matrix underlying $\pmb{\Phi}_{xx}(1) = \pmb{A}_{xx}\pmb{B}^\top_{xx}$ of $\pmb{x}_t$, irrespective of $y_t$. Accordingly, any equilibrating link between $y_t$ and $\pmb{x}_t$ is due to the cointegrating matrix $\pmb{B}^\top_{yx}$, and we have two possibilities.

• $\text{rank}(\pmb{B}^\top_{yx},\pmb{B}^\top_{xx}) = r_x$. In this case, the cointegrating vector $\pmb{B}^\top_{yx}$ is subsumed by $\pmb{B}^\top_{xx}$ since $\text{rank}(\pmb{\Phi}_{xx}(1)) = \text{rank}(\pmb{B}_{xx}) = r_x$. Thus, the equilibrating relationship between $y_t$ and $\pmb{x}_t$ is not due to traditional cointegration, but is valid nonetheless. Here, $y_t \sim \text{I}(0)$ since $\phi_{yy}(1) \neq 0$.

• $\text{rank}(\pmb{B}^\top_{yx},\pmb{B}^\top_{xx}) = 1 + r_x$. In this case, the cointegrating vector $\pmb{B}^\top_{yx}$ is not redundant, and drives the cointegrating link between $y_t$ and $\pmb{x}_t$. The equilibrating relationship is now of the traditional cointegration type, and therefore $y_t \sim \text{I}(1)$.

In either case, it is readily shown that the relationships which emerge are non-degenerate.
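
The two rank possibilities can be checked on a toy example. The sketch below is only illustrative: it assumes that $\text{rank}(\pmb{B}^\top_{yx},\pmb{B}^\top_{xx})$ means the rank of the two blocks stacked row-wise, and the matrices (with $k = 3$ and $r_x = 1$) are made up for the purpose. The helper `matrix_rank` is a hypothetical name, not something from the original derivation.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via exact Gaussian elimination on Fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Look for a non-zero pivot at or below the current rank position.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Made-up example: one cointegrating vector among the three x variables.
B_xx_T = [[1, -1, 0]]
B_yx_T_redundant = [[2, -2, 0]]  # a multiple of the row in B_xx_T
B_yx_T_new = [[0, 1, -1]]        # outside the row space of B_xx_T

r_x = matrix_rank(B_xx_T)
rank_redundant = matrix_rank(B_yx_T_redundant + B_xx_T)  # stacked rows
rank_new = matrix_rank(B_yx_T_new + B_xx_T)
print(r_x, rank_redundant, rank_new)
```

In the first stacked matrix the rank stays at $r_x$, matching the case where $\pmb{B}^\top_{yx}$ is subsumed by $\pmb{B}^\top_{xx}$; in the second it rises to $1 + r_x$.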
We can summarize the insight above as follows: $$\begin{array}{l|c|l|c} & \text{Specification} & \text{Conclusion} & \text{Integration Order} \\ \hline H_{0,F} & \phi_{yy}(1) = 0 \text{ and } \pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) = \pmb{0}_k^\top & \text{No equilibrating relationship.} & y_t \sim I(1)\\ &&&\\ H_{A_1,F} & \phi_{yy}(1) = 0 \text{ and } \pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) \neq \pmb{0}_k^\top & \text{Equilibrating relationship} & y_t \sim I(1)\\ & & \text{is nonsensical.} &\\ &&&\\ H_{A_2,F} & \phi_{yy}(1) \neq 0 \text{ and } \pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) = \pmb{0}_k^\top & \text{Equilibrating relationship} & y_t \sim I(0) \text{ or TS}\\ & & \text{is degenerate.} &\\ &&&\\ H_{A_3,F} & \phi_{yy}(1) \neq 0 \text{ and } \pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) \neq \pmb{0}_k^\top & \text{Equilibrating relationship} & y_t \sim I(0) \text{ or } I(1)\\ & & \text{is non-degenerate.} & \end{array}$$ An important observation emerges. Notice that if we reject the null hypothesis, it is unclear which of the three alternative hypotheses manifests. Accordingly, rejecting $H_{0,F}$ does not guarantee that a non-degenerate relationship exists, or even a degenerate one! To identify the alternative (at least partially), one requires an additional test for $H_{0,t}: \phi_{yy}(1) = 0$, although in contrast to the test for $H_{0,F}$, testing $H_{0,t}$ is only sensible for cases I, III, and V of the deterministic restrictions in (\ref{eq.ardl.13}). While the usual $t$-statistic, $\tau_t$, will suffice, like $\tau_F$, its distribution is non-standard. 
In this regard, analogous to the limiting distributions of $\tau_F$, Pesaran, Shin, and Smith (2001) also provide sets of critical values $\xi_{L,t} < \xi_{U,t}$ for $\tau_t$, where $\xi_{L,t}$ and $\xi_{U,t}$ are derived respectively for $\pmb{x}_t \sim \text{I}(1)$ and $\pmb{x}_t \sim \text{I}(0)$. Since the test has a two-sided alternative, a rejection of $H_{0,t}$ requires $\tau_t$ to be greater than the appropriate critical value, or less than the negative of said critical value. Equivalently, one rejects the null hypothesis whenever the absolute value of $\tau_t$ is greater than the absolute value of the appropriate critical value. There are therefore three possibilities to consider:
• $|\tau_t| < |\xi_{L,t}| < |\xi_{U,t}|$: As before, $\pmb{x}_t$ is either I$(0)$ or I$(1)$. Moreover, since $|\tau_t| < |\xi_{L,t}|$, we fail to reject $H_{0,t}$. Since we have already rejected $H_{0,F}$, this implies $\phi_{yy}(1) = 0$ and $\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) \neq \pmb{0}_k^\top$. We therefore conclude that $H_{A,F}$ manifests as $H_{A_1,F}$ and a nonsensical equilibrating relationship between $y_t$ and $\pmb{x}_t$ emerges.

• $|\xi_{L,t}| < |\tau_t| < |\xi_{U,t}|$: Here we reject $H_{0,t}$ when $\pmb{x}_t \sim \text{I}(0)$ but fail to do so for the case where $\pmb{x}_t \sim \text{I}(1)$ and $0 < r_x < k$. Thus, examples may emerge where the $\pmb{x}_t$ are mutually cointegrated for which we may or may not reject $H_{0,t}$. Unless we know the rank of the cointegrating matrix (\ref{eq.ardl.19}), little more can be inferred.

• $|\xi_{L,t}| < |\xi_{U,t}| < |\tau_t|$: In this case, we reject $H_{0,t}$ when $\pmb{x}_t$ is either I$(0)$ or I$(1)$, implying $\phi_{yy}(1) \neq 0$. Accordingly, unless we know that $\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1) = \pmb{0}_{k}^\top$, we can only conclude that $H_{A,F}$ manifests either as $H_{A_2,F}$ or as $H_{A_3,F}$. In either case, an equilibrating relationship emerges, albeit a degenerate one in the case of $H_{A_2,F}$.
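
The two-stage logic above (first $\tau_F$, then $\tau_t$) can be written down as a small decision routine. This is a minimal sketch rather than an implementation of any particular software: the function names are hypothetical, the bounds are compared in absolute value as in the bullets above, and the numbers in the usage lines are made up, not tabulated Pesaran, Shin, and Smith (2001) critical values.

```python
def bounds_region(stat, bound_a, bound_b):
    """Place |stat| relative to two critical-value bounds, compared in absolute value."""
    low, high = sorted((abs(bound_a), abs(bound_b)))
    magnitude = abs(stat)
    if magnitude < low:
        return "below"
    if magnitude > high:
        return "above"
    return "between"


def bounds_conclusion(tau_F, xi_F, tau_t, xi_t):
    """Combine the F- and t-tests in the order described in the text."""
    f_region = bounds_region(tau_F, *xi_F)
    if f_region == "below":
        return "fail to reject H_0,F: no equilibrating relationship"
    if f_region == "between":
        return "F-test inconclusive without the cointegrating rank"
    # H_0,F is rejected; the t-test on phi_yy(1) narrows down the alternative.
    t_region = bounds_region(tau_t, *xi_t)
    if t_region == "below":
        return "H_A1,F: nonsensical relationship"
    if t_region == "between":
        return "t-test inconclusive without the cointegrating rank"
    return "H_A2,F or H_A3,F: relationship exists, possibly degenerate"


# Made-up statistics and bounds, for illustration only.
xi_F = (3.0, 4.0)
xi_t = (-2.5, -3.5)
print(bounds_conclusion(5.2, xi_F, -4.1, xi_t))
print(bounds_conclusion(5.2, xi_F, -1.0, xi_t))
print(bounds_conclusion(2.0, xi_F, -4.1, xi_t))
```

Note that the routine never returns $H_{A_3,F}$ on its own: as discussed above, the two tests cannot separate $H_{A_2,F}$ from $H_{A_3,F}$ without further information.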

The process is visualized below:

We close with a discussion on estimating adjustment to equilibrium. Recall that in the VECM (\ref{eq.ardl.12}), $\pmb{\Phi}(1)$ not only governs the cointegrating properties among $\pmb{z}_t$, but also satisfies $\pmb{\Phi}(1)\pmb{z}_{t-1} = \pmb{A}\pmb{B}^\top\pmb{z}_{t-1}$, where $\pmb{A}$ is a measure of adjustment to equilibrium. To estimate the analogous adjustment in the CECM, one first estimates the CECM (ARDL) (\ref{eq.ardl.18}) using OLS, then proceeds to compute an estimate of the long-run equation (\ref{eq.ardl.20}) post-estimation. Let $EC_t$ denote the non-stochastic part of this equation, a variable that is typically known as the error-correction (EC) term. In other words: $$EC_t = y_t - \frac{\alpha_{y0}}{\phi_{yy}(1)} - \left(\frac{\alpha_{y1}}{\phi_{yy}(1)}\right)t + \left(\frac{\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1)}{\phi_{yy}(1)}\right)\pmb{x}_{t}$$ Next, one substitutes $EC_t$ back into the CECM in place of the theoretical long-run equation to derive: \begin{align} \Delta y_t = -\phi_{yy}(1)EC_t +\pmb{e}_1^\top\left((\pmb{I}_{k+1} - \pmb{\Psi})\widetilde{\pmb{\Phi}}^\star(L) + \pmb{\Psi}\right)\Delta\pmb{z}_t + u_{yt} \label{eq.ardl.22} \end{align} Finally, one estimates the equation above using OLS again to derive an estimate of $\phi_{yy}(1)$, which is the parameter governing the speed of adjustment to equilibrium, and is analogous to the matrix $\pmb{A}$ in the original VECM. However, since one is only reparameterizing the CECM, whatever estimate is obtained for $\phi_{yy}(1)$ in the equation above is in fact identical to the one obtained by estimating the ARDL to derive an estimate of $EC_t$ in the first place. Thus, if one is only interested in obtaining an estimate of the speed of adjustment to equilibrium, the regression above is redundant.
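
The redundancy of the second regression can be seen numerically. The sketch below is an illustration under simplifying assumptions, not the estimation routine of any package: it simulates a single-regressor model with no trend and no lagged differences, uses made-up parameter values, dates the EC term at $t-1$ to match the lagged levels in the regression, and solves OLS with a small hand-written routine (`ols` is a hypothetical helper).

```python
import random


def ols(X, y):
    """OLS coefficients from the normal equations, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented system [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * target for row, target in zip(X, y))]
        for i in range(k)
    ]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


random.seed(0)
T = 400
a0, phi, theta, psi = 1.0, 0.3, 0.6, 0.5  # made-up true parameters
x, y = [0.0], [0.0]
for _ in range(T):
    dx_t = random.gauss(0, 1)  # x_t is a random walk
    dy_t = a0 - phi * y[-1] + theta * x[-1] + psi * dx_t + random.gauss(0, 1)
    x.append(x[-1] + dx_t)
    y.append(y[-1] + dy_t)

dy = [y[t] - y[t - 1] for t in range(1, T + 1)]
dx = [x[t] - x[t - 1] for t in range(1, T + 1)]

# Step 1: regress dy_t on (1, y_{t-1}, x_{t-1}, dx_t).
X1 = [[1.0, y[t - 1], x[t - 1], dx[t - 1]] for t in range(1, T + 1)]
b0, b_y, b_x, b_dx = ols(X1, dy)

# Step 2: build the EC term from the step-1 estimates, then regress dy_t on (EC_{t-1}, dx_t).
ec_lag = [y[t - 1] + b0 / b_y + (b_x / b_y) * x[t - 1] for t in range(1, T + 1)]
X2 = [[ec_lag[i], dx[i]] for i in range(T)]
c_ec, c_dx = ols(X2, dy)

print(f"step 1 coefficient on y(t-1):  {b_y:.6f}")
print(f"step 2 coefficient on EC(t-1): {c_ec:.6f}")
```

The two printed coefficients agree up to floating-point error, because the step-2 regressors are linear combinations of the step-1 regressors and the step-1 residuals are orthogonal to all of them.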
Nevertheless, if one wishes to conduct inference on the parameter, such as a significance test, it is important to realize that the test cannot rely on the standard $t$ distribution and its $p$-values. To see this, observe that: $$\Delta EC_t = \Delta y_t - \frac{\alpha_{y1}}{\phi_{yy}(1)} + \left(\frac{\pmb{\phi}_{yx}(1) - \pmb{\omega}_{yx}\pmb{\Omega}^{-1}_{xx}\pmb{\Phi}_{xx}(1)}{\phi_{yy}(1)}\right)\Delta\pmb{x}_{t}$$ Next, substitute $EC_t$ and $\Delta EC_t$ into (\ref{eq.ardl.22}) and note that it can be shown that: $$\Delta EC_t = c_0 -c_1EC_{t-1} + c_2(L)\Delta EC_t + \pmb{c}_3(L)\Delta\pmb{x}_t + u_{yt}$$ where $c_0 = \frac{\alpha_{y0}}{\phi_{yy}(1)}$, $c_2(L)$ and $\pmb{c}_3(L)$ are lag polynomials in the coefficients of the system, and evidently, $c_1 = \phi_{yy}(1)$. Moreover, the equation is clearly a variant of the famous ADF regression, for which the OLS estimate of $c_1$ is in fact an estimate of $\phi_{yy}(1)$. Nevertheless, while one easily derives the $t$-statistic for the estimate of $c_1$, since the regression is of the ADF variety, the statistic has a non-standard limiting distribution. Accordingly, testing the null hypothesis $H_0: \phi_{yy}(1) = 0$ requires critical values and $p$-values that are in accordance with the appropriate BM distributions.
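
To get a feel for how far an ADF-type $t$-statistic departs from the standard normal, a quick Monte Carlo helps. The sketch below makes several simplifying assumptions: it uses a plain Dickey-Fuller regression with a constant only (no lagged differences and no $\Delta\pmb{x}_t$ terms), a pure random walk as the data generating process, and arbitrary choices for the sample size and number of replications. The simulated quantile is therefore not a Bounds-test critical value; it only illustrates the shift in the distribution. The slope in the code corresponds to $-c_1$ in the notation above.

```python
import random
import statistics


def df_t_stat(T, rng):
    """t-statistic on y_{t-1} in dy_t = a + b*y_{t-1} + u_t when y_t is a random walk."""
    y = [0.0]
    for _ in range(T):
        y.append(y[-1] + rng.gauss(0, 1))  # the unit-root null is true by construction
    lag = y[:-1]
    dy = [b - a for a, b in zip(y[:-1], y[1:])]
    mean_lag, mean_dy = statistics.fmean(lag), statistics.fmean(dy)
    sxx = sum((v - mean_lag) ** 2 for v in lag)
    sxy = sum((v - mean_lag) * (w - mean_dy) for v, w in zip(lag, dy))
    syy = sum((w - mean_dy) ** 2 for w in dy)
    slope = sxy / sxx
    resid_var = (syy - slope * sxy) / (T - 2)  # residual variance with 2 estimated coefficients
    return slope / (resid_var / sxx) ** 0.5


rng = random.Random(12345)
stats = sorted(df_t_stat(100, rng) for _ in range(2000))
q05 = stats[int(0.05 * len(stats))]                 # simulated 5% lower-tail quantile
normal_q05 = statistics.NormalDist().inv_cdf(0.05)  # standard normal counterpart

print(f"simulated 5% quantile:       {q05:.2f}")
print(f"standard normal 5% quantile: {normal_q05:.2f}")
```

The simulated quantile sits well to the left of the normal one, which is why relying on standard $t$ tables would reject the null far too often.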

Please stay tuned for our final blog entry in this series which will focus on implementing ARDL and the Bounds Test in EViews.

References:

Abadir, K. M. and Magnus, J. R. (2005). Matrix Algebra. Cambridge University Press.
Boswijk, H. P. (1994). Testing for an unstable root in conditional and structural error correction models. Journal of Econometrics, 63(1):37--60.
Casella, G. and Berger, R. L. (2002). Statistical Inference. Duxbury, Pacific Grove, CA.
Engle, R. F., Hendry, D. F., and Richard, J. (1983). Exogeneity. Econometrica: Journal of the Econometric Society, 277--304.
Ericsson, N. R. (1992). Cointegration, exogeneity, and policy analysis: An overview. Journal of Policy Modeling, 13(3):251--280.
Pesaran, M. H. and Shin, Y. (1998). An autoregressive distributed-lag modelling approach to cointegration analysis. Econometric Society Monographs, 31:371--413.
Pesaran, M. H., Shin, Y., and Smith, R. J. (2001). Bounds testing approaches to the analysis of level relationships. Journal of Applied Econometrics, 16(3):289--326.
Sims, C. A. (1980). Macroeconomics and reality. Econometrica: Journal of the Econometric Society, 1--48.

1. Thank you!

2. Terrific post - looking forward to the final one in this series.

1. hallo dr Giles .. Do you have any articles about panel-ARDL ... and thank you

3. Excellent one. Now waiting for that to produce empiric in Eviews :)

4. It will be great if we know a tentative release date of the final one. I am not sending my manuscript to journal without checking it.

1. Next week, or sooner.

3. It will be this week, not sooner.

5. my professor said to me that's not correcte when we use "t statistic" in ARDL model it's true !!!

6. I just want to point out that in the last flowchart, with the respect to the null hypothesis, it may be more appropriate to write "Do not reject" than "Accept".

7. Thank you very much. What about panel-ARDL?

1. We will be producing similar blog posts on theoretical topics in the future, but topics and schedule will be somewhat ad-hoc.

I'll point out that there is really little relationship between the Bounds Test use of ARDL and Panel ARDL models, other than the name, so it doesn't immediately follow that panel ARDL would be discussed simply because of these posts.

2. thank you very much sir

8. In discussion of the 3 alternative specifications that are permitted by the F test, in each case the above states "Here, the result from Abadir and Magnus (2005) assures us that the cointegrating matrix (8) has rank rz=1+rx" But PSS state that the rank of the long-run multiplier matrix Pi may be either r_x or r_x+1 under the alternative hypothesis. Specifically it is r_x if phi_yy(1)=0 and r_x+1 if phi_yy(1) is not equal to zero. How can both be true? -- Thanks

9. Dear all,
why is testing H_0,t only sensible for cases I, III, and V of the deterministic restrictions? What about case II, for example?

thanks a lot

1. Firstly, the H_0,t test is a t-test on the exclusion of y_(t-1) from the DGP under consideration. It was extensively studied in Banerjee, Dolado, Mestre (1998), and is therefore often called the BDM test.

To answer your question formally would require a huge amount of theory. However, in a nutshell, both BDM (1998) and PSS (2001) papers argue that asymptotically, the t-statistic cannot distinguish between Cases I and II, and III and IV, respectively. In other words, whether deterministic restrictions are imposed or not, the limiting distribution of the t-statistic will be invariant to these restrictions.

11. (Assumption 3)No feedback from
yt into x t
:Does that mean that if there is bi-directional causality (x->y and Y->X), ARDL can not be used? If so , there are many papers reporting two way causality (shown by Granger test as well as Vector ARDL tests) published by Science Direct that still use ARDL.

1. ARDL assumes that there is at most ONE cointegrating relationship, and hence Assumption 3: no feedback from y_t into x_t. In other words, if you suspect that there may be more than one cointegrating relationship, applying ARDL is not appropriate, and the Johansen cointegrating approach should be used. I don't know what papers you are referring to, but if they are not respecting the assumptions of the ARDL cointegrating model, their analysis will not be correct.

12. Hi, I noticed that several articles run an ARDL model in EViews 9 without knowing the cointegration rank in the first place. Is that possible? I think the cointegration rank must be known before running any model.

1. ARDL assumes that the cointegration rank (if cointegration exists) is always 1. In other words, it assumes that there exists at most a single long-run relationship.

13. hi pls when i must use Schwarz Criterion (sc) and when Akaike's Information Criterion (aic) ? and what is the difference between them ?

15. could someone tell me the difference between var, vecm, ardl concisely?

16. Dear all,

I detected heteroskedasticity in my model. I applied the HAC covariance matrix, but the problem continues. After applying the test again, neither the F-statistic nor the p-value has changed.
Any idea what is happening or how I could solve the heteroskedasticity problem?
Thanks,
Aline