Tuesday, August 8, 2017

Dumitrescu-Hurlin Panel Granger Causality Tests: A Monte Carlo Study

With data availability at its historical peak, time series panel econometrics is in the limelight. Unlike traditional panel data in which each cross section $i = 1, \ldots, N$ is associated with $t=1, \ldots, T < N$ observations, what characterizes time series panel data is that $N$ and $T$ can both be very large. Moreover, the time dimension also gives rise to temporal dynamic information and with it, the ability to test for serial correlation, unit roots, cointegration, and in this regard, also Granger causality.

Our focus in this post is on Granger causality tests, and more specifically on a popular panel version of the test proposed in Dumitrescu and Hurlin (2012) (DH). Below, we summarize Granger causality testing in the classical non-panel case, follow with a discussion of the panel version of the test, and close with our findings from a large Monte Carlo simulation replicating and extending the work of DH to cases which were not covered in the original article. In particular, our focus is on studying the impact on size and power when the regression lag order is misspecified relative to the lag order characterizing the true data generating process (DGP).

Granger Causality Tests

The idea behind Granger causality is simple. Given two temporal events, $x_t$ and $y_t$, we say $x_t$ Granger causes $y_t$, if past information in $x_t$ uniquely contributes to future information in $y_t$. In other words, information in $\left\{ x_{t-1}, x_{t-2}, \ldots \right\}$ has predictive power for $y_t$, and knowing both $\left\{ x_{t-1}, x_{t-2}, \ldots \right\}$ and $\left\{ y_{t-1}, y_{t-2}, \ldots \right\}$ together, yields better forecasts of $y_t$ than knowing $\left\{ y_{t-1}, y_{t-2}, \ldots \right\}$ alone.

In the context of classical, non-panel data, testing whether $x_t$ Granger causes $y_t$ reduces to testing parameter significance on the lagged values of $x_t$ in the regression: \begin{align} y_t = c + \gamma_1 y_{t-1} + \gamma_2 y_{t-2} + \cdots + \gamma_p y_{t-p} + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \cdots + \beta_p x_{t-p} + \epsilon_t \label{eq.1} \end{align} where $p \geq 1$, $\epsilon_t$ satisfies the classical assumption of being independent and identically distributed, the roots of the characteristic equation $1 - \gamma_1r - \gamma_2r^2 - \cdots - \gamma_p r^p = 0$ lie outside the unit circle so that $y_t$ is stationary, and $x_t$ is itself stationary. In other words, we have the following null and alternative hypotheses: \begin{align*} H_0: \quad &\forall k\geq 1, \quad \beta_k = 0; \quad \text{$x_t$ does not Granger cause $y_t$.}\\ H_A: \quad &\exists k\geq 1, \quad \beta_k \neq 0; \quad \text{$x_t$ does Granger cause $y_t$.} \end{align*}

Because the traditional Granger causality test is only valid for stationary series, we digress briefly to caution on cases where $x_t$ and $y_t$ may be non-stationary. Whenever at least one variable in the regression above is not stationary, the traditional approach is no longer valid, and one must resort to the approach of Toda and Yamamoto (1995). In this regard, we also emphasize that unlike non-stationary but non-cointegrated variables, which may or may not exhibit Granger causality, cointegrated variables necessarily exhibit Granger causality in at least one direction, and possibly both. Since our friend Dave Giles has exceptional posts on the subjects here, here, and here, we will not delve further and urge interested readers to refer to the material in these posts.
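
Returning to the stationary case, the joint restriction under $H_0$ is typically tested with a Wald statistic. The following is a sketch in matrix notation rather than a description of any particular software package; the symbols $Z$, $\theta$, $R$, $\hat{\sigma}^2$, and the sample size $T$ are introduced here purely for illustration. Let $Z$ collect the $T-p$ usable rows $(1, y_{t-1}, \ldots, y_{t-p}, x_{t-1}, \ldots, x_{t-p})$, let $\hat{\theta} = (Z^\top Z)^{-1}Z^\top y$ be the OLS estimate of all $2p+1$ coefficients, and let $R = [\,0_{p\times(p+1)} \;\; I_p\,]$ select the coefficients on lagged $x_t$, so that $R\hat{\theta} = \hat{\beta}$. Then: \begin{align*} W = \frac{\hat{\beta}^\top \left[ R (Z^\top Z)^{-1} R^\top \right]^{-1} \hat{\beta}}{\hat{\sigma}^2}, \qquad \hat{\sigma}^2 = \frac{\hat{\epsilon}^\top \hat{\epsilon}}{(T-p) - (2p+1)} \end{align*} Under $H_0$ and the assumptions above, $W$ is asymptotically $\chi^2(p)$, so large values of $W$ are evidence that $x_t$ Granger causes $y_t$. Dividing the residual sum of squares by $T-p$ instead is an equally common convention and is asymptotically equivalent.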

Dumitrescu-Hurlin Test: Panel Granger Causality Test

Recall that time series panel data associates a cross-section $i=1, \ldots, N$ with each time observation $t=1,\ldots, T$. In this regard, a natural extension of the Granger causality regression (\ref{eq.1}) to cross-sectional information would assume the form: \begin{align} y_{i,t} = c_i + \gamma_{i,1} y_{i,t-1} + \gamma_{i,2} y_{i,t-2} + \cdots + \gamma_{i,p} y_{i,t-p} + \beta_{i,1} x_{i,t-1} + \beta_{i,2} x_{i,t-2} + \cdots + \beta_{i,p} x_{i,t-p} + \epsilon_{i,t} \label{eq.2} \end{align} where now, we require the roots of the characteristic equations $1 - \gamma_{i,1}r_i - \gamma_{i,2}r_i^2 - \cdots - \gamma_{i,p} r_i^p = 0$ to lie outside the unit circle for all $i=1,\ldots, N$, in addition to requiring stationarity of $x_{i,t}$ for all $i$. Moreover, we assume $\epsilon_{i,t}$ are independent and normally distributed across both $i$ and $t$; namely, $E(\epsilon_{i,t})=0$, $E(\epsilon_{i,t}^2)=\sigma_i^2$, and $E(\epsilon_{i,t}\epsilon_{j,s}) = 0$ whenever $i\neq j$ or $s\neq t$. In other words, we exclude the possibility of cross-sectional dependence and of serial correlation across $t$. While these assumptions are restrictive, the theory for relaxing them is still under development, so we restrict ourselves to the aforementioned specification.

At this point, it is instructive to reflect on what the presence and absence of Granger causality in panel data actually mean. The absence of Granger causality is as simple as requiring non-causality across all cross-sections simultaneously, namely: $$H_0: \quad \text{$\forall k\geq 1$ and $\forall i$,} \quad \beta_{i,k} = 0; \quad \text{$x_{i,t}$ does not Granger cause $y_{i,t}$, } \forall i$$ The alternative hypothesis, namely the presence of Granger causality, is more involved. Are we to assume that the presence of Granger causality implies causality across all cross-sections simultaneously, namely, $$H_{A_1}: \quad \text{$\forall k\geq 1$, and $\forall i$,} \quad \beta_{i,k} \neq 0; \quad \text{$x_{i,t}$ does Granger cause $y_{i,t}$, } \forall i$$ or are we to hypothesize the presence of Granger causality as causality that is present for some proportion of the cross-sectional structure; in other words: \begin{align*} H_{A_2}: &\quad \text{$\forall k\geq 1$ and $\forall i=1, \ldots, N_1$,} \quad \beta_{i,k} = 0; \quad \text{$x_{i,t}$ does not Granger cause $y_{i,t}$, } \forall i \leq N_1\\ &\quad \text{$\forall i=N_1+1, \ldots, N$, $\exists k\geq 1$,} \quad \beta_{i,k} \neq 0; \quad \text{$x_{i,t}$ Granger causes $y_{i,t}$ for $i>N_1$.} \end{align*} where $0\leq N_1/N < 1$. Since $H_{A_1}$ is evidently restrictive, we focus here on $H_{A_2}$. The theory for a panel Granger causality test in which $H_0$ is contrasted with $H_{A_2}$ is the foundation of the popular work of Dumitrescu and Hurlin (2012). In fact, the approach taken follows closely the work of Im, Pesaran, and Shin (2003) for panel unit root tests in heterogeneous panels. Estimation proceeds in three steps:
  1. For each $i$ and $t=1, \ldots, T$, estimate the regression in (\ref{eq.2}) using standard OLS.

  2. For each $i$, using the estimates in Step 1, conduct a Wald test for the hypothesis $\beta_{i,k}=0$ for all $k=1, \ldots, p$, and save this value as $W_{i,T}$.

  3. Using the $N$ statistics $W_{i,T}$ from Step 2, form the aggregate panel version of the statistic as: \begin{align} W_{N,T} = \frac{1}{N}\sum_{i=1}^{N}W_{i,T} \label{eq.3} \end{align}

It is important to remark here that in steps 1 and 2, although one may observe $t=1, \ldots, T$ values for $x_{i,t}$ and $y_{i,t}$, the autoregressive nature of the regression means the effective sample size is always $T-p$, since one needs $p$ initializing values for each of the variables.
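
The three steps above can be sketched in a few lines of Python with NumPy. This is a minimal illustration and not the EViews implementation: the function names `lagged_design`, `individual_wald` and `panel_wald` are hypothetical, and dividing the residual sum of squares by $(T-p)-(2p+1)$ is an assumed convention for the error variance.

```python
import numpy as np

def lagged_design(y, x, p):
    """Return (y_t, Z) where each row of Z is [1, y_{t-1..t-p}, x_{t-1..t-p}]."""
    T = len(y)
    cols = [np.ones(T - p)]
    cols += [y[p - j:T - j] for j in range(1, p + 1)]  # lags of y
    cols += [x[p - j:T - j] for j in range(1, p + 1)]  # lags of x
    return y[p:], np.column_stack(cols)

def individual_wald(y, x, p):
    """Steps 1-2: OLS for one cross-section, then the Wald statistic
    for beta_1 = ... = beta_p = 0."""
    yy, Z = lagged_design(y, x, p)
    n_obs, n_par = Z.shape                    # (T - p, 2p + 1)
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    theta = ZtZ_inv @ (Z.T @ yy)              # OLS estimates
    resid = yy - Z @ theta
    sigma2 = resid @ resid / (n_obs - n_par)  # assumed d.o.f.-corrected variance
    beta = theta[p + 1:]                      # coefficients on lagged x
    cov_beta = sigma2 * ZtZ_inv[p + 1:, p + 1:]
    return float(beta @ np.linalg.solve(cov_beta, beta))

def panel_wald(Y, X, p):
    """Step 3: average the N individual statistics. Y and X have shape (N, T)."""
    W_i = np.array([individual_wald(y, x, p) for y, x in zip(Y, X)])
    return W_i.mean(), W_i

# Illustrative data (assumed DGP): x Granger causes y in every cross-section.
rng = np.random.default_rng(42)
N, T, p = 5, 60, 1
X = rng.standard_normal((N, T))
Y = np.zeros((N, T))
for t in range(1, T):
    Y[:, t] = 0.5 * Y[:, t - 1] + 2.0 * X[:, t - 1] + rng.standard_normal(N)

W_bar, W_i = panel_wald(Y, X, p)
print("W_NT =", W_bar)
print("individual W_iT =", W_i)
```

Note how `lagged_design` drops the first $p$ observations, which is exactly the loss of effective sample size described above.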

Given the test statistic (\ref{eq.3}), DH derive its limiting distribution when $T\longrightarrow \infty$ followed by $N\longrightarrow \infty$, denoted as $T,N \longrightarrow \infty$, in addition to the case where $N\longrightarrow \infty$ with $T$ fixed. Writing $K$ for the number of lags in the regression ($K=p$ in the notation above), the results are summarized below: \begin{align*} Z_{N,T} &= \sqrt{\frac{N}{2K}} \left(W_{N,T} - K\right) \quad \overset{d}{\underset{T,N \rightarrow \infty}\longrightarrow} \quad N(0,1)\\ \widetilde{Z}_{N} &= \sqrt{\frac{N(T-3K-5)}{2K(T-2K-3)}} \left(\left(\frac{T-3K-3}{T-3K-1}\right)W_{N,T} - K\right) \quad \overset{d}{\underset{N \rightarrow \infty}\longrightarrow} \quad N(0,1) \end{align*} provided $T > 5 + 3K$, a necessary condition for the validity of the results. This condition is stricter than requiring more observations than parameters in the Step 1 regression ($T-K > 2K+1$); it also keeps the factor $T-3K-5$, which appears in the variance used for the standardization, strictly positive.

In either case, the results follow from classical statistical concepts and central limit theorems (CLT). In particular, in the case where $T,N \longrightarrow \infty$, observe that $W_{i,T} \overset{d}{\underset{T \rightarrow \infty}\longrightarrow} \chi^2(K)$ for every $i$. Accordingly, one is left with $N$ independent and identically distributed random variables, each with mean $K$ and variance $2K$. Thus, the classical Lindeberg-Lévy CLT applies, and the first limiting result follows. For the second case, DH demonstrate that when $T$ is fixed, $W_{i,T}$ represent $N$ independent random variables but each has mean $\frac{K(T-3K-1)}{T-3K-3}$ and variance $\frac{2K(T-3K-1)^2(T-2K-3)}{(T-3K-3)^2(T-3K-5)}$, and so they are not identically distributed. In this case, one can invoke the Lyapunov CLT, and the second result follows. Of course, it follows readily that as $T\longrightarrow \infty$, both limiting results coincide. We refer interested readers to the original DH article for details.
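
As a small numerical companion to the two limiting results, the standardizations can be written out directly. This is a sketch that simply transcribes the formulas above; the names `dh_standardize` and `upper_tail_pvalue` are hypothetical, and the use of an upper-tail $p$-value (rejecting for large values of the statistic) is an assumption that may differ from what a given software package reports.

```python
import math

def dh_standardize(W_bar, N, T, K):
    """Return (Z_NT, Z_tilde_N) for the average Wald statistic W_bar.

    N: number of cross-sections, T: observed time periods, K: regression lags.
    """
    if T <= 5 + 3 * K:
        raise ValueError("the fixed-T standardization requires T > 5 + 3K")
    # Asymptotic version: W_iT treated as chi-square(K), mean K, variance 2K.
    z_asym = math.sqrt(N / (2 * K)) * (W_bar - K)
    # Fixed-T version: uses the finite-sample mean and variance of W_iT.
    scale = math.sqrt(N * (T - 3 * K - 5) / (2 * K * (T - 2 * K - 3)))
    z_tilde = scale * ((T - 3 * K - 3) / (T - 3 * K - 1) * W_bar - K)
    return z_asym, z_tilde

def upper_tail_pvalue(z):
    """P(N(0,1) > z); assumes the test rejects for large values."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Illustrative inputs (not taken from any real data set).
z_asym, z_tilde = dh_standardize(W_bar=3.0, N=25, T=20, K=1)
print("Z_NT      =", z_asym, "p =", upper_tail_pvalue(z_asym))
print("Z_tilde_N =", z_tilde, "p =", upper_tail_pvalue(z_tilde))
```

For small $T$ the two statistics can differ noticeably for the same $W_{N,T}$, while for very large $T$ they become practically indistinguishable, in line with the remark that both limiting results coincide as $T\longrightarrow \infty$.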

EViews has allowed estimation of the Dumitrescu-Hurlin test as a built-in procedure since EViews 8. Dumitrescu and Hurlin have also made available a set of Matlab routines to perform their test, along with a companion website. In recent months, a Stata ado file allowing estimation of the test has also been made available. It should be noted that due to slight calculation errors in the original Matlab and Stata code, EViews results did not always match those given by Matlab and Stata. Those mistakes have since been fixed by the respective authors, and now both Matlab and Stata match the results produced in EViews.

In EViews, the test is virtually instant. Proceeding from an EViews workfile with a panel structure, open two variables, say $x_t$ and $y_t$ as a group, proceed to View/Granger Causality, select Dumitrescu Hurlin, specify the number of lags to use, namely, set $p$, and hit OK.



The output will look something like this.



In particular, EViews presents the global panel statistic $W_{N,T}$ as W-Stat, the standardized statistic $\widetilde{Z}_{N}$ as Zbar-Stat, and corresponding $p$-values based on the N$(0,1)$ limiting distribution presented in case two earlier. Notice that EViews does not present the asymptotic result $Z_{N,T}$. This is a conscious decision since we will show below that in almost all circumstances of interest, the version in which $T$ remains fixed tends to outperform the one in which $T\longrightarrow \infty$, except for very large $T$.

Dumitrescu-Hurlin Test: Monte Carlo Study

We close our post with findings from our extensive Monte Carlo study of the Dumitrescu and Hurlin (2012) panel Granger causality test. Although the authors conducted a simulation study of their own, we were disappointed that more emphasis was not placed on the impact of incorrectly specifying the lag order $p$ in the Granger causality regression (\ref{eq.2}). In this regard, we wrote an EViews program to study both size and power under the following configurations:
  • Monte Carlo replications: $5000$

  • Sample sizes considered: $T=11,20,50,100,250$

  • Cross-sections considered: $N=1,5,10,25,50$

  • Regression lags considered: $p=1, \ldots, 7$

  • Hypothesis configurations (includes $H_0$, at $N_1/N = 1$): $N_1/N = 0, 0.25, 0.50, 0.75, 1$

  • Statistics Used: $Z_{N,T}$ and $\widetilde{Z}_{N}$

The study uses the same Monte Carlo framework proposed in Dumitrescu and Hurlin (2012). In particular, data is generated according to $H_0$ and $H_{A_2}$ for the regression equation (\ref{eq.2}), followed by estimation in which lag specifications may or may not coincide with the lag structure underlying the true DGP. Moreover, whereas each of the configurations above is available from the study, we isolate a few scenarios to illustrate our main findings:
  • First, both size and power drastically improve with increased sample size $T$, for all possible configurations. This effect is evidently more pronounced using the asymptotic statistic $Z_{N,T}$ since $\widetilde{Z}_{N}$ a priori accounts for the finiteness of $T$.





  • Second, for each lag selection $p$ and cross-section specification $N$ (with the exception of $N=1$), size improves as $N$ decreases, whereas power improves as $N$ increases. The improvement in power due to increasing $N$ can, however, be drastically more pronounced and more varied than the accompanying deterioration in size. With the $\widetilde{Z}_N$ statistic, the effect on size is much less pronounced, and the effect on power much more pronounced.





  • Lastly, the sensitivity of the test to misspecification of the regression lag length $p$ can be severe! In fact, our results show that size distortion is smallest with $p=1$, regardless of what the true underlying DGP is. While particularly evident in the case of the $Z_{N,T}$ statistic, the effect is somewhat less pronounced for the $\widetilde{Z}_N$ version of the test. In contrast, the test can be grossly underpowered whenever the regression lag $p$ deviates from the lag structure characterizing the true DGP. In particular, if $k$ is the number of lags in the true DGP, and $p$ is the number of regression lags selected, the test is severely underpowered for all $p < k$ and improves as $p$ approaches $k$. If $p > k$, on the other hand, the loss of power is not nearly as severe, and is virtually unnoticeable.





The general takeaway is this: the Dumitrescu and Hurlin (2012) test achieves best size when regression lags $p$ are smallest (regardless of the underlying true AR structure), whereas it achieves best power when $p$ matches the true AR structure, and the penalty for underspecifying $p$ can be severe. This trade-off between selecting lower regression lags for size and higher ones for power evidently calls for theoretical or practical guidance on correctly identifying the regression lags to be used in testing. Although Dumitrescu and Hurlin (2012) offer no such suggestion in their own paper, it is not difficult to see the potential of model selection criteria to mitigate the issue. Choosing the correct method of model selection is itself potentially problematic, and further simulation work demonstrating the appropriate method is recommended.
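
To make the lag misspecification effect concrete, here is a deliberately stylized Python sketch of a single cell of such an experiment. It is not the EViews program used for the study, and the DGP is an assumption chosen for illustration rather than the DH design: $x_{i,t}$ is i.i.d. standard normal and $y_{i,t} = 0.3\, y_{i,t-1} + \beta_2 x_{i,t-2} + \epsilon_{i,t}$ for every cross-section, so the true lag order is 2 and causality runs only through the second lag. The function names, the 200 replications, the burn-in length and the one-sided 5% critical value of 1.645 are likewise illustrative choices.

```python
import numpy as np

def wald_i(y, x, p):
    """Wald statistic for beta_1 = ... = beta_p = 0 in one cross-section."""
    T = len(y)
    Z = np.column_stack([np.ones(T - p)]
                        + [y[p - j:T - j] for j in range(1, p + 1)]
                        + [x[p - j:T - j] for j in range(1, p + 1)])
    yy = y[p:]
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    theta = ZtZ_inv @ (Z.T @ yy)
    e = yy - Z @ theta
    s2 = e @ e / (Z.shape[0] - Z.shape[1])    # assumed d.o.f. correction
    b = theta[p + 1:]
    return float(b @ np.linalg.solve(s2 * ZtZ_inv[p + 1:, p + 1:], b))

def z_tilde(W_bar, N, T, K):
    """Fixed-T standardized statistic, transcribed from the formula in the text."""
    scale = np.sqrt(N * (T - 3 * K - 5) / (2 * K * (T - 2 * K - 3)))
    return scale * ((T - 3 * K - 3) / (T - 3 * K - 1) * W_bar - K)

def simulate_unit(T, beta2, rng, burn=50):
    """Assumed DGP: y_t = 0.3 y_{t-1} + beta2 x_{t-2} + e_t, with x_t i.i.d. N(0,1)."""
    x = rng.standard_normal(T + burn)
    e = rng.standard_normal(T + burn)
    y = np.zeros(T + burn)
    for t in range(2, T + burn):
        y[t] = 0.3 * y[t - 1] + beta2 * x[t - 2] + e[t]
    return y[burn:], x[burn:]                 # discard burn-in observations

def rejection_rate(N, T, p, beta2, reps=200, crit=1.645, seed=0):
    """Share of replications in which Z_tilde exceeds the critical value."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        W_bar = np.mean([wald_i(*simulate_unit(T, beta2, rng), p)
                         for _ in range(N)])
        rejections += int(z_tilde(W_bar, N, T, p) > crit)
    return rejections / reps

if __name__ == "__main__":
    print("size,  p=1:", rejection_rate(N=10, T=50, p=1, beta2=0.0))
    for p in (1, 2, 3):
        print(f"power, p={p}:", rejection_rate(N=10, T=50, p=p, beta2=0.5))
```

Setting `beta2=0.0` gives an empirical size, while comparing `p=1` against `p=2` and `p=3` with `beta2=0.5` gives a feel for how much power is lost when $p$ falls below the true lag order relative to overshooting it.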

If you would like to conduct your own simulations, you can find the entire code (mostly commented), here.
