*Author and guest blog by Michael Anthonisz, Queensland Treasury Corporation.*

*In this blog post, Michael demonstrates the use of MIDAS in EViews to nowcast Australian GDP growth on a daily basis.*

*"Nowcasts" are forecasts of the here and now ("now" + "forecast" = "nowcast"). They are forecasts of the present, the near future or the recent past. Specifically, nowcasts allow for real-time tracking or forecasting of a lower frequency variable based on other series which are released at a similar or higher frequency.*

For example, one could forecast the current-quarter GDP release using a combination of daily, weekly, monthly and quarterly data. The nowcast could then be updated on a daily basis, the highest frequency of the explanatory data, as new releases for the series used to explain GDP come in.

The ability to update one's forecast incrementally in real-time in response to incoming information is an attractive feature of nowcasting models. Forecasting in this manner will lower the likelihood of one's forecasts becoming "stale". Indeed, nowcasts have been found to be more accurate:

- at short-term horizons.
- as the period of interest (eg, the current quarter) goes on.
- than traditional forecasting approaches at these horizons.

- they also perform similarly to private sector forecasters who are able to also incorporate information in real-time.
- there are mixed findings as to relative gains from including high frequency financial data.
- "soft data"^{1} is most useful early on in the nowcasting cycle and "hard data"^{2} is of more use later on.

- Bayesian vector autoregressions (for example, Bok et al 2017).
- Factor-augmented autoregressive models (for example, Grui & Lysenko, 2017).
- Mixed Frequency VARs (for example, Giannone, Reichlin & Small, 2008).
- MIDAS (Mixed Data Sampling) (for example, Clements & Galvao, 2007).
- Accounting-based tracking models^{3} (for example, Higgins, 2014).
- Bridge equations^{4} (for example, Ferrara & Simoni, 2018).

^{5}. MIDAS models are well suited to the nowcasting problem which, at its essence, is about how to use explanatory variables released at different frequencies to explain the dependent variable^{6}.
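To make the idea of a distributed lag polynomial a little more concrete, here is a minimal Python sketch (not EViews code) of Almon/PDL-style weighting, one of the polynomial structures commonly used in MIDAS. The function names and the `theta` values are hypothetical and chosen purely for illustration, and the exact parameterization EViews uses may differ in detail. The point is only that a handful of parameters can govern the coefficients on all 45 daily lags.

```python
def almon_weights(theta, n_lags):
    """Return lag coefficients beta_j = sum_p theta[p] * j**p for j = 0..n_lags-1."""
    return [sum(t * j ** p for p, t in enumerate(theta)) for j in range(n_lags)]


def midas_term(daily_values, theta):
    """Weighted sum of daily lags; daily_values[0] is the most recent day."""
    weights = almon_weights(theta, len(daily_values))
    return sum(w * x for w, x in zip(weights, daily_values))


# Hypothetical parameters (an assumption, not estimates from the post):
# three numbers determine all 45 daily lag coefficients.
theta = [0.05, -0.002, 0.00002]
weights = almon_weights(theta, 45)
print(len(weights), weights[0], weights[44])
```

Estimating three polynomial parameters rather than 45 free coefficients per daily series is what keeps the specification parsimonious.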

^{7}. This allows us to extract a common trend from a large number of series. Using this approach will enable us to cut down on "noise" and hopefully use more "signal" to estimate GDP.

- Create separate tabs in the workfile which correspond to the different frequencies of underlying data you are using.
- Import the underlying data and normalize it to Z-score form (that is, mean of zero and variance of one) before running the PCA.
- Have the common factors created from the PCA appear on the relevant tab in the workfile^{8}.
- Clean the data to get rid of any NA values for data that has not yet been published^{9}.
- Re-run the PCA to reflect that you now have data for the underlying series for the full sample period.
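For readers who want to see the mechanics outside EViews, the steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the workflow used in the post: the indicator names and values are made up, NAs are filled by carrying the last observed value forward, and a simple power iteration stands in for EViews' principal components routine to extract only the first common factor.

```python
import math


def fill_previous(series):
    """Replace each None (an NA) with the most recent observed value."""
    filled, last = [], None
    for value in series:
        if value is None:
            value = last  # stays None if the series starts with an NA
        filled.append(value)
        last = value
    return filled


def zscore(series):
    """Rescale a series to mean zero and (population) variance one."""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return [(v - mean) / sd for v in series]


def first_factor(columns, n_iter=500):
    """Loadings and scores of the first principal component (power iteration)."""
    k, n = len(columns), len(columns[0])
    # Correlation matrix, since the columns are already z-scored.
    corr = [[sum(a * b for a, b in zip(ci, cj)) / n for cj in columns]
            for ci in columns]
    # Starting vector; assumed not orthogonal to the leading eigenvector.
    vec = [1.0] + [0.0] * (k - 1)
    for _ in range(n_iter):
        new = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    scores = [sum(vec[i] * columns[i][t] for i in range(k)) for t in range(n)]
    return vec, scores


# Hypothetical indicators; the last value of indicator_a is not yet published.
raw = {
    "indicator_a": [1.0, 2.0, 3.0, 4.0, 5.0, None],
    "indicator_b": [2.1, 3.9, 6.2, 8.1, 9.8, 12.0],
    "indicator_c": [5.0, 4.2, 3.1, 2.0, 1.1, 0.0],
}
columns = [zscore(fill_previous(values)) for values in raw.values()]
loadings, scores = first_factor(columns)
print([round(w, 3) for w in loadings])
```

The `scores` list plays the role of the common factor series that would sit on the relevant workfile tab.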

- Let's imagine it is currently 1 July 2018.
- We’re interested in forecasting Q3 2018 GDP using one period lags of GDP and the common factors estimated earlier via PCA. These are quarterly representations of conditions with respect to labour markets and capital investment as well as measures of current and future economic activity. We’ll also use bond yields and the trade-weighted exchange rate, both of which are available on a daily basis.
- In our MIDAS model, quarterly GDP is the dependent variable and the aforementioned other variables are independent variables. The model is estimated using historical data from Q2 1993 until Q2 2018 (as it is 1 July we have data to 30 June).
- As we want to forecast Q3, and have data on our daily variables until the end of Q2 2018, we can specify the equation so that each quarter’s GDP growth is a function of the previous quarter’s outcomes for the quarterly variables and of (say) the last 45 days’ worth of values for bond yields and the exchange rate, ending on the last day of the previous quarter.
- Having estimated the model, we can use the 45 daily values for bond yields and the exchange rate from May to June 2018 to forecast Q3 GDP.
- Now, assume the calendar has turned over and it is now 2 July 2018. We have one more observation for the daily series. We can update the forecast of GDP by estimating a new model on historical data that used 44 days from the previous quarter and the first day from the current quarter, and then forecast Q3 GDP.
- Then, assume it is 3 July 2018. We can now update our forecast by estimating on 43 days of the previous quarter and the first 2 days from the current quarter. And so on.
- We will end up with a forecast of quarterly GDP that is updated daily. That doesn't make it a forecast of daily GDP as it is a quarterly variable. We're just able to forecast it using current (now) data and update this forecast continuously on a daily basis.
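The way the 45-day window slides forward can be sketched in a few lines of Python. This is only an illustration of the window composition, not of the estimation itself. It assumes a plain Monday-to-Friday calendar with no holidays, and all function names are hypothetical. Because the dates in the example above are stylized (1 July 2018 was a Sunday), the sketch keys on the date of the last available daily observation rather than on "today".

```python
from datetime import date, timedelta


def business_days(start, end):
    """All Monday-to-Friday dates from start to end inclusive (holidays ignored)."""
    days, d = [], start
    while d <= end:
        if d.weekday() < 5:
            days.append(d)
        d += timedelta(days=1)
    return days


def quarter_of(d):
    """Return (year, quarter number) for a date."""
    return (d.year, (d.month - 1) // 3 + 1)


def lag_window(last_obs, n_lags=45):
    """The n_lags most recent business days ending at last_obs."""
    # Looking back 3 * n_lags calendar days always covers more than n_lags weekdays.
    candidates = business_days(last_obs - timedelta(days=3 * n_lags), last_obs)
    return candidates[-n_lags:]


def split_by_quarter(window, target_quarter):
    """Count window days before the target quarter and inside it."""
    current = sum(1 for d in window if quarter_of(d) == target_quarter)
    return len(window) - current, current


target = (2018, 3)  # Q3 2018
for last_obs in (date(2018, 6, 29), date(2018, 7, 2), date(2018, 7, 3)):
    print(last_obs, split_by_quarter(lag_window(last_obs), target))
# 2018-06-29 (45, 0)
# 2018-07-02 (44, 1)
# 2018-07-03 (43, 2)
```

Each new daily observation pushes one day from the previous quarter out of the window and brings one day from the current quarter in, which is why the model is re-estimated with a shifted lag alignment before each updated forecast.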

Figure 1: Independent variables used in MIDAS estimation

- gdp_q_trend_3m_chg = quarterly change in the trend measure of Australian GDP.
- gdp_q_trend_3m_chg(-1) = one quarter lag of the quarterly change in the trend measure of Australian GDP.
- activity_current(-1) = one quarter lag of a PCA derived factor representing current economic activity in Australia.
- activity_leading(-1) = one quarter lag of a PCA derived factor representing future economic activity in Australia.
- investment(-1) = one quarter lag of a PCA derived factor representing capital investment in Australia.
- labour_market(-1) = one quarter lag of a PCA derived factor representing labour market conditions in Australia.
- au_midas_daily\atwi_final(-1) = the one-day lag of the trade-weighted Australian dollar, where this series is located on a workfile page with a daily frequency.
- au_midas_daily\gacgb3_final(-1) = the one-day lag of the three-year Australian sovereign bond yield, where this series is located on a workfile page with a daily frequency.

Figure 2: Estimation specification

Figure 3: Estimation output

- 2018Q2 for our dependent variable.
- 2018Q1 for our quarterly independent variables (since they are all lagged one period).
- May 30th for our daily independent variables (a one day lag from the last day of Q2). Also note that since we are using 45 daily periods for each quarter, the 2018Q2 data point is estimated using data from March 29th - May 30th (we are dealing with regular 5-day data).

Figure 4: Forecast dialog

- 2018Q2 for our quarterly independent variables (since they are all lagged one period).
- July 30th 2018 to September 28th 2018 for our daily independent variables (45 days ending on the last business day of Q3 2018; September 29th and 30th fall on a weekend, so they are not included in our workfile).
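The 45-day windows quoted above are easy to check with a short Python calculation. This sketch assumes a plain Monday-to-Friday calendar with no holidays removed, which is my reading of "regular 5-day data" rather than something stated explicitly; the helper name is hypothetical.

```python
from datetime import date, timedelta


def weekdays_between(start, end):
    """Count Monday-to-Friday dates from start to end inclusive (holidays ignored)."""
    n_days = (end - start).days + 1
    return sum(1 for i in range(n_days)
               if (start + timedelta(days=i)).weekday() < 5)


# Estimation window quoted for the 2018Q2 data point.
print(weekdays_between(date(2018, 3, 29), date(2018, 5, 30)))  # 45
# Forecast window quoted for Q3 2018.
print(weekdays_between(date(2018, 7, 30), date(2018, 9, 28)))  # 45
# 29 and 30 September 2018 are a Saturday (5) and a Sunday (6).
print(date(2018, 9, 29).weekday(), date(2018, 9, 30).weekday())  # 5 6
```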

Figure 5: Daily updated forecast of Australian GDP Trend

**1** Such as consumer or business surveys.

**2** Such as retail spending, housing or labour market data.

**3** As GDP, for example, is essentially an accounting identity that represents the sum of different income, expenditure or production measures, it can be calculated using a ‘bottom-up’ approach in which series that proxy for the various components of GDP are used to construct an estimate of it using an accounting-type approach.

**4** Bridge equations are regressions which relate low frequency variables (eg, quarterly GDP) to higher frequency variables (eg, the unemployment rate), where the higher frequency observations are aggregated to the quarterly frequency. It is often the case that some but not all of the higher frequency variables are available at the end of the quarter of interest. Therefore, the monthly variables which aren’t yet available are forecast using auxiliary models (eg, ARIMA).

**5** Papers using a daily frequency in mixed frequency regression analyses include Andreou, Ghysels & Kourtellos, 2010, Tay, 2006 and Sheen, Truck & Wang, 2015.

**6** MIDAS models use distributed lags of explanatory variables which are sampled at an equivalent or higher frequency to the dependent variable. A distributed lag polynomial is used to ensure a parsimonious specification. There are different types of lag polynomial structures available in EViews. Lindgren & Nilson, 2015 discuss the forecasting performance of the different polynomial lag structures.

**7** See here and here for background and here and here for how to do this in EViews.

**8** For example, underlying data on a monthly and quarterly basis will generate a common factor that is on a quarterly basis. This should therefore go on a quarterly workfile tab.

**9** For example, if there was an NA then you could choose to use the previous value for the latest date instead. For example, series X_full = @recode(X=na, X(-1), X)

Great post - thanks for this!

I have been using EViews to model MIDAS regressions and I have absolutely fallen in love. Thank you very much for this post, it shines a lot of light.

A question:

The p-values associated with the coefficients of the three PDL polynomial degrees, does their level of significance matter in forecasting?

Moreover, gaussian properties of the residuals, is it a must that they be satisfied in this technique before forecasting can take place?

Thank you again for a wonderful post!

It is often argued that p-values and normality are irrelevant to forecasting in OLS, so that goes for MIDAS too.
