Tuesday, December 4, 2018

Nowcasting GDP on a Daily Basis

Author and guest blog by Michael Anthonisz, Queensland Treasury Corporation.
In this blog post, Michael demonstrates the use of MIDAS in EViews to nowcast Australian GDP growth on a daily basis.

"Nowcasts" are forecasts of the here and now ("now" + "forecast" = "nowcast"). They are forecasts of the present, the near future or the recent past. Specifically, nowcasts allow for real-time tracking or forecasting of a lower frequency variable based on other series which are released at a similar or higher frequency.


For example, one could try to forecast the outcome for the current quarter GDP release using a combination of daily, weekly, monthly and quarterly data. In this example, the nowcast could be updated on a daily basis – the highest frequency of explanatory data – as new releases for the series being used to explain GDP came in. That is, as the daily, weekly, monthly and quarterly data used to explain GDP is released, the nowcast for current quarter GDP is updated in real-time on a daily basis.

The ability to update one's forecast incrementally in real-time in response to incoming information is an attractive feature of nowcasting models. Forecasting in this manner will lower the likelihood of one's forecasts becoming "stale". Indeed, nowcasts have been found to be more accurate:

  • at short-term horizons.
  • as the period of interest (eg, the current quarter) goes on.
  • than traditional forecasting approaches at these horizons.
Other key findings in relation to nowcasts are that:
  • they perform similarly to private sector forecasters, who are also able to incorporate information in real time.
  • there are mixed findings as to relative gains from including high frequency financial data.
  • "soft data"1 is most useful early on in the nowcasting cycle and "hard data"2 is of more use later on.
There are a number of approaches that can be used to prepare a nowcast including:

Through its broad functionality, EViews is able to facilitate the use of all of these approaches. For the purposes of this blog entry, and in recognition of its availability from EViews 9.5 onwards as well as its ease of use, MIDAS regressions will be used to provide a daily nowcast of quarterly trend Australian real GDP growth5. MIDAS models are perfectly suited to handle the nowcasting problem, which, at its essence, relates to how to use data for explanatory variables which are released at different frequencies to explain the dependent variable6.

In this example, the series used in the MIDAS model to nowcast GDP are not just regular economic or financial time series, however. To capture as broad a variety of influences on the dependent variable as possible, as well as to ensure a parsimonious specification, principal components analysis ("PCA") is used7. This allows us to extract a common trend from a large number of series. Using this approach will enable us to cut down on "noise" and hopefully use more "signal" to estimate GDP.

The data series used to derive these common factors are compiled on a monthly and quarterly basis and are released in advance of, during and following the completion of the current quarter of interest with respect to GDP. The common factors are calculated at the lowest frequency of the underlying data (quarterly) and are complemented in the model by daily financial data which may have some explanatory power over the quarterly change in Australian GDP (for example, the trade weighted exchange rate and the three-year sovereign bond yield).

An outline of the steps required to do this sort of MIDAS-based nowcast is below. Keep in mind the helpful point and click as well as command language instructions published by EViews which provide more detail.
  • Create separate tabs in the workfile which correspond to the different frequencies of underlying data you are using.
  • Import the underlying data and normalize it to z-score form (that is, a mean of zero and a variance of one) before running the PCA.
  • Have the common factors created from the PCA appear on the relevant tab in the workfile8.
  • Clean the data to get rid of any N/A values for data that has not yet been published.9
  • Re-run the PCA to reflect that you now have data for the underlying series for the full sample period.
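The cleaning, normalising and factor-extraction steps above can be sketched outside EViews as follows. This is a minimal Python illustration under stated assumptions, not the author's workfile: the series names and values are made up, the helper names (forward_fill, z_score, first_principal_component) are hypothetical, and the first principal component is found by simple power iteration rather than by EViews' own PCA routine.

```python
import math

def forward_fill(values):
    """Replace None (data not yet published) with the most recent earlier value.
    A leading None is left as None, since there is nothing to carry forward."""
    filled, last = [], None
    for v in values:
        if v is None:
            v = last
        filled.append(v)
        last = v
    return filled

def z_score(values):
    """Rescale a series to mean zero and variance one (assumes it is not constant)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

def first_principal_component(columns, iterations=500):
    """Return (loadings, scores) for the first principal component of
    already-standardised columns, using power iteration on their covariance matrix."""
    k, n = len(columns), len(columns[0])
    cov = [[sum(columns[i][t] * columns[j][t] for t in range(n)) / n
            for j in range(k)] for i in range(k)]
    w = [1.0 / math.sqrt(k)] * k  # starting guess: equal loadings
    for _ in range(iterations):
        w_new = [sum(cov[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w_new))
        w = [x / norm for x in w_new]
    scores = [sum(w[i] * columns[i][t] for i in range(k)) for t in range(n)]
    return w, scores

# Made-up quarterly indicators; None marks a value that has not been released yet.
raw = {
    "survey_a": [1.0, 2.0, 3.0, 4.0, None],
    "survey_b": [2.0, 4.1, 5.9, 8.2, 9.9],
    "survey_c": [0.5, 0.9, 1.6, 2.1, 2.4],
}

clean = [forward_fill(v) for v in raw.values()]
standardised = [z_score(v) for v in clean]
loadings, factor = first_principal_component(standardised)
print("loadings:", loadings)
print("common factor:", factor)
```

The resulting `factor` series plays the role of one common factor (for example, a labour market factor) that would then sit on the quarterly page alongside GDP.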
It is important to note that the variable being nowcast must actually be forecast with the same periodicity as its release. In this instance, GDP is released quarterly so our forecasts of it will be quarterly as well. This means all the work at this stage of the estimation will be done on the quarterly page. We are aiming to produce forecasts of a quarterly variable which are updated on a more real-time (that is, daily) basis, but we are not actually producing a forecast of daily GDP.

An illustration of the rolling process might make this clearer. For instance:
  • Let's imagine it is currently 1 July 2018.
  • We’re interested in forecasting Q3 2018 GDP using one period lags of GDP and the common factors estimated earlier via PCA. These are quarterly representations of conditions with respect to labour markets and capital investment as well as measures of current and future economic activity. We’ll also use bond yields and the trade-weighted exchange rate, both of which are available on a daily basis.
  • In our MIDAS model, quarterly GDP is the dependent variable and the aforementioned other variables are independent variables. The model is estimated using historical data from Q2 1993 until Q2 2018 (as it is 1 July we have data to 30 June).
  • As we want to forecast Q3, and have data on our daily variables until the end of Q2 2018, we can specify the equation so that each quarter’s GDP growth is a function of the previous quarter’s outcomes for the quarterly variables and of (say) the last 45 days’ worth of values for bond yields and the exchange rate, ending on the last day of the previous quarter.
  • Having estimated the model, we can use the 45 daily values for bond yields and the exchange rate from May to June 2018 to forecast Q3 GDP.
  • Now, assume the calendar has turned over and it is now 2 July 2018. We have one more observation for the daily series. We can update the forecast of GDP by estimating a new model on historical data that used 44 days from the previous quarter and the first day from the current quarter, and then forecast Q3 GDP.
  • Then, assume it is 3 July 2018. We can now update our forecast by estimating on 43 days of the previous quarter and the first 2 days from the current quarter. And so on.
  • We will end up with a forecast of quarterly GDP that is updated daily. That doesn't make it a forecast of daily GDP as it is a quarterly variable. We're just able to forecast it using current (now) data and update this forecast continuously on a daily basis.
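The rolling window described above can be made concrete with a short Python sketch. It assumes a regular 5-day week with no public holidays, and the function names (business_day_window, split_by_quarter) are hypothetical helpers introduced for illustration only. For each "as of" date it lists the 45 most recent weekdays and counts how many fall in the previous quarter versus the current one.

```python
from datetime import date, timedelta

def business_day_window(as_of, n_days=45):
    """Return the n_days most recent weekdays ending on or before as_of, oldest first.
    Assumes a Monday-to-Friday calendar and ignores public holidays."""
    days = []
    current = as_of
    while len(days) < n_days:
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            days.append(current)
        current -= timedelta(days=1)
    return days[::-1]

def quarter_of(d):
    """Return (year, quarter number) for a date."""
    return (d.year, (d.month - 1) // 3 + 1)

def split_by_quarter(window, current_quarter):
    """Count window days in earlier quarters and in the current quarter."""
    in_current = sum(1 for d in window if quarter_of(d) == current_quarter)
    return len(window) - in_current, in_current

# Data available to the end of June, then one and two days into Q3 2018.
for as_of in (date(2018, 6, 30), date(2018, 7, 2), date(2018, 7, 3)):
    window = business_day_window(as_of)
    prev_n, curr_n = split_by_quarter(window, (2018, 3))
    print(as_of, "window:", window[0], "to", window[-1],
          "| previous quarter:", prev_n, "| current quarter:", curr_n)
```

Each new day drops the oldest observation from the window and adds the newest, which is the same shift that changing the lag on the daily variables achieves in the estimation.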
For our concrete example using Australian macroeconomic variables, we will estimate a MIDAS model where the dependent variable is the quarterly change in the trend measure of Australian real GDP.

The independent variables of the model can be seen in Figure 1:
Figure 1: Independent variables used in MIDAS estimation
All data are sourced from the Bloomberg and Thomson Reuters Datastream databases, accessible via EViews.

The specific equation in EViews is estimated using the Equation object with the method set to MIDAS, and with variable names of:
  • gdp_q_trend_3m_chg = quarterly change in the trend measure of Australian GDP.
  • gdp_q_trend_3m_chg(-1) = one quarter lag of the quarterly change in the trend measure of Australian GDP.
  • activity_current(-1) = one quarter lag of a PCA derived factor representing current economic activity in Australia.
  • activity_leading(-1) = one quarter lag of a PCA derived factor representing future economic activity in Australia.
  • investment(-1) = one quarter lag of a PCA derived factor representing capital investment in Australia.
  • labour_market(-1) = one quarter lag of a PCA derived factor representing labour market conditions in Australia.
  • au_midas_daily\atwi_final(-1) = the lag of the trade-weighted Australian dollar where this data is located on a page with a daily frequency.
  • au_midas_daily\gacgb3_final(-1) = the lag of the three-year Australian sovereign bond yield where this data is located on a page with a daily frequency.
In this example we will estimate the dependent variable using historical data from Q2 1993 until Q2 2018. From this we can then do forecasts for the current quarter (in this case Q3 2018) whereby the dependent variable is a function of the previous quarter’s outcomes for the quarterly independent variables and of the last 45 days’ worth of values for bond yields and the exchange rate. The MIDAS equation estimation window that reflects this would be as follows:
Figure 2: Estimation specification
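To give a feel for how 45 daily lags can enter a quarterly regression without 45 separate coefficients, here is a small Python sketch of an Almon (polynomial distributed lag) weighting scheme. This is an assumption for illustration: the text does not state which MIDAS weighting was chosen in the dialog, and the function names, the number of polynomial terms, the daily values and the theta coefficients are all made up. The idea is that the weight on lag k is a low-order polynomial in k, so the whole window collapses into a handful of regressors.

```python
import math

def almon_regressors(daily_window, n_terms=3):
    """Collapse a window of daily values into n_terms regressors.
    daily_window[0] is the most recent day (lag 0), daily_window[1] is lag 1, etc.
    Regressor j is the sum over lags of (lag ** j) * value."""
    return [sum((lag ** j) * x for lag, x in enumerate(daily_window))
            for j in range(n_terms)]

def implied_lag_weights(theta, n_lags):
    """Weight on each daily lag implied by polynomial coefficients theta."""
    return [sum(t * (lag ** j) for j, t in enumerate(theta))
            for lag in range(n_lags)]

# Made-up window of 45 daily values (most recent first) and made-up coefficients.
window = [2.0 + 0.01 * lag for lag in range(45)]
theta = [0.05, -0.002, 0.00002]

z = almon_regressors(window)
weights = implied_lag_weights(theta, len(window))

# The two routes give the same contribution to the fitted value:
via_regressors = sum(t * zj for t, zj in zip(theta, z))
via_weights = sum(w * x for w, x in zip(weights, window))
print(via_regressors, via_weights)
```

Only the few theta coefficients need to be estimated per daily variable, which is what keeps the specification parsimonious.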

Running the MIDAS model results in the following estimation output:
Figure 3: Estimation output
This individual estimation gives us a single forecast for GDP based upon the most current data available. Specifically, this estimation uses data up to:
  • 2018Q2 for our dependent variable.
  • 2018Q1 for our quarterly independent variables (since they are all lagged one period).
  • May 30th for our daily independent variables (a one day lag from the last day of Q2). Also note that since we are using 45 daily periods for each quarter, the 2018Q2 data point is estimated using data from March 29th - May 30th (we are dealing with regular 5-day data).
From this equation we can then produce a forecast of the 2018Q3 value of GDP by clicking on the Forecast button:
Figure 4: Forecast dialog
This single quarter forecast uses data from:
  • 2018Q2 for our quarterly independent variables (since they are all lagged one period).
  • July 30th 2018 - September 28th 2018 for our daily independent variables (45 days ending on the last day of Q3 2018 - September 29th/30th are a weekend, so not included in our workfile).
To produce an updated forecast the following day, we could re-estimate our equation using the same data, but with the daily independent variables shifted forwards one day (removing the one day lag on their specification), and then re-forecasting.

Or, if we wanted an historical view on how our forecasts would have performed previously, we can re-estimate for the previous day (shifting our daily variables back by one day by increasing their lag to 2) and then re-forecast.

Indeed we could repeat the historical procedure going back each day for a number of years, giving us a series of daily updated forecast values. Performing this action manually is a little cumbersome, but an EViews program can make the task simple. A rough example of such a program may be downloaded here.

Once the series of daily forecasts is created, you can produce a good picture of the accuracy of this procedure:
Figure 5: Daily updated forecast of Australian GDP Trend
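One simple way to summarise the accuracy of a series of daily updated forecasts is to compare the root mean squared error at each day of the quarter. The Python sketch below shows the calculation only. All numbers are invented placeholders rather than results from the model in this post, and the names (nowcasts, actual, rmse_by_day) are hypothetical.

```python
import math

def rmse(forecasts, actuals):
    """Root mean squared error between paired forecasts and outcomes."""
    n = len(forecasts)
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / n)

def rmse_by_day(nowcasts, actual):
    """RMSE across quarters for each day index within the quarter.
    nowcasts maps a quarter label to its daily nowcasts in date order;
    actual maps the same label to the published outcome."""
    quarters = list(nowcasts)
    n_days = min(len(nowcasts[q]) for q in quarters)
    return [rmse([nowcasts[q][d] for q in quarters],
                 [actual[q] for q in quarters])
            for d in range(n_days)]

# Invented placeholder values: three daily nowcasts for two hypothetical quarters.
nowcasts = {"quarter_a": [0.9, 0.8, 0.7], "quarter_b": [0.5, 0.6, 0.7]}
actual = {"quarter_a": 0.7, "quarter_b": 0.8}

errors = rmse_by_day(nowcasts, actual)
for day, err in enumerate(errors, start=1):
    print("day", day, "RMSE", round(err, 4))
```

If nowcasts do become more accurate as the quarter goes on, as the findings listed earlier suggest, the RMSE profile from real forecasts would be expected to decline with the day index.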



1 Such as consumer or business surveys
2 Such as retail spending, housing or labour market data
3 As GDP, for example, is essentially an accounting identity that represents the sum of different income, expenditure or production measures, it can be calculated using a ‘bottom-up’ approach in which series that proxy for the various components of GDP are used to construct an estimate of it using an accounting type approach.
4 Bridge equations are regressions which relate low frequency variables (e.g. quarterly GDP) to higher frequency variables (eg, the unemployment rate) where the higher frequency observations are aggregated to the quarterly frequency. It is often the case that some but not all of the higher frequency variables are available at the end of the quarter of interest. Therefore, the monthly variables which aren’t as yet available are forecasted using auxiliary models (eg, ARIMA).
5 Papers using a daily frequency in mixed frequency regression analyses include Andreou, Ghysels & Kourtellos, 2010, Tay, 2006 and Sheen, Truck & Wang, 2015.
6 MIDAS models use distributed lags of explanatory variables which are sampled at an equivalent or higher frequency to the dependent variable. A distributed lag polynomial is used to ensure a parsimonious specification. There are different types of lag polynomial structures available in EViews. Lindgren & Nilson, 2015 discuss the forecasting performance of the different polynomial lag structures.
7 See here and here for background and here and here for how to do in EViews.
8 For example, underlying data on a monthly and quarterly basis will generate a common factor that is on a quarterly basis. This should therefore go on a quarterly workfile tab.
9 For example, if there was an NA then you could choose to use the previous value for the latest date instead. For example, series X_full = @recode(X=na, X(-1), X)

3 comments:

  1. I have been using EViews to model MIDAS regressions and I have absolutely fallen in love. Thank you very much for this post, it shines a lot of light.

    A question:

    The p-values associated with the coefficients of the three PDL polynomial degrees, does their level of significance matter in forecasting?

    Moreover, gaussian properties of the residuals, is it a must that they be satisfied in this technique before forecasting can take place?

    Thank you again for a wonderful post!

    Replies
    1. It is often argued that p-values and normality are irrelevant to forecasting in OLS, so that goes for MIDAS too.
