Monday, September 11, 2023

Nowcasting US GDP During Covid-19 using Factor Augmented MIDAS

The COVID-19 pandemic sent waves through the global economy, triggering a macroeconomic shock and causing unprecedented challenges for economists trying to predict the current state of economies.

In the quest for a more timely and accurate assessment of economic conditions during the COVID-19 era, economists and researchers turned to innovative solutions, and one of the most promising techniques emerged: MIDAS (Mixed-Data Sampling) estimation.

MIDAS, originally developed in the early 2000s, has gained attention as a powerful tool for nowcasting GDP at a higher frequency, enabling more informed and timely decision-making.

We have covered nowcasting and MIDAS with EViews before on this blog. We’ve demonstrated how the novel MIDAS-GETS approach can be used in conjunction with PMI data to accurately nowcast Eurozone GDP, and we’ve shown how many daily series can be reduced to a small set of variables using principal components, which are then used in a MIDAS regression to nowcast Australian GDP.

This blog post is similar to the latter post above – we will use a large number of high frequency variables to nowcast GDP through a combination of variable reduction and MIDAS estimation. Specifically, we will use the FRED-MD monthly data bank of US macroeconomic variables to forecast US GDP, using a Factor Augmented MIDAS model.

Table of Contents

  1. Introduction
  2. Data
  3. Nowcasting 2020Q2 GDP
  4. Longer Term Nowcast Evaluation
  5. Files
  6. References

Introduction

An introduction to the background of MIDAS and its benefits can be found in our previous blog post. In this post we’ll be performing Factor Augmented MIDAS (FA-MIDAS), which is an extension to the standard MIDAS technique. FA-MIDAS was introduced in Marcellino and Schumacher (2010), and has been used in a number of different studies, including Gül and Kazdal (2021) and Ferrara and Marsilli (2018).

One of the downsides of traditional MIDAS is that it is unable to handle large numbers of high-frequency regressors. Indeed, it is often recommended that only a single high-frequency regressor be used. With today’s abundance of data, economists are faced with a large set of high-frequency regressors to choose from, and reducing the number of variables down to a single or small number of variables is a daunting task.

Factor analysis is able to reduce the dimensionality of the regressors by identifying correlations amongst them and using those correlations to create a small set of latent factors which contain similar information to the set of original variables.

The FA-MIDAS approach is then to first use factor analysis to reduce the large number of high-frequency variables to a handful of latent factors, and then use those high-frequency factors as regressors in a MIDAS regression to model a lower frequency variable.
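As a rough illustration of the first step, the Python sketch below standardizes a small made-up panel and extracts one common factor. Note that it swaps in the first principal component (found by power iteration) as a stand-in for a properly estimated factor model, so it is not what EViews does internally. The function names `standardize` and `first_factor` and the toy data are assumptions made for this example only.

```python
import math

def standardize(column):
    """Return the column demeaned and scaled to unit (population) variance."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / sd for v in column]

def first_factor(columns, iterations=200):
    """Approximate the first principal component of a panel.

    columns: list of equally long series (one list per variable).
    Returns (loadings, scores) with scores[t] = sum_i loadings[i] * z[i][t].
    """
    z = [standardize(col) for col in columns]
    k, n = len(z), len(z[0])
    # Correlation matrix of the standardized variables.
    corr = [[sum(z[i][t] * z[j][t] for t in range(n)) / n for j in range(k)]
            for i in range(k)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    w = [1.0] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    scores = [sum(w[i] * z[i][t] for i in range(k)) for t in range(n)]
    return w, scores

# Toy panel: three monthly indicators that move together (made-up numbers).
panel = [
    [1.0, 2.0, 3.0, 2.5, 1.5, 0.5],
    [2.1, 4.2, 5.9, 5.1, 3.0, 1.2],
    [0.9, 2.2, 2.8, 2.6, 1.4, 0.4],
]
loadings, factor = first_factor(panel)
```

The resulting `factor` list plays the role of the single high-frequency regressor that is passed on to the MIDAS step.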



Data

The Federal Reserve Bank of St. Louis maintains a large database of monthly US macroeconomic variables, FRED-MD. The database contains 127 variables that are updated each month and made available in a single .CSV file. Archival versions of the database are also made available, meaning you can download the data as released during a specific month (i.e. not containing any revisions made since that date). The database also contains an appendix that specifies a suitable transformation that should be performed on each series prior to use in analysis. The transformations include first and second differences, logs, first and second log differences, the first difference of the percent change, and simply no transformation.

As well as using this database, we will access FRED’s quarterly US GDP data, which can also be retrieved on an archival basis – using the values that were available on a certain date in history.



Nowcasting 2020Q2 GDP

We will imagine we are in June 2020, a few months after the initial surge in Covid numbers in the United States. This is the last month of the second quarter of 2020, and as such we would not have any official data on GDP for that quarter yet. However, we would have US data from the FRED-MD database up until May 2020. That means we have two months of macro-economic data post Covid-19 shutdowns that started in March 2020, yet no data for GDP itself during Covid.

We’ll walk through the steps taken to produce a nowcast of GDP based upon the monthly data.

To begin we will instruct EViews to download the FRED-MD database for June 2020. We can download this file manually from the FRED-MD website:


Figure 1: Summary of FRED-MD Data

However, since the database is a simple .CSV file, we can instruct EViews to open the file directly from the internet with a wfopen command:

wfopen https://files.stlouisfed.org/files/htdocs/fred-md/monthly/2020-06.csv colhead=2 namepos=firstatt

We simply follow wfopen with the URL of the file, and then add two arguments to describe the data: the first (colhead=2) tells EViews that there are two rows of headers at the top of the file (the name of the series, and the transformation), and the second (namepos=firstatt) tells it that, of those two rows of headers, the name is in the first row, followed by series attributes (in our case the transformation).


Figure 2: Workfile (Summary)

The one issue with this import is that although the first column in the CSV file, sasdate, contains dates, EViews did not recognize the file as being dated. This is because the CSV file contains a row of blank information at the end, and EViews will not recognize the blank. The issue is easily rectified by clicking on Proc->Structure/Resize Current Page, and then changing the Workfile structure type to Dated – specified by date series and entering sasdate as the Date series:


Figure 3: Workfile (Open)

EViews will then warn us about removing one observation from the workfile (the blank row), but after confirming that’s what we want to do, we end up with a nicely structured monthly workfile containing all 127 variables between 1959 and May 2020.


Figure 4: Workfile (Monthly)

Since we imported the Transformation row of the CSV file as an attribute (with the namepos=firstatt argument to the wfopen command), each series also contains metadata on the type of transformation recommended. We can view these by using the Details +/- button on the workfile, and then adding the transformation column by right clicking on any column header and selecting Edit columns.


Figure 5: Workfile (Transform)

There is no point-and-click method in EViews to automatically apply the transformations to all the series at once. However, we can make a simple program that loops through each series, pulling the transformation type from its attributes, and then applying the transformation to itself:

        'perform transformations
        %serlist = @wlookup("*", "series")
        for %j {%serlist}
            %tform = {%j}.@attr("Transform:")
            if @len(%tform) then
                if %tform="1" then
                    series temp = {%j}  'no transform
                endif
                if %tform="2" then
                    series temp = d({%j})  'first difference
                endif
                if %tform="3" then
                    series temp = d({%j},2) 'second difference
                endif
                if %tform="4" then
                    series temp = log({%j}) 'log
                endif
                if %tform= "5" then
                    series temp = dlog({%j}) 'log difference
                endif
                if %tform= "6" then
                    series temp = dlog({%j},2)  'log second difference
                endif
                if %tform= "7" then
                    series temp = d({%j}/{%j}(-1) -1)  'other
                endif
            
                {%j} = temp
                d temp		
            endif
        next
    


Some of the series are missing data over the last year. We will want to drop these from our analysis – we’d prefer to only use series with completely up-to-date data. We’ll make another quick loop that adds a series to a group only if it has an observation for every month of the last year.

        %serlist = @wlookup("*", "series")   'get list of series
        smpl @last-11 @last    'set sample to last year of observations

        group g  'declare a group
        for %j {%serlist}   'loop through series
            if @obs({%j})=12 then   'if series has values for every observation in last year
                g.add {%j}  'add it to group
            endif
        next
        smpl @all   'reset sample to everything
    


Now we are ready to perform the factor analysis on our group. We can do this by opening the group we created, G, and then clicking on Proc->Make Factor to bring up the Factor Specification dialog. There are lots and lots of options that can be specified when performing Factor Analysis in EViews, but we’ll keep most of them at their default values. The only change we will make is changing the Number of factors option to use the Ahn and Horenstein method (this method tends to result in fewer factors than other methods, which is useful when performing MIDAS estimation).


Figure 6: Dialog (Factor Specification)

In this case, the analysis resulted in a single factor being created. We can output this factor as a series into the workfile by clicking on Proc->Make Scores, and then clicking OK.

This produces a new series, F1, in our workfile, which is the series we will use as the high-frequency regressor in the MIDAS estimation.

Before we move on to working with the low-frequency data, we’ll quickly give our monthly page a more descriptive name than the default “Untitled”, by right clicking on the page tab, selecting Rename Workfile Page… and then entering Monthly as the new name.

To set up our quarterly GDP data, we click on the New Page tab and select Specify by Frequency/Range… We’ll then select a Quarterly frequency, and change the start date to 1992 (although we have data for our monthly variables before this date, we’ll cut down the amount of data actually used in estimation). We’ll call the page “Quarterly”.


Figure 7: Dialog (Workfile Create)

Once the page has been created, we’ll open the FRED database (File->Open->Database->FRED), browse and search for GDP, change the As Of: date to 2020-06-01, and drag the Real GDP series into our workfile. This series contains Real US GDP data as it was available in June 2020 (not as it is available today). EViews will ask if we want to change the name (since the source name is illegal in EViews). We’ll change it to GDP.


Figure 8: Workfile (Quarterly)

If we open the GDP series we can see the final value, for 2020Q2, is an NA – that value of GDP had not been released yet on June 1st 2020.


Figure 9: Series (GDP)

We’re now ready to perform our MIDAS estimation, which we do by clicking on Quick->Estimate Equation, and then changing the Method dropdown to MIDAS. It is common for models of GDP to use percent change of GDP as the dependent variable, with a constant and a single lag of percent change of GDP as quarterly regressors. We can compute the percent change using the @pch function in EViews.

The specification of the high-frequency regressor requires a little thought. We wish to use the monthly series F1 (which was the factor series we created earlier) as our high-frequency regressor. We have data on F1 until May 2020, which is the second month of Q2 2020. When converting between high frequency and lower frequency data during MIDAS estimation, EViews will, by default, use the last observation in the quarter, and work backwards in time from there. In our case that would be June 2020, but in the monthly page we created, this month doesn’t exist (the monthly page ends in May 2020). The easiest way to fix this is to change the Frequency conversion setting of the MIDAS estimation on the Options tab of the estimation dialog to First. Now EViews will use data from the first month in the quarter instead of the last.

This will enable us to produce an estimation. However, we would actually be losing some information – we would use data from April 2020 (April is the first month of the quarter) and earlier, and drop the information in May 2020. We can alleviate this by entering our monthly regressor as monthly\F1(1) where the (1) indicates to shift the data one month on from the first of the month. We’ll select to use 12 monthly lags (a full year) of the F1 series.

This means, for example, that the GDP data for 2019Q3 would be explained by GDP data for 2019Q2 (the one period lag in GDP), a constant, and monthly data for F1 from September 2018 through August 2019 (the second month of 2019Q3).
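The alignment just described can be checked with a few lines of Python. This is only a sketch of the date arithmetic, assuming "First" frequency conversion, a one-month forward shift and 12 lags as in the text; the function name `midas_window` is hypothetical and nothing here calls EViews.

```python
def midas_window(year, quarter, lags=12, shift=1):
    """Months feeding one quarterly observation, newest first.

    Assumes 'First' frequency conversion (anchor on the first month of the
    quarter), then moves the anchor forward by `shift` months and counts
    back `lags` months.
    """
    first_month = 3 * (quarter - 1) + 1             # 1, 4, 7 or 10
    anchor = year * 12 + (first_month - 1) + shift  # running month index
    window = []
    for k in range(lags):
        index = anchor - k
        window.append((index // 12, index % 12 + 1))
    return window

window = midas_window(2019, 3)
print(window[0], window[-1])  # (2019, 8) (2018, 9)
```

For 2020Q2 the same function starts at May 2020, which is the last month available in the June 2020 vintage.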


Figure 10: Dialog (Equation Estimation)


Figure 11: Dialog (MIDAS Frequency Conversion)

The results of the estimation are:


Figure 12: Estimation Results

We can see that the three MIDAS PDL coefficients are all statistically significant. Also note that EViews automatically adjusted the estimation sample to end in 2020Q1 (which is the last quarter for which we have GDP data).
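To see why 12 monthly lags need only three estimated coefficients, the Python sketch below restricts the lag coefficients to lie on a polynomial in the lag index. It assumes the coefficient on lag k is theta_0 + theta_1*k + theta_2*k^2 for k = 0, ..., 11; the exact indexing and scaling used by EViews may differ, and the theta values are made up rather than taken from the estimation output.

```python
def almon_lag_coefficients(theta, lags=12):
    """Coefficient on each monthly lag implied by polynomial parameters theta."""
    return [sum(t * k ** j for j, t in enumerate(theta)) for k in range(lags)]

def pdl_regressors(x_window, n_params=3):
    """Collapse one window of monthly values (newest first) into n_params terms.

    Regressing the quarterly variable on these terms estimates theta directly,
    because sum_k beta_k * x_k == sum_j theta_j * (sum_k k**j * x_k).
    """
    return [sum(k ** j * x for k, x in enumerate(x_window))
            for j in range(n_params)]

theta = [0.5, -0.08, 0.003]  # made-up values, not estimates from this post
betas = almon_lag_coefficients(theta)
x_window = [0.2, -0.1, 0.4, 0.0, 0.3, -0.2, 0.1, 0.5, -0.3, 0.2, 0.0, 0.1]

# The two ways of computing the high-frequency contribution agree.
direct = sum(b * x for b, x in zip(betas, x_window))
via_pdl = sum(t * z for t, z in zip(theta, pdl_regressors(x_window)))
```

The design point is that adding more lags lengthens `betas` but leaves the number of estimated parameters unchanged.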

To nowcast 2020Q2 GDP, all we now have to do is click the Forecast button and set the forecast sample to 2020Q2 2020Q2 (using the same start and end date forecasts a single period).


Figure 13: Dialog (Forecast)

Note that although the equation is specified in terms of percent-change of GDP, we will forecast the raw values of GDP, not the percent change, and the forecast values will be put into the series GDPF. Since we have the Insert actuals for… checkbox checked, the series will contain the actual values of GDP for the non-forecast periods (i.e. every quarter other than 2020Q2). After clicking OK, we can open the GDPF series as a graph:


Figure 14: Nowcast GDP

We can see that the forecasted (shaded) area of the graph has a sharp decline in GDP, which matches the economic expectations.

We can go further and actually gauge how good this nowcast of 2020Q2 GDP is by retrieving the actual values of GDP for that period. We do so by again opening the FRED database, changing the As of: date to August 2020, and then dragging the GDP series back into EViews. We’ll keep the name as that suggested by EViews, to recognize that the data are as-of August 2020. We can open this series alongside the nowcasted series in a group and view the graph, using the graph slider to zoom into the last few quarters of the data:


Figure 15: Nowcast GDP vs Actual

We can see that the nowcast value (blue line) very closely matches the actual value (orange) for 2020Q2!



Longer Term Nowcast Evaluation

In the previous section we walked through how we could nowcast a single quarter of GDP, and showed that the nowcasted value was very close to the first release of the actual data. As a single result, this doesn’t tell us conclusively that the nowcasting model is always an accurate predictor. For that we would need to perform a series of nowcasts over a longer period of time and compare the results from the series of nowcasts to the actual data.

Performing such a study in EViews is relatively straightforward through the EViews programming language. We’ve written a script that produces nowcasts of GDP for each month between January 2017 and July 2023. We won’t go through each step of the script, but will describe its functionality.

Data Retrieval

The program loops through each month between 2017 and today’s date. For each of those months, it downloads the FRED-MD file for that month into a new page in the workfile. Thus each month will have its own page containing FRED-MD data from 1991 until the month prior (since, for example, the FRED-MD file for 2020-06 contains data from 1991 until 2020-05).
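The month-by-month loop can be pictured with the Python sketch below, which only builds the list of vintage file addresses and downloads nothing. It assumes every vintage follows the same naming pattern as the 2020-06 file opened earlier in this post; the function name `vintage_urls` is hypothetical, and the end month is taken from the January 2017 to July 2023 range mentioned above.

```python
def vintage_urls(start_year, start_month, end_year, end_month):
    """List FRED-MD vintage URLs, one per month, inclusive of both ends."""
    # Assumption: all vintages share the naming of the 2020-06 file.
    base = "https://files.stlouisfed.org/files/htdocs/fred-md/monthly/"
    urls = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        urls.append(f"{base}{year:04d}-{month:02d}.csv")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return urls

urls = vintage_urls(2017, 1, 2023, 7)
```

Each address in `urls` corresponds to one page of the workfile in the script described here.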

For each month in the loop the program will also download quarterly GDP from FRED to a quarterly page, where the GDP data are as of the first of the month (and so will contain data up until the quarter prior to the quarter of the current month, or maybe even the quarter before that, depending on how long the delay of the official release of the GDP data is).

Estimation

With the data retrieved, for each month in the loop, a factor model is estimated on the monthly FRED-MD dataset (having removed any series that do not contain data for the previous two years), and the estimated factors are output to the monthly page. Then, for the corresponding quarterly GDP as-of that month, a MIDAS model is estimated, with percent-change GDP as the dependent variable, a constant and a lag of percent-change GDP as regressors, and an Almon/PDL weighted MIDAS term using 12 lags of the factor series and a polynomial degree of 3. For each of the months, the frequency conversion is set to “first”, and the factor series are shifted forwards to allow capture of the most data (as was the case for the single estimation we performed earlier).

At the same time a baseline comparison model of a simple AR(1) model for GDP is also estimated (i.e. simply percent-change GDP regressed against a constant and a lag).
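For readers who want to see the baseline spelled out, here is a minimal Python sketch of an AR(1) on percent changes, with the one-step forecast converted back to a level. It uses textbook OLS formulas and made-up GDP levels; the function names are hypothetical and the numbers are not from the script.

```python
def pct_change(levels):
    """One-period percent change as a decimal fraction: (x - x(-1)) / x(-1)."""
    return [(b - a) / a for a, b in zip(levels, levels[1:])]

def fit_ar1(y):
    """OLS of y[t] on a constant and y[t-1]; returns (intercept, slope)."""
    x, z = y[:-1], y[1:]
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    slope = sxz / sxx
    return mean_z - slope * mean_x, slope

def nowcast_level(levels):
    """One-step-ahead AR(1) nowcast of the next level from past levels."""
    growth = pct_change(levels)
    intercept, slope = fit_ar1(growth)
    next_growth = intercept + slope * growth[-1]
    # Convert the forecast percent change back into a level.
    return levels[-1] * (1.0 + next_growth)

gdp = [100.0, 101.0, 102.5, 103.0, 104.6, 105.1, 106.9]  # made-up levels
ar1_nowcast = nowcast_level(gdp)
```

Because the model only sees past GDP, it cannot react to a shock within the quarter, which is what makes it a useful yardstick for the factor-based model.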

Nowcasting

For both the MIDAS estimation and the baseline AR model, a one-period ahead nowcast is made, and the two values for that single quarter are stored.

After the program has looped through every month, there will be three nowcasts for each quarter, for each of the two models. The first nowcast will correspond to data as-of the first month of the quarter, the second nowcast will correspond to data as-of the second month of the quarter, and the third nowcast will correspond to data as-of the third month of the quarter.

These nowcasts are stored in a monthly page, giving a month-by-month updated nowcast of quarterly GDP through time.

We will also copy those nowcasts over to the quarterly page, taking the average of the three monthly nowcasts for each quarter.
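The averaging step amounts to grouping the monthly nowcasts by quarter. A small Python sketch, with made-up numbers and a hypothetical function name `quarterly_average`, might look like this:

```python
from collections import defaultdict

def quarterly_average(monthly_nowcasts):
    """Average monthly nowcasts within each quarter.

    monthly_nowcasts: dict mapping (year, month) -> nowcast value.
    Returns dict mapping (year, quarter) -> mean of the available months.
    """
    buckets = defaultdict(list)
    for (year, month), value in monthly_nowcasts.items():
        buckets[(year, (month - 1) // 3 + 1)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}

# Made-up nowcasts of quarterly growth, one per month of 2020Q2 and 2020Q3.
nowcasts = {(2020, 4): -0.02, (2020, 5): -0.10, (2020, 6): -0.09,
            (2020, 7): 0.05, (2020, 8): 0.07, (2020, 9): 0.09}
quarterly = quarterly_average(nowcasts)
```

A simple mean weights the early and late nowcasts equally, even though the later ones are based on more information.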

Results

The graph below shows a time-series of the monthly nowcasts generated by the FA-MIDAS model and the AR(1) model, alongside actual GDP data as-of first release:


Figure 16: Nowcast GDP Comparison (Monthly)

We can see that the MIDAS nowcast has a large degree of fluctuation during the COVID period, which is undoubtedly due to the instabilities in the economy at the time. In comparison to the AR(1) model, though, it does correctly time the sharp decrease in GDP at the start of 2020, even if it does overshoot dramatically.

Looking at the quarterly version of the same graph, we can see that the MIDAS approach matches actual GDP very closely for the first year of COVID, but again fluctuates a little too much.


Figure 17: Nowcast GDP Comparison (Quarterly)

The script also produces a forecast evaluation table of the two forecasts, and a simple average of the two:


Figure 18: Nowcast GDP (Forecast Evaluation)

The average of the two forecasts performs best, but out of the two, the FA-MIDAS model produces the more accurate forecasts.
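The kind of comparison behind an evaluation table like this can be sketched in Python as follows. The error measures shown (RMSE and MAE) are common choices and are assumed here rather than taken from the script, and all numbers are made up purely to show the mechanics; they do not reproduce the results in Figure 18.

```python
import math

def rmse(forecasts, actuals):
    """Root mean squared error."""
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecasts, actuals))
                     / len(actuals))

def mae(forecasts, actuals):
    """Mean absolute error."""
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)

# Made-up quarterly growth rates, for illustration only.
actual = [0.010, -0.090, 0.075, 0.011]
midas = [0.008, -0.070, 0.050, 0.015]
ar1 = [0.006, 0.004, -0.020, 0.030]
# Simple combination: equal-weighted average of the two forecasts.
average = [(m + a) / 2 for m, a in zip(midas, ar1)]

for name, series in [("FA-MIDAS", midas), ("AR(1)", ar1), ("Average", average)]:
    print(name, round(rmse(series, actual), 4), round(mae(series, actual), 4))
```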

Files




References

  1. Ferrara, L., & Marsilli, C. (2018). Nowcasting global economic growth: A factor-augmented mixed-frequency approach. The World Economy.
  2. Gül, E., & Kazdal, T. (2021). COVID-19 pandemic, vaccination and household expenditures: regional evidence from Turkish credit card data. Applied Economics Letters, 1-4.
  3. Marcellino, M., & Schumacher, C. (2010). Factor MIDAS for nowcasting and forecasting with ragged-edge data: A model comparison for German GDP. Oxford Bulletin of Economics and Statistics, 72(4), 518-550.
