The Economics of Recession: A Survey
Part 7/10, October 2022
[1]
Arturo Estrella
7. Forecasting recessions
Predicting real economic activity is very difficult in general, and predicting the timing of business cycle turning points and recessions is more difficult still. Judgmental forecasts based on data analysis and econometric forecasts from empirical models – large and small – have been used for a long time but with only limited success.[2]
Various lines of research since the late 1980s have focused on the use of financial asset prices to forecast real activity. Financial assets involve future payments and their prices are in principle set by forward-looking market participants who take expected future economic conditions into account. Thus, financial asset prices may be viewed as containing implicit forecasts that may be retrieved with the right procedures. Research tends to confirm this interpretation, though accuracy is not always guaranteed. As economist Paul Samuelson quipped in 1966, the stock market “predicted nine out of the last five recessions!”
Stock and Watson [R41] offer a new method to improve on previous recession forecasts by applying state-of-the-art econometric techniques. The main purpose of their research, sponsored and published by the NBER, was to construct indexes of coincident and leading indicators for the U.S. economy. The underlying premise is that there is a single index of economic activity that can be used to date the business cycle, but instead of assuming that the index is a specific observable variable like GDP, industrial production, or employment, they treat the index as unobservable and infer its value from multiple macroeconomic time series using a dynamic factor model. The result is a single coincident monthly index for the United States constructed from data for industrial production, personal income, sales, and employee hours.
The Stock-Watson leading index is defined as the optimal forecast of the growth rate of the coincident index over a 6-month horizon. The leading index is computed from seven variables with statistically-determined relative weights: three real-economy variables (housing permits, unfilled orders, and part-time work) and four financial market variables (the dollar exchange rate, the 10-year Treasury yield, the spread between 6-month commercial paper and Treasury bill rates, and the spread between the 10-year and 1-year Treasury yields). To forecast recessions, the article first applies a nonparametric rule to the single coincident index to construct a recession indicator that approximates NBER recession dates. The recession index is in the form of a probability of recession six months ahead, derived from the statistical estimates by numerical integration, conditioning on the observed values of the coincident and leading indicators.
The recession forecasting model performs reasonably well within sample. However, as the authors indicate, “Overfitting the data (and the consequent poor out-of-sample performance) is a risk in any empirical exercise, and the danger is particularly clear here.” The article discusses some pseudo out-of-sample simulation experiments, but a true real-time test of the model came very soon afterwards with the 1990-91 recession. The results of the model for this recession were less than satisfactory, which led to a thorough reformulation of the leading indicator index in a follow-up article, which eliminated most of the financial variables from the index.[3] The coincident and leading indexes were eventually retired in December 2003, but the basic principle of forecasting recessions with leading indicators in a formal statistical model remained influential in the subsequent literature.
In the late 1980s, a single bond market variable received much attention from economists as a leading indicator of recessions. The variable is computed from yields on U.S. Treasury securities and is known variously as the slope of the yield curve or the term spread. It is calculated simply by taking the difference between a long-term Treasury yield (on a bond with maturity between 10 and 30 years) and a short-term Treasury yield (typically on a 3-month bill). The term spread is easy to obtain quickly in real time at any data frequency, from intra-day to quarterly or beyond, and evidence that it can be used to forecast recessions accurately soon started to accumulate.
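As a minimal illustration of the calculation just described, the Python sketch below computes the term spread from a long-term and a short-term yield. The function name and the yield values are hypothetical and serve only to show the arithmetic.

```python
def term_spread(long_yield_pct: float, short_yield_pct: float) -> float:
    """Return the term spread in percentage points (long minus short yield)."""
    return long_yield_pct - short_yield_pct


# Hypothetical yields in percent, chosen only to illustrate the arithmetic.
observations = [
    ("positive spread", 4.5, 2.0),
    ("zero spread", 3.0, 3.0),
    ("negative spread", 3.5, 4.0),
]

for label, long_yield, short_yield in observations:
    print(label, term_spread(long_yield, short_yield))
```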
Estrella and Hardouvelis [R42] use the term spread (10-year minus 3-month yields) to forecast various measures of real economic activity including real GNP, its components, and NBER-dated recessions. Stock and Watson include a similar spread (10-year minus 1-year yields) in [R41] as one of many variables used jointly to forecast recessions. In contrast, Estrella and Hardouvelis use the term spread as a single predictive variable in a formal forecasting model structured as a probit equation. Results are strong and robust across the various measures of real activity, except for government spending. The authors suggest that the term spread is a good predictor of real growth at predictive horizons between 1 and 7 quarters.
With regard to recessions, the analysis finds statistically significant results with a 4-quarter predictive horizon, for which the power to forecast real growth is also strong. The longer predictive horizon, as compared with [R41], seems to make better use of the information contained in the term spread, and the parsimony of using a single variable addresses the warning from Stock and Watson about the dangers of overfitting. The recession model has proved to be very durable, as some of the subsequent readings show.
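To make the mechanics of a probit recession equation concrete, the Python sketch below evaluates Φ(α + β × spread), where Φ is the standard normal distribution function and the result is read as the probability of a recession four quarters ahead. The coefficient values ALPHA and BETA are hypothetical placeholders, not estimates from [R42]; in practice they would be obtained by maximum likelihood from historical data.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def recession_probability(spread_pct: float, alpha: float, beta: float) -> float:
    """Probit probability of recession given the current term spread."""
    return normal_cdf(alpha + beta * spread_pct)


# Hypothetical coefficients for illustration only (not estimates from the literature).
ALPHA = -0.5
BETA = -0.7  # negative: a lower spread implies a higher recession probability

for spread in (2.0, 0.0, -1.0):
    print(spread, round(recession_probability(spread, ALPHA, BETA), 3))
```

With a negative slope coefficient, the implied probability rises as the spread narrows and turns negative, which is the qualitative pattern the empirical literature reports.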
Friedman and Kuttner [R43] propose a different financial spread variable as a leading indicator of recessions, namely the difference between 6-month commercial paper and 6-month Treasury bill rates. Like the term spread, the paper-bill spread is one of the seven components of the original Stock-Watson leading index, but this article investigates its properties as a single predictor of real economic growth. The analysis finds that the variable is strongly related to subsequent real growth in single- and multiple-equation models. There is no explicit recession prediction model, but the article compares the average value of the spread over the full monthly data sample from 1959 to 1990, which is 0.57 percent, with its value in months 1 to 6 prior to recessions during the period, which is 0.88 percent. This result suggests that higher-than-normal values of the paper-bill spread may be used to predict recessions over a 6-month horizon. Consistency over time has been an issue for this leading indicator. For example, it did not rise in anticipation of the 1990-91 and 2001 recessions, although it gave a very clear signal before the 2008-09 recession.
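The comparison between the full-sample average and the average in the months before recessions can be sketched as follows. This is only an illustration of the calculation under simple assumptions: the helper names are hypothetical, the data are made up, and a recession start is taken to be any month in which a 0/1 recession indicator switches from 0 to 1.

```python
from statistics import mean


def recession_starts(recession: list[int]) -> list[int]:
    """Indices of months in which the 0/1 recession indicator switches from 0 to 1."""
    return [t for t in range(1, len(recession))
            if recession[t] == 1 and recession[t - 1] == 0]


def pre_recession_mean(series: list[float], recession: list[int], max_lead: int = 6) -> float:
    """Average of the series over months 1..max_lead before each recession start."""
    months = set()
    for start in recession_starts(recession):
        for lead in range(1, max_lead + 1):
            if start - lead >= 0:
                months.add(start - lead)
    if not months:
        raise ValueError("no recession starts found in the sample")
    return mean(series[t] for t in sorted(months))


# Made-up monthly spread (percent) and recession indicator, for illustration only.
spread = [0.5] * 6 + [1.0] * 6 + [0.5] * 4
recession = [0] * 12 + [1] * 4

print("full-sample mean:", mean(spread))
print("pre-recession mean:", pre_recession_mean(spread, recession))
```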
The Estrella-Hardouvelis recession forecasting model from [R42] is applied by Estrella and Mishkin [R44] to four European countries and again to the United States. The model uses the term spread for each country to forecast recessions with a 4-quarter horizon, defining the spread as closely as possible to the variable that proved to be successful in the United States. Empirical estimates cover a shorter sample from 1973 to 1994 in order to have consistent data for all countries, but the term spread was statistically significant for all the European countries (France, Germany, Italy, and the United Kingdom) as well as for the United States. For the United Kingdom and the United States, quasi-official indexes of leading indicators were also available and their predictive performance was compared with that of the term spread. The spread was shown to contain independent information in both cases, as did the U.K. leading index, but the U.S. leading index did not.
In Estrella and Mishkin [R45], the term spread is pitted against a large number of competing leading indicators, including interest rates and spreads, monetary aggregates, leading indicator indexes, and individual components of the leading indexes. The strategy is to apply the Estrella-Hardouvelis recession prediction model to each individual variable and to combinations of variables, and compare the results in both in-sample and pseudo out-of-sample experiments. Estimates use quarterly U.S. data from the first quarter of 1959 to the first quarter of 1995 and results are evaluated primarily on the basis of the Estrella (1998) pseudo R-squared, which is related monotonically to the joint test statistic for significance of the explanatory variables in each model and takes on values interpretable as in the scaling of the ordinary R-squared.[4] Predictive horizons from 1 to 8 quarters ahead are examined.
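For reference, the pseudo R-squared of Estrella (1998) is usually written as 1 − (log L_u / log L_c)^(−(2/n) log L_c), where L_u is the likelihood of the estimated model, L_c is the likelihood of a model with only a constant, and n is the number of observations. The Python sketch below computes it from fitted probabilities; the function names and the data are hypothetical, and the fitted probabilities are made up rather than estimated.

```python
import math


def bernoulli_loglik(outcomes: list[int], probs: list[float]) -> float:
    """Log-likelihood of 0/1 outcomes given fitted probabilities strictly between 0 and 1."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for y, p in zip(outcomes, probs))


def estrella_pseudo_r2(loglik_model: float, loglik_const: float, n: int) -> float:
    """Estrella (1998) measure: 1 - (logLu / logLc) ** (-(2 / n) * logLc)."""
    ratio = loglik_model / loglik_const
    return 1.0 - ratio ** (-(2.0 / n) * loglik_const)


# Made-up recession indicator and fitted probabilities, for illustration only.
outcomes = [0, 0, 0, 0, 1, 1, 0, 0]
model_probs = [0.1, 0.1, 0.1, 0.2, 0.8, 0.7, 0.2, 0.1]
base_rate = sum(outcomes) / len(outcomes)
const_probs = [base_rate] * len(outcomes)

ll_model = bernoulli_loglik(outcomes, model_probs)
ll_const = bernoulli_loglik(outcomes, const_probs)
print(round(estrella_pseudo_r2(ll_model, ll_const, len(outcomes)), 3))
```

The measure equals zero when the model does no better than the constant-only benchmark and approaches one as the fit becomes perfect, which is what makes it comparable in scale to the ordinary R-squared.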
Within sample, a few of the variables are significant for forecast horizons up to four quarters or slightly more, including the term spread, the monetary base, stock prices, and the Stock-Watson leading index. The paper-bill spread is significant, but only one quarter ahead. The term spread has the strongest results overall, with an R-squared of 30 percent at 4 quarters as compared with 17 percent for the monetary base and 10 percent for the Stock-Watson index over the same horizon. When each of the other variables is included with the term spread in the model, they are generally not significant, with the exception of the stock market at horizons of up to 3 quarters. Out-of-sample results are fairly stark. One quarter ahead, the above variables all perform well, except for the paper-bill spread. The Stock-Watson index leads the group at this horizon with an R-squared of 32 percent. At 2 or more quarters ahead, the term spread dominates. Out-of-sample performance at 4 quarters is about the same as in-sample and the only other variable that helps forecast out of sample is the stock index, but only up to 3 quarters ahead.
Bernard and Gerlach [R46] explore the international dimension further. First, they estimate the Estrella-Hardouvelis model for a larger sample of 8 countries,[5] and they consider the effects of adding the term spread from a different country to the domestic spread in the model. Predictive horizons from 0 to 8 quarters are evaluated and the data sample extends from 1972 to 1993. Results show that the domestic term spread is significant at the 5 percent level in many cases, particularly from 2 to 4 quarters ahead. The only exception is the Netherlands, for which the best performance is registered at the 2-quarter horizon and is significant only at the 10 percent level.
The results of adding either the U.S. or the German term spread to the domestic spread in the equations for the other countries are interesting and suggestive for further research. The U.S. spread is strongly significant for the United Kingdom and mildly significant for Canada, but not for the other countries. The German spread, on the other hand, is strongly significant for Canada, Japan, and the United States. This pattern might be related to the openness of economies or to relationships in international trade and finance, but further work must be done to try to investigate those connections. In the case of the German spread in the model for Japan, the authors hypothesize that simultaneity of recessions may be at play rather than a causal connection.
The wealth of significant results from the yield curve recession models suggests an important role for the term spread in expectations formation for business and government. However, how stable and robust are the results over time and across policy regimes? Estrella, Rodrigues, and Schich [R47] examine the stability of the recession prediction models in Germany and the United States using comparable monthly data for the respective countries from 1967 to 1998. In particular, they test econometrically for unknown breakpoints in the models as well as for breakpoints at particular critical times such as the appointment of Chairman Volcker at the Federal Reserve in 1979. The analysis fails to uncover any evidence of breaks in the recession prediction models for either country at any time during the period. For models that predict industrial production growth rather than recessions, the unknown breakpoint tests uncover mild evidence (at the 10 percent level) of a break in the U.S. model as of September 1983.
Empirical research on forecasting recessions using the term spread marched full speed ahead for more than a decade without waiting for economic theory to catch up. Harvey (1988) had explored a theoretical model in which the real term spread and the real short-term rate (both adjusted for expected inflation) may be used jointly to forecast real consumption growth.[6] This result, while suggestive, differs from the empirical models in three important ways. First, the model forecasts consumption growth rather than recessions, which are defined with reference to the whole economy. Second, the model uses the real term spread as predictor rather than the observable nominal term spread used in most of the empirical literature. The two would be interchangeable only if expected inflation were constant, which seems unlikely. Third, the Harvey model also requires inclusion of the real short-term rate as a joint predictor rather than just the term spread as in the empirical models.[7]
Estrella [R48] constructs a theoretical model whose implications closely parallel the empirical predictive relationship between the term spread and recessions. In the model, the nominal term spread is the optimal predictor of the gap between actual and potential real aggregate output. If output starts at potential, a negative value of the spread is equivalent to a forecast of output falling below potential the following year, in other words of a recession. The model suggests that the predictive connection between the term spread and real economic activity is driven both by optimal expectations and by a monetary policy rule similar to the Taylor rule of [R20]. Changes in the parameters of the monetary policy rule may alter the precise values of the parameters of the predictive model, but the relationship between a negative spread and an expected recession is robust to these changes. This feature of the model is consistent with the empirical results of the [R47] reading.
Duarte, Venetis, and Paya [R49] return to the application of the probit term spread recession model to European economies, but with a couple of twists. First, they consider recessions for the euro area in the aggregate. Recessions are dated by a nonparametric method, a variant of the two-negative-quarters rule, which is applied to aggregate growth for the euro area for the period from 1970 to 2000. Interest rates are quarterly averages of 10-year and 3-month rates for each country, combined using purchasing-power-parity weights. The second twist in this article is that the model may contain one of two types of structural breaks. Empirical results show that the model performs well with or without the structural breaks, but that the versions of the model that allow for breaks perform best in pseudo out-of-sample experiments.
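As a rough illustration of nonparametric recession dating, the Python sketch below applies the plain two-negative-quarters rule to a quarterly growth series. It is the textbook version of the rule, not the specific variant used in [R49], and the growth figures are made up.

```python
def two_negative_quarters(growth: list[float]) -> list[int]:
    """Flag every quarter that belongs to a run of at least two negative-growth quarters."""
    recession = [0] * len(growth)
    for t in range(1, len(growth)):
        if growth[t - 1] < 0 and growth[t] < 0:
            recession[t - 1] = 1
            recession[t] = 1
    return recession


# Made-up quarterly growth rates (percent), for illustration only.
growth = [0.5, -0.2, 0.3, -0.1, -0.4, -0.2, 0.6]
print(two_negative_quarters(growth))  # [0, 0, 0, 1, 1, 1, 0]
```

Note that the isolated negative quarter early in the sample is not flagged, since the rule requires at least two consecutive declines.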
The term spread recession model is conceptually very simple, but its application in practice from start to finish contains many technical details that may affect the results. For instance, how is the dependent variable of the model – the recession index – defined? The NBER does not provide the recession variable directly and some definitional decisions are required to produce it. Also, how is the short-term rate computed? Ideally, the short-term rate should be computationally compatible with the long-term bond rate, but bill rates are typically expressed on a different basis from bonds and must be adjusted. A frequent pitfall for the financial media is that they issue reports about yield curve inversions (short-term rate above the long-term rate) using the 2-year rate as the short-term benchmark. The research that points to yield curve inversion as a benchmark is based on the 3-month short-term rate, and indiscriminate substitution of the 2-year rate tends to lead to earlier and longer-lasting inversions, resulting in a biased signal. To clarify the decisions made in earlier research and to facilitate replication, Estrella and Trubin [R50] go carefully over the methodological details of the term spread recession model and its components.
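One of the adjustments mentioned above, putting a Treasury bill rate on a basis comparable to a bond yield, can be sketched with the standard market conversion from a discount rate to a bond-equivalent yield. The Python fragment below assumes a bill quoted on a 360-day discount basis with 91 days to maturity; these conventions are assumptions made here for illustration and should be checked against [R50] and the data source being used.

```python
def bond_equivalent_yield(discount_rate_pct: float, days_to_maturity: int = 91) -> float:
    """Convert a bill's discount-basis rate (percent) to a bond-equivalent yield (percent).

    Uses the standard formula for bills with less than six months to maturity:
    365 * d / (360 - days * d), with d expressed as a decimal.
    """
    d = discount_rate_pct / 100.0
    return 100.0 * (365.0 * d) / (360.0 - days_to_maturity * d)


# Hypothetical 3-month bill quoted at a 5 percent discount rate.
print(round(bond_equivalent_yield(5.0), 3))
```

The bond-equivalent yield is always somewhat higher than the quoted discount rate, so using the unadjusted bill rate would overstate the term spread slightly.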
Several researchers have retained the basic form of the recession prediction equation, but have explored the inclusion of different variables that may have predictive power for recessions. Christiansen, Eriksen, and Møller [R51], for example, consider survey measures of consumer and business sentiment. Using monthly U.S. data from 1978 to 2011, they estimate alternative probit equations containing the sentiment variables by themselves, together, and jointly with other predictors such as the term spread, short-term interest rates, the stock market, and factors drawn from a large number of macroeconomic variables. The results suggest that the sentiment variables contain useful forecasting value even when alternative predictive variables are included in the equation.
Bluedorn, Decressin, and Terrones [R52] examine the predictive power of falling asset prices using the basic form of the recession prediction model. In this case, the model is estimated in a logit rather than a probit equation, but this difference in functional specification is unlikely to produce very different results in general. The three main asset price variables used as predictors in the models are stock price growth, housing price growth, and stock price volatility. Using quarterly data for the G-7 group of industrial countries from 1970 to 2011, the article finds support for all three predictors, even when the term spread and oil price growth are included in the equations. The article also draws a distinction between predicting the start of a recession and predicting its continuation, and provides evidence that the asset price variables are most helpful in predicting the start.
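The closeness of the logit and probit specifications can be checked numerically. The Python sketch below compares the standard normal distribution function with the logistic function after rescaling the argument by 1.702, a conventional constant that is not taken from [R52], and reports the largest gap between the two over a grid of values.

```python
import math


def probit_link(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logit_link(z: float) -> float:
    """Logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


SCALE = 1.702  # conventional rescaling that aligns the two curves
grid = [i / 100.0 for i in range(-600, 601)]
max_gap = max(abs(probit_link(z) - logit_link(SCALE * z)) for z in grid)
print("largest difference in probability:", round(max_gap, 4))
```

The largest difference is below one percentage point of probability, which is why the choice between the two functional forms rarely matters much for the fitted recession probabilities.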
Other research maintains the focus on the term spread as a predictor, but employs different econometric methods. Chauvet and Potter [R53] use Bayesian estimation methods that provide greater modelling flexibility. In particular, they allow the innovation variance to change over the business cycle and include an autoregressive component. Using monthly U.S. data from 1955 to 2000, they conclude that the model has a better in-sample fit than the basic probit model. They also use the 2001 recession as a case study for the implications of the alternative models estimated in the article. In this case, the models without an autoregressive component have the better performance.
Kauppi and Saikkonen [R54] work with dynamic binary response models, which are similar to probit and logit but allow for greater flexibility to introduce lags of the dependent and independent variables. Various dynamic models as well as the basic probit model are estimated with quarterly U.S. data from the fourth quarter 1955 to the fourth quarter 2005. In the empirical results, the more flexible dynamic specifications outperform the standard probit model in-sample. To guard against the possibility of overfitting, the article executes pseudo out-of-sample experiments that also give the edge to the dynamic models.
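One way to picture a dynamic binary response specification is as a probit whose index depends on its own lag and on the lagged recession indicator as well as on the lagged term spread. The Python sketch below implements such a recursion; the exact form, the parameter names, and all numerical values are illustrative assumptions and are not the estimated models of [R54].

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def dynamic_probit_path(spread, recession, omega, alpha, delta, beta, initial_index=0.0):
    """Recession probabilities from the recursion
    index_t = omega + alpha * index_{t-1} + delta * y_{t-1} + beta * x_{t-1},
    where y is the 0/1 recession indicator and x is the term spread.
    Returns probabilities for periods 1 .. len(spread) - 1.
    """
    probs = []
    prev_index = initial_index
    for t in range(1, len(spread)):
        index = omega + alpha * prev_index + delta * recession[t - 1] + beta * spread[t - 1]
        probs.append(normal_cdf(index))
        prev_index = index
    return probs


# Made-up data and hypothetical parameter values, for illustration only.
spread = [1.0, 0.5, -0.5, -1.0]
recession = [0, 0, 0, 1]
print([round(p, 3) for p in dynamic_probit_path(spread, recession, -0.5, 0.4, 1.0, -0.7)])
```

Setting alpha and delta to zero removes the dynamics and recovers the basic static probit equation in the lagged spread.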
Österholm [R55] departs from the earlier literature with respect to both the econometric method and the predictive variables selected. The model tested is a Bayesian VAR, which combines the features of vector autoregression with the ability of Bayesian models to evaluate probabilities of events involving the variables of the model. The variables included in the VAR are real GDP growth, inflation, a short-term rate, consumer sentiment, business sentiment, oil price changes, and a measure of tightening of bank lending. This model is estimated using quarterly U.S. data from 1982 to 2008. The main conclusion is that the model performs poorly with regard to forecasting the 2008-09 recession, which leads to a suggestion to stay with the models that predict recessions well, such as the basic term spread model.
As noted earlier, professional forecasters do not exactly have an illustrious record of predicting recessions. Rudebusch and Williams [R56] present historical evidence of that record obtained from the U.S. Survey of Professional Forecasters and compare the performance of the survey with the simple term spread model from 1968 to 2007. The article points to an enduring puzzle that may be posed as a question: if the term spread performs so well historically and professional forecasters presumably use all available information, why does the term spread consistently outperform the survey? Perhaps another survey is needed to answer this question.
Readings referenced from the book The Economics of Recession
R41 James H. Stock and Mark W. Watson (1989), ‘New Indexes of Coincident and Leading Economic Indicators’, in Olivier Jean Blanchard and Stanley Fischer (eds), NBER Macroeconomics Annual 1989, Volume 4, Cambridge, MA, USA: MIT Press, 351–94
R42 Arturo Estrella and Gikas A. Hardouvelis (1991), ‘The Term Structure as a Predictor of Real Economic Activity’, Journal of Finance, 46 (2), June, 555–76
R43 Benjamin M. Friedman and Kenneth Kuttner (1993), ‘Why Does the Paper-Bill Spread Predict Real Economic Activity?’, in James H. Stock and Mark W. Watson (eds), Business Cycles, Indicators and Forecasting, Chapter 5, Chicago, IL, USA: University of Chicago Press, 213–53
R44 Arturo Estrella and Frederic S. Mishkin (1997), ‘The Predictive Power of the Term Structure of Interest Rates in Europe and the United States: Implications for the European Central Bank’, European Economic Review, 41 (7), July, 1375–1401
R45 Arturo Estrella and Frederic S. Mishkin (1998), ‘Predicting U.S. Recessions: Financial Variables as Leading Indicators’, Review of Economics and Statistics, 80 (1), February, 45–61
R46 Henri Bernard and Stefan Gerlach (1998), ‘Does the Term Structure Predict Recessions? The International Evidence’, International Journal of Finance and Economics, 3 (3), July, 195–215
R47 Arturo Estrella, Anthony P. Rodrigues and Sebastian Schich (2003), ‘How Stable is the Predictive Power of the Yield Curve? Evidence from Germany and the United States’, Review of Economics and Statistics, 85 (3), August, 629–44
R48 Arturo Estrella (2005), ‘Why Does the Yield Curve Predict Output and Inflation?’, Economic Journal, 115 (505), July, 722–44, A1–A2
R49 Agustin Duarte, Ioannis A. Venetis and Iván Paya (2005), ‘Predicting Real Growth and the Probability of Recession in the Euro Area Using the Yield Spread’, International Journal of Forecasting, 21 (2), April–June, 261–77
R50 Arturo Estrella and Mary R. Trubin (2006), ‘The Yield Curve as a Leading Indicator: Some Practical Issues’, Federal Reserve Bank of New York: Current Issues in Economics and Finance, 12 (5), July/August, 1–7
R51 Charlotte Christiansen, Jonas Nygaard Eriksen and Stig Vinther Møller (2014), ‘Forecasting U.S. Recessions: The Role of Sentiment’, Journal of Banking and Finance, 49, December, 459–68
R52 John Bluedorn, Jörg Decressin and Marco E. Terrones (2016), ‘Do Asset Price Drops Foreshadow Recessions?’, International Journal of Forecasting, 32 (2), April–June, 518–26
R53 Marcelle Chauvet and Simon Potter (2005), ‘Forecasting Recessions Using the Yield Curve’, Journal of Forecasting, 24 (2), March, 77–103
R54 Heikki Kauppi and Pentti Saikkonen (2008), ‘Predicting U.S. Recessions with Dynamic Binary Response Models’, Review of Economics and Statistics, 90 (4), November, 777–91
R55 Pär Österholm (2012), ‘The Limited Usefulness of Macroeconomic Bayesian VARs When Estimating the Probability of a U.S. Recession’, Journal of Macroeconomics, 34 (1), March, 76–86
R56 Glenn D. Rudebusch and John C. Williams (2009), ‘Forecasting Recessions: The Puzzle of the Enduring Power of the Yield Curve’, Journal of Business and Economic Statistics, 27 (4), October, 492–503
[1] The original version of the survey was published in The Economics of Recession, Edward Elgar Publishing, 2017.
[2] Some of the readings cited in this section, in particular [R56], provide comparative evidence of the record of professional forecasters.
[3] James H. Stock and Mark W. Watson (1993) ‘A Procedure for Predicting Recessions with Leading Indicators: Econometric Issues and Recent Experience’, in James H. Stock and Mark W. Watson (eds.) Business Cycles, Indicators and Forecasting, Chicago, IL, USA: University of Chicago Press.
[4] Arturo Estrella (1998) ‘A New Measure of Fit for Equations With Dichotomous Dependent Variables’, Journal of Business and Economic Statistics, 16 (2), April, 198–205.
[5] Belgium, Canada, France, Germany, Japan, the Netherlands, United Kingdom, and United States.
[6] Campbell R. Harvey (1988) ‘The Real Term Structure and Consumption Growth’, Journal of Financial Economics, 22, 305–33.
[7] Technically, empirical estimates in Harvey (1988) reject the proposed theoretical model. Specifically, the coefficients of the two predictive variables are shown to be the same in theory, but in the estimates they consistently have opposite signs. The article does not report the significance level of these rejections.