Thèse de doctorat
Résumé : Over the last years, the growing availability of large datasets and the improvements in the computational speed of computers have further fostered the research in the fields of both macroeconomic modeling and forecasting analysis. A primary focus of these research areas is to improve the models performance by exploiting the informational content of several time series. Increasing the dimension of macro models is indeed crucial for a detailed structural understanding of the economic environment, as well as for an accurate forecasting analysis. As consequence, a new generation of large-scale macro models, based on the micro-foundations of a fully specified dynamic stochastic general equilibrium set-up, has became one of the most flourishing research areas of interest both in central banks and academia. At the same time, there has been a revival of forecasting methods dealing with many predictors, such as the factor models. The central idea of factor models is to exploit co-movements among variables through a parsimonious econometric structure. Few underlying common shocks or factors explain most of the co-variations among variables. The unexplained component of series movements is on the other hand due to pure idiosyncratic dynamics. The generality of their framework allows factor models to be suitable for describing a broad variety of models in a macroeconomic and a financial context. The revival of factor models, over the recent years, comes from important developments achieved by Stock and Watson (2002) and Forni, Hallin, Lippi and Reichlin (2000). These authors find the conditions under which some data averages become collinear to the space spanned by the factors when, the cross section dimension, becomes large. Moreover, their factor specifications allow the idiosyncratic dynamics to be mildly cross-correlated (an effect referred to as the 'approximate factor structure' by Chamberlain and Rothschild, 1983), a situation empirically verified in many applications. These findings have relevant implications. The most important being that the use of a large number of series is no longer representative of a dimensional constraint. On the other hand, it does help to identify the factor space. This new generation of factor models has been applied in several areas of macroeconomics and finance as well as for policy evaluation. It is consequently very likely to become a milestone in the literature of forecasting methods using many predictors. This thesis contributes to the empirical literature on factor models by proposing four original applications.

In the first chapter of this thesis, the generalized dynamic factor model of Forni et. al (2002) is employed to explore the predictive content of the asset returns in forecasting Consumer Price Index (CPI) inflation and the growth rate of Industrial Production (IP). The connection between stock markets and economic growth is well known. In the fundamental valuation of equity, the stock price is equal to the discounted future streams of expected dividends. Since the future dividends are related to future growth, a revision of prices, and hence returns, should signal movements in the future growth path. Though other important transmission channels, such as the Tobin's q theory (Tobin, 1969), the wealth effect as well as capital market imperfections, have been widely studied in this literature. I show that an aggregate index, such as the S&P500, could be misleading if used as a proxy for the informative content of the stock market as a whole. Despite the widespread wisdom of considering such index as a leading variable, only part of the assets included in the composition of the index has a leading behaviour with respect to the variables of interest. Its forecasting performance might be poor, leading to sceptical conclusions about the effectiveness of asset prices in forecasting macroeconomic variables. The main idea of the first essay is therefore to analyze the lead-lag structure of the assets composing the S&P500. The classification in leading, lagging and coincident variables is achieved by means of the cross correlation function cleaned of idiosyncratic noise and short run fluctuations. I assume that asset returns follow a factor structure. That is, they are the sum of two parts: a common part driven by few shocks common to all the assets and an idiosyncratic part, which is rather asset specific. The correlation

function, computed on the common part of the series, is not affected by the assets' specific dynamics and should provide information only on the series driven by the same common factors. Once the leading series are identified, they are grouped within the economic sector they belong to. The predictive content that such aggregates have in forecasting IP growth and CPI inflation is then explored and compared with the forecasting power of the S&P500 composite index. The forecasting exercise is addressed in the following way: first, in an autoregressive (AR) model I choose the truncation lag that minimizes the Mean Square Forecast Error (MSFE) in 11 years out of sample simulations for 1, 6 and 12 steps ahead, both for the IP growth rate and the CPI inflation. Second, the S&P500 is added as an explanatory variable to the previous AR specification. I repeat the simulation exercise and find that there are very small improvements of the MSFE statistics. Third, averages of stock return leading series, in the respective sector, are added as additional explanatory variables in the benchmark regression. Remarkable improvements are achieved with respect to the benchmark specification especially for one year horizon forecast. Significant improvements are also achieved for the shorter forecast horizons, when the leading series of the technology and energy sectors are used.

The second chapter of this thesis disentangles the sources of aggregate risk and measures the extent of co-movements in five European stock markets. Based on the static factor model of Stock and Watson (2002), it proposes a new method for measuring the impact of international, national and industry-specific shocks. The process of European economic and monetary integration with the advent of the EMU has been a central issue for investors and policy makers. During these years, the number of studies on the integration and linkages among European stock markets has increased enormously. Given their forward looking nature, stock prices are considered a key variable to use for establishing the developments in the economic and financial markets. Therefore, measuring the extent of co-movements between European stock markets has became, especially over the last years, one of the main concerns both for policy makers, who want to best shape their policy responses, and for investors who need to adapt their hedging strategies to the new political and economic environment. An optimal portfolio allocation strategy is based on a timely identification of the factors affecting asset returns. So far, literature dating back to Solnik (1974) identifies national factors as the main contributors to the co-variations among stock returns, with the industry factors playing a marginal role. The increasing financial and economic integration over the past years, fostered by the decline of trade barriers and a greater policy coordination, should have strongly reduced the importance of national factors and increased the importance of global determinants, such as industry determinants. However, somehow puzzling, recent studies demonstrated that countries sources are still very important and generally more important of the industry ones. This paper tries to cast some light on these conflicting results. The chapter proposes an econometric estimation strategy more flexible and suitable to disentangle and measure the impact of global and country factors. Results point to a declining influence of national determinants and to an increasing influence of the industries ones. The international influences remains the most important driving forces of excess returns. These findings overturn the results in the literature and have important implications for strategic portfolio allocation policies; they need to be revisited and adapted to the changed financial and economic scenario.

The third chapter presents a new stylized fact which can be helpful for discriminating among alternative explanations of the U.S. macroeconomic stability. The main finding is that the fall in time series volatility is associated with a sizable decline, of the order of 30% on average, in the predictive accuracy of several widely used forecasting models, included the factor models proposed by Stock and Watson (2002). This pattern is not limited to the measures of inflation but also extends to several indicators of real economic activity and interest rates. The generalized fall in predictive ability after the mid-1980s is particularly pronounced for forecast horizons beyond one quarter. Furthermore, this empirical regularity is not simply specific to a single method, rather it is a common feature of all models including those used by public and private institutions. In particular, the forecasts for output and inflation of the Fed's Green book and the Survey of Professional Forecasters (SPF) are significantly more accurate than a random walk only before 1985. After this date, in contrast, the hypothesis of equal predictive ability between naive random walk forecasts and the predictions of those institutions is not rejected for all horizons, the only exception being the current quarter. The results of this chapter may also be of interest for the empirical literature on asymmetric information. Romer and Romer (2000), for instance, consider a sample ending in the early 1990s and find that the Fed produced more accurate forecasts of inflation and output compared to several commercial providers. The results imply that the informational advantage of the Fed and those private forecasters is in fact limited to the 1970s and the beginning of the 1980s. In contrast, during the last two decades no forecasting model is better than "tossing a coin" beyond the first quarter horizon, thereby implying that on average uninformed economic agents can effectively anticipate future macroeconomics developments. On the other hand, econometric models and economists' judgement are quite helpful for the forecasts over the very short horizon, that is relevant for conjunctural analysis. Moreover, the literature on forecasting methods, recently surveyed by Stock and Watson (2005), has devoted a great deal of attention towards identifying the best model for predicting inflation and output. The majority of studies however are based on full-sample periods. The main findings in the chapter reveal that most of the full sample predictability of U.S. macroeconomic series arises from the years before 1985. Long time series appear

to attach a far larger weight on the earlier sub-sample, which is characterized by a larger volatility of inflation and output. Results also suggest that some caution should be used in evaluating the performance of alternative forecasting models on the basis of a pool of different sub-periods as full sample analysis are likely to miss parameter instability.

The fourth chapter performs a detailed forecast comparison between the static factor model of Stock and Watson (2002) (SW) and the dynamic factor model of Forni et. al. (2005) (FHLR). It is not the first work in performing such an evaluation. Boivin and Ng (2005) focus on a very similar problem, while Stock and Watson (2005) compare the performances of a larger class of predictors. The SW and FHLR methods essentially differ in the computation of the forecast of the common component. In particular, they differ in the estimation of the factor space and in the way projections onto this space are performed. In SW, the factors are estimated by static Principal Components (PC) of the sample covariance matrix and the forecast of the common component is simply the projection of the predicted variable on the factors. FHLR propose efficiency improvements in two directions. First, they estimate the common factors based on Generalized Principal Components (GPC) in which observations are weighted according to their signal to noise ratio. Second, they impose the constraints implied by the dynamic factors structure when the variables of interest are projected on the common factors. Specifically, they take into account the leading and lagging relations across series by means of principal components in the frequency domain. This allows for an efficient aggregation of variables that may be out of phase. Whether these efficiency improvements are helpful to forecast in a finite sample is however an empirical question. Literature has not yet reached a consensus. On the one hand, Stock and Watson (2005) show that both methods perform similarly (although they focus on the weighting of the idiosyncratic and not on the dynamic restrictions), while Boivin and Ng (2005) show that SW's method largely outperforms the FHLR's and, in particular, conjecture that the dynamic restrictions implied by the method are harmful for the forecast accuracy of the model. This chapter tries to shed some new light on these conflicting results. It

focuses on the Industrial Production index (IP) and the Consumer Price Index (CPI) and bases the evaluation on a simulated out-of sample forecasting exercise. The data set, borrowed from Stock and Watson (2002), consists of 146 monthly observations for the US economy. The data spans from 1959 to 1999. In order to isolate and evaluate specific characteristics of the methods, a procedure, where the

two non-parametric approaches are nested in a common framework, is designed. In addition, for both versions of the factor model forecasts, the chapter studies the contribution of the idiosyncratic component to the forecast. Other non-core aspects of the model are also investigated: robustness with respect to the choice of the number of factors and variable transformations. Finally, the chapter performs a sub-sample performances of the factor based forecasts. The purpose of this exercise is to design an experiment for assessing the contribution of the core characteristics of different models to the forecasting performance and discussing auxiliary issues. Hopefully this may also serve as a guide for practitioners in the field. As in Stock and Watson (2005), results show that efficiency improvements due to the weighting of the idiosyncratic components do not lead to significant more accurate forecasts, but, in contrast to Boivin and Ng (2005), it is shown that the dynamic restrictions imposed by the procedure of Forni et al. (2005) are not harmful for predictability. The main conclusion is that the two methods have a similar performance and produce highly collinear forecasts.