During 2004 a leading quantitative analyst predicted that the market multiple on the S&P 500 stocks would decline as interest rates increased, reflecting the Fed's tightening cycle. Yesterday we invited readers to examine the scatter plot presented and form their own conclusions. (Those who did not see the prior post may wish to look at it now.)


At the time of the research, the interest rate was about 4.2% and the P/E ratio about 17. The researcher noted the downward slope of the regression line and drew the seemingly logical conclusion that P/E multiples decline as rates rise. On this basis, he reached a bearish forecast for U.S. equities.

**Always Look at the Data**

There are several things wrong with the analysis. The errors should leap out for someone who understands both the methodology and the underlying data. The problem stems from blindly applying the derived regression equation, probably the single most common error of researchers.

- The data, as plotted, do not really support the conclusion. The observation for the current interest rate of 4.2% (at the time of the research, December 2004) was not actually on the regression line. It was below the line. Rates could rise to 4.8% and the implied multiple would still be 17.

- More importantly, the regression equation does not fit the data very well for rates below about 6.25%. The overall fit of the model is excellent (R-squared of 0.79), but the plot shows heteroskedasticity. This means that the scatter of the observations around the regression line is not constant across ranges of the independent variable (suggested by the researcher to be interest rates). Looking at the actual data shows that most of the observations in the 5% to 6% range are associated with multiples of about 20.

This is the VERY range where this researcher seeks to make a prediction! It does not support the conclusion.

Quantitative methods should not substitute for thought and observation. In this case, the actual observations do not fit the prediction because the model does not provide a good fit in the relevant range. As a test, I showed the plot to a leading expert in research methods who had no special knowledge about this particular question. His observation, drawn strictly from the data, was that one should expect the multiple to increase as rates moved from the 4% range to the 5% range.

- The chart notes that data for thirty months (January 1999 to June 2001) have been omitted. It is often reasonable to exclude data that represent an extremely unusual event, especially one that is unlikely to be repeated. It does raise the question of how the omitted data might have affected the result.

In our reconstruction of the research, we have been able to restore the missing period, showing the impact on both the plot and the regression model.

The plot shows that the "tech bubble era" data reflect even higher multiples when rates are in the 5% to 6% range. The average P/E ratio is about 24. If these data had been included in the regression modeling, something that the research team certainly tried, their conclusion of expected multiple compression would not have been supported either by looking at the data or by the regression equation.

The overall fit of the model actually is just as good as the example that omits the "bubble" data. There is more of a curve in the equation (the result of the so-called quadratic term, the one that squares the interest rate), but the curve fits everything well, except for the cluster of data points in the 4% range.
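For readers who want to try this kind of check themselves, here is a minimal sketch of fitting a regression with a quadratic term and then asking how well it fits in one particular range. The data are synthetic and invented purely for illustration (they are not the S&P 500 series discussed here), and the helper names `fit_quadratic` and `mean_residual` are our own. The point is only that a high overall R-squared can coexist with a systematic miss in the range where the prediction is being made.

```python
import numpy as np

def fit_quadratic(rates, pe):
    """Least-squares fit of pe = a*rate**2 + b*rate + c.

    Returns the coefficients (highest power first) and the overall R-squared.
    """
    coeffs = np.polyfit(rates, pe, deg=2)
    fitted = np.polyval(coeffs, rates)
    ss_res = np.sum((pe - fitted) ** 2)
    ss_tot = np.sum((pe - pe.mean()) ** 2)
    return coeffs, 1.0 - ss_res / ss_tot

def mean_residual(rates, pe, coeffs, lo, hi):
    """Average of (actual - fitted) for observations with lo <= rate < hi."""
    in_range = (rates >= lo) & (rates < hi)
    residuals = pe[in_range] - np.polyval(coeffs, rates[in_range])
    return float(residuals.mean())

# SYNTHETIC data, invented for illustration; NOT the series from the post.
rng = np.random.default_rng(0)
rates = np.linspace(4.0, 14.0, 201)            # hypothetical rates, in percent
pe = 0.15 * rates**2 - 4.3 * rates + 38.0      # smooth downward-sloping curve
pe = pe + rng.normal(0.0, 0.5, rates.size)     # small random noise
pe[(rates >= 5.0) & (rates < 6.0)] += 4.0      # a cluster sitting above the curve

coeffs, r_squared = fit_quadratic(rates, pe)
local_bias = mean_residual(rates, pe, coeffs, 5.0, 6.0)
print(f"overall R-squared: {r_squared:.2f}")
print(f"mean residual for rates in [5%, 6%): {local_bias:+.2f}")
```

A clearly positive mean residual in one range means the actual multiples there sit above the fitted curve, so reading a forecast off the equation in that range would understate them, however good the overall fit looks.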

**The Anomalous Cluster**

When first shown the original scatter plot I immediately circled the cluster of data in the 4% range. This is where some knowledge of the data comes in. As astute readers will have noted, this scatter plot is really just a different rendering of the Fed Model, which we have been studying. The Fed Model looks at the forward earnings yield, which is E/P. Most market gurus (perhaps not liking fractions) talk about market multiples or P/E.

Anyone with knowledge of the data would have had the same immediate reaction as I did, highlighting the cluster of points that did not quite fit. While there was nothing in the original research to prove it, I was quite confident in my assertion. The next image shows the values from mid-2001 through the end of 2004 in pink and with triangle shapes.

The excellent graphics work by Renae, our office expert, actually replicates the line that I drew around the cluster on the original chart. She has also added a curving line showing the PE multiple implied by the Fed model.
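The curving Fed Model line is easy to reproduce. As a rough sketch, assuming the usual reading of the Fed Model in which fair value is the point where the forward earnings yield (E/P) equals the ten-year Treasury yield, the implied multiple is simply the reciprocal of the yield. The function name `fed_model_implied_pe` is our own, chosen for illustration.

```python
def fed_model_implied_pe(ten_year_yield_pct: float) -> float:
    """P/E at which the forward earnings yield (E/P) equals the given yield.

    The yield is expressed in percent, so 5.0 means 5%.
    """
    if ten_year_yield_pct <= 0:
        raise ValueError("yield must be positive")
    return 100.0 / ten_year_yield_pct

for y in (4.2, 4.8, 5.0, 6.0):
    print(f"{y:.1f}% -> implied P/E {fed_model_implied_pe(y):.1f}")
# 4.2% -> implied P/E 23.8
# 4.8% -> implied P/E 20.8
# 5.0% -> implied P/E 20.0
# 6.0% -> implied P/E 16.7
```

At a 4.2% yield the implied multiple is close to 24, well above the multiple of about 17 cited at the time of the research.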

The plot thickens now! We see that the research team, already committed by previous reports to a bearish stance on the market, left out one unusual cluster of data that did not support their viewpoint. There may have been a good reason for this, but they included another cluster that seems just as aberrant. We certainly do not believe that the researchers were attempting to mislead their clients. This is a natural impulse when one is locked into a position. It is also what happens when there is no peer review.

Suppose that one includes both of the unusual periods and derives a regression equation. The result is depicted in the next image.

The regression equation fits the Fed Model line very closely! Also, both the bubble era and the modern data, what we see as a period of misguided gloom, stand out as events that do not fit either model.

In fact, if the research team had done a careful job they would have highlighted an important observation -- that current data seemed unusual. This would serve to focus the discussion on why this might be.

**Epilogue**

Since this story ended in 2004, we understand that readers may wish to know how it all turned out. First, the research team abandoned the original conclusion and went on to discuss a brand new theory of how market multiples were determined.

Second, since the ten-year note has not increased as much as the Fed Funds rate, the assumption forming the basis for the original question was never met. To satisfy reader curiosity, here is how the plot looked as last year ended.

The current anomaly, still represented in pink triangles in the plot, has persisted for two years. This is the sustained period of under-valuation reflected in our prior look at the Fed Model.

Are there other explanations? Of course!

(to be continued)

Very nice explanation Prof!

Posted by: Rower32 | February 17, 2010 at 05:18 PM

Tom -- Thanks for your comment! And you get the main idea. My analysis of this research was not really aimed at a particular market recommendation. It is about the difficulty in interpreting research.

I love Taleb's book, Fooled by Randomness, and bought many copies to send to clients.

Two things --

First, I do not believe that those doing this research were intentionally trying to deceive or mislead. It is very easy to think that you have made an exciting discovery, particularly when you are supposed to be cranking out new material every week. Academics have more luxury to re-examine and get peer review.

Second, if you stay with this series, you will see that some apparently non-statistical methods are just as misleading!

Thanks again.

Jeff

Posted by: oldprof | February 14, 2007 at 07:24 PM

This is one of the most important articles I've read in months. Not because of the specific market conclusions, but for the illustration of how dangerous it is to rely on statistical presentations coming from other people, even supposedly reputable sources. Nassim Taleb has talked a lot about how widespread this kind of thing is and this example is great. It has everything: misuse of fitting functions, massaging of data, misleading graphical depictions... A really excellent, concrete example and warning to the non-statistically-savvy amongst us out there.

Posted by: Tom L | February 14, 2007 at 12:41 PM

Thanks!

Posted by: oldprof | February 14, 2007 at 10:57 AM

You have an amazing site - well written and always an interesting visit. Just wanted you to know that I "clicked thru" from Seeking Alpha to ensure you got the hit for this wonderful piece of work.

Thanks.

Posted by: marlyn trades | February 14, 2007 at 08:05 AM

Nice. Look forward to your upcoming two-dimensional projections of multi-dimensional graphs :).

Posted by: RB | February 13, 2007 at 09:47 PM