Investors often see research showing some kind of relationship between two variables. The nature of the relationship may not be clearly stated, so there is just the vague sense that these two things go together.
Take a look at this example, chosen because readers are unlikely to have an opinion before seeing the data. It is drawn from the very interesting and useful database at NationMaster, where one can look at a wide range of indicators, discover correlations and look at scatter plots.
Without any preconceived notion, would we expect to see a relationship, by country, between research activity and divorce rates?
In the example we suggest, there is an extremely strong correlation: an r-squared of 0.83, meaning that 83% of the variation in one characteristic is "explained" by the other. The relationship is between the proportion of a country's research and development personnel and the divorce rate. Since we are not attempting a complete course in causal modeling in this brief article, let us just consider three basic possibilities:
- R&D personnel are more likely to get divorced.
- Getting divorced drives people into the research field.
- Some other (unspecified) variable causes both of the effects -- something that is called a spurious relationship.
While the conclusion is open to discussion, it seems likely that this apparent relationship is an example of the spurious type. There is probably something about economic development that involves both more research and a different social structure. Examining this would involve controlling for the possible cause and seeing if the original relationship disappears.
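The check described above, controlling for the possible cause and seeing whether the original relationship disappears, can be sketched with made-up numbers. The code below is only an illustration under stated assumptions: the data are synthetic (not the NationMaster figures), the variable names and coefficients are hypothetical, and "controlling" is done in the simplest way, by removing a straight-line fit on the third variable from each series and correlating what is left.

```python
import numpy as np

def r_squared(x, y):
    """Squared Pearson correlation between two 1-D arrays."""
    return np.corrcoef(x, y)[0, 1] ** 2

def residuals(y, z):
    """Part of y left over after a straight-line fit on z."""
    slope, intercept = np.polyfit(z, y, 1)
    return y - (slope * z + intercept)

rng = np.random.default_rng(0)
n = 200  # synthetic "countries"; an assumption, not real data

# Hypothetical third variable, standing in for economic development.
development = rng.normal(size=n)

# Both outcomes depend on development, but not on each other.
research_share = 2.0 * development + rng.normal(scale=0.5, size=n)
divorce_rate = 1.5 * development + rng.normal(scale=0.5, size=n)

# Apparent relationship, ignoring the third variable.
raw = r_squared(research_share, divorce_rate)

# Same relationship after removing the part each series owes to development.
controlled = r_squared(
    residuals(research_share, development),
    residuals(divorce_rate, development),
)

print(f"r-squared before controlling: {raw:.2f}")
print(f"r-squared after controlling:  {controlled:.2f}")
```

In this constructed case the first figure is high and the second is close to zero, which is the signature of a spurious relationship. With real data the third variable is rarely known in advance, so a result like this supports the spurious interpretation without proving it.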
The wonderful NationMaster site permits a glimpse into a third variable on the two-dimensional chart. One can look at the relationship showing the flags of the various countries (where one might infer possible religious and social differences affecting divorce), or population, or land area, or GDP.
This sort of open-minded thinking is very helpful in interpreting apparent bivariate relationships. Here are some examples:
- The widely-publicized relationship between President Bush's electoral prospects in 2004 and the stock market. Many suggested that the market "wanted" a Bush victory. We suspect (but cannot prove with available data) that this was a spurious relationship -- both caused by perceived economic prospects.
- Housing markets and recessions. There are so few cases that statistical analysis is more like data-fitting of a few points, but once again, the relationship seems spurious.
- Yield curve inversion and recessions. Once again, this seems spurious. Both the recession chances and the yield inversion usually occur when there seems to be good reason to suspect falling inflation rates, usually associated with declining economic growth.
Once one understands the causal factors in the relationship, it is possible to think more clearly about what it all means. To take one example, the long-term bond "conundrum" may relate more to foreign reinvestment and/or the yen carry trade than to the typical indicator of economic weakness.
It is always difficult to infer causality, particularly when, as in the examples we offer, the total number of instances is very small. The possibility of a third variable, creating a spurious relationship, should always be part of the analysis. Any research report that does not mention this possibility, or consider causality as well as the data, is immediately open to question.
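The point about small numbers of instances can be illustrated with a quick simulation. The sketch below generates pairs of completely unrelated random series and counts how often they nonetheless show a strong correlation. The function name, the 0.8 threshold, and the sample sizes of 5 and 50 are all illustrative assumptions, not figures from any of the examples above.

```python
import numpy as np

def share_of_strong_correlations(n_points, threshold=0.8, trials=10_000, seed=0):
    """Fraction of trials in which two unrelated series have |r| >= threshold."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(trials, n_points))
    y = rng.normal(size=(trials, n_points))

    # Pearson correlation computed row by row (one row per trial).
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    r = (xc * yc).sum(axis=1) / np.sqrt(
        (xc ** 2).sum(axis=1) * (yc ** 2).sum(axis=1)
    )
    return float(np.mean(np.abs(r) >= threshold))

few = share_of_strong_correlations(5)    # a handful of cases, e.g. recessions
many = share_of_strong_correlations(50)  # a larger sample

print(f"share of strong correlations with 5 points:  {few:.3f}")
print(f"share of strong correlations with 50 points: {many:.3f}")
```

With only a handful of observations, pure noise produces an impressive-looking correlation far more often than it does with a larger sample, which is one more reason to treat a striking chart built on a few cases with caution.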