
August 30, 2011



That's my 4sf total as well:)



Good point. In my experience with economists in a variety of contexts - some of them love their models far more than they care about the reality the models are supposed to describe. Between that, and people dredging for data that supports their preconceived notions, it's hard for statistical work of value to be produced at all.

Mike C

Thanks Gary. I didn't immediately think about the difference in sample sizes across the two splits.

As much as we would like to reduce market/economic analysis to physics/engineering problems, I think there is an inherent complexity full of feedback loops that makes drawing exact conclusions difficult. Soros's reflexivity comes to mind.

Ultimately, with any sort of statistical analysis of the market/economy, I think of a few things.

1. It is better to be roughly right than precisely wrong. And errors should be tilted towards safeguarding capital. Louise Yamada says there are only two losses in investing, loss of capital, and loss of opportunity, and there are always more opportunities.

2. Does the relationship make sense? It is impossible to form a logical reason why the price of butter in Malaysia should impact the U.S. economy/market. On the other hand, it makes sense that things like credit spreads and ISM numbers would accurately point north on the compass.


As I was going to St. Ives, I met a man with 7 wives....


Meanwhile the (me) less confident demand some feedback :-)

In lieu of a direct answer, when expressed with 4sf, the sum of the (my!) digits is 24.

Anyone agree?


I don't know anything about Kobe or Lebron, but imagine the following sample data:

Lebron shoots 466/555 for the first half (84%) while Kobe misses a bunch of games for injury and shoots 142/167 (85%). Kobe has the higher average.

Second half of the season, Lebron 329/417 (79%), but Kobe comes back & shoots 424/533 (80%). Kobe has the higher average.

Overall, Kobe shoots 566/700 for 81%, while Lebron shoots 795/972 for 82% - Lebron has the higher average for the season.
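
A quick way to check the arithmetic is a few lines of Python. This is only a sketch: the helper names `pct` and `season_totals` are made up for illustration, and the counts are the hypothetical sample data above, not real statistics.

```python
# Free-throw counts as (made, attempted) per half, from the sample data above.
# These are hypothetical numbers, not real player statistics.
halves = {
    "Lebron": [(466, 555), (329, 417)],
    "Kobe": [(142, 167), (424, 533)],
}


def pct(made, attempted):
    """Shooting percentage as a fraction between 0 and 1."""
    return made / attempted


def season_totals(splits):
    """Add up made shots and attempts across both halves."""
    made = sum(m for m, _ in splits)
    attempted = sum(a for _, a in splits)
    return made, attempted


for player, splits in halves.items():
    first, second = (pct(*s) for s in splits)
    made, attempted = season_totals(splits)
    print(f"{player}: first half {first:.1%}, second half {second:.1%}, "
          f"season {made}/{attempted} = {pct(made, attempted):.1%}")
```

Kobe comes out ahead in each half, yet Lebron comes out ahead once the halves are added together.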

This effect can happen in any partitioned group - not just "first half" vs. "second half" of a sports season - but treatment groups in medical trials, groups you're analyzing for investment, etc. So a medical treatment that looks more effective than an alternate for group A and for group B may look less effective than the alternate for the groups together.

In fact, distressingly enough, mathematically ANY group can be partitioned in a way that produces this effect. Meaning that if you look hard enough, you can actually invert the inferences you draw from your data!

The more different the sizes of the cells in the partitioned data (in this case, halves of the season and also different players), the stronger the effect can be. You can even have double paradoxes, I believe, where each player also shot better in the first half than the second half of the season, but overall it was better in the second half (not the case in this instance).
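
One way to see why the cell sizes matter: each player's season percentage is just the attempt-weighted average of his two half-season percentages. The sketch below uses the same hypothetical counts; the helper names `weighted_breakdown` and `season_pct` are illustrative assumptions, not anything from the original post.

```python
# Same hypothetical counts as the sample data: (made, attempted) per half.
halves = {
    "Lebron": [(466, 555), (329, 417)],
    "Kobe": [(142, 167), (424, 533)],
}


def weighted_breakdown(splits):
    """Return (weight, percentage) per half; weight = share of season attempts."""
    total_attempts = sum(a for _, a in splits)
    return [(a / total_attempts, m / a) for m, a in splits]


def season_pct(splits):
    """Season percentage rebuilt as the attempt-weighted average of the halves."""
    return sum(w * p for w, p in weighted_breakdown(splits))


for player, splits in halves.items():
    (w1, p1), (w2, p2) = weighted_breakdown(splits)
    print(f"{player}: {w1:.0%} of attempts at {p1:.1%}, "
          f"{w2:.0%} of attempts at {p2:.1%}, season {season_pct(splits):.1%}")
```

In this sample Lebron takes most of his attempts in the higher-percentage first half, while Kobe takes most of his in the lower-percentage second half, so the weights pull the two season figures in opposite directions.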


Francisco -- It's OK. Several people missed the request. I need to figure out a better method:)

Meanwhile, I am still getting answers from people who get the blog via email.


Mike C

"This screening difficulty, for instance, or Simpson's Paradox-style difficulties, where Kobe can have a higher free-throw percentage than Lebron for each half of the season, but Lebron will have a higher percentage over the whole season."

Gary, can you show the math behind this one?


I missed the part saying not to publish the answer. I apologize for doing so.


The statistical problems of comparing groups of different sizes are manifold:

This screening difficulty, for instance, or Simpson's Paradox-style difficulties, where Kobe can have a higher free-throw percentage than Lebron for each half of the season, but Lebron will have a higher percentage over the whole season.

Other size-driven paradoxes?


Sent my answer in and apologize for being late to the party. I work as an engineer and we're in the midst of product testing so time is limited.

Hope to see Dr. Miller's answer soon. Thanks for a great financial site!


Answer in the post!
As put in the email, my personal take on this is...

Here's hoping that having had my two kids do the (UK) A-level maths I've been reminded of enough stats to get this right, especially as I sometimes remind them of the statement by Google's Hal Varian "As Data is next big cheap thing, being statistically qualified is best way to go!" [or something like that].

Dave (UK)


JC -- I have temporarily "unpublished" a couple of comments describing an answer so that more people can enjoy the problem.

I'll "republish" them in a day or two. Meanwhile, as you note, it would be nice if everyone would just send solutions via email for later recognition.



JC Morris

Two thoughts.

First, Yes, I solved the riddle and it is as SI posted.

Second, (@SI), what part of "Please do not post exact answers in the comments, except to say that you have solved it" do you not understand? Our good and generous host was, I think, trying to see how many readers would leap to a quick and incorrect conclusion (and be willing to post it). But now, unless they agree with you and me, they likely won't post. C'mon, man. Of course people have known how to do it for as long as there has been arithmetic. That's not the point. The point is that even the simplest statistics are often misunderstood by intelligent (though perhaps innumerate) individuals. And the world of finance is full of both statistics and innumerate people.


I solved it, but I've seen the problem before.


"All your ba_es are belong to me."


Sorry for typing the same text twice. I would remove one but I am unable to edit the text.

The comments to this entry are closed.