Interpreting data requires special skill and training. Hardly anyone has developed these skills, but that does not stop them from offering opinions. Here at "A Dash" we have highlighted some of the most popular errors. Today provides some fresh examples.

This is America! Let the games begin....

**Public Opinion on Stimulus Plan? Thumbs Down!**

It is the six-month anniversary of the passage of the stimulus plan. This was a classic political compromise. Any and all observers could criticize the plan. The symbolic language -- shovel ready -- conjured up the wrong image. The announced plan suggested that there would be a lot of visible jobs from construction. The actual plan relied upon macro-economic spending principles, and was scaled out over two years.

The unsurprising result is that the economy has still not recovered and most Americans believe the program is not working. This includes plenty of intelligent people and Obama supporters. To understand the impact, one needs to grasp the concept of the counterfactual (which we explained in easily understandable terms here) and the economic impact of policies like helping the states and extending unemployment benefits.

Anyone who understands economics knows that the stimulus has already mitigated a difficult situation. It is to the political and personal advantage of many to attack the program.

**Pundit Opinion on the Economy**

Most of the punditry understands behavioral economics. They know all about confirmation biases and the temptation to see each data point as support for a pre-conceived idea.

It makes no difference.

The market is looking hard for signs of economic growth, reacting to each twist and turn of the data. Participants are missing the big lesson: what constitutes normal progress in an improving economy.

It is unrealistic to expect each economic indicator to show progress in lockstep. The data will provide mixed signals -- partly because of differences in sectors and partly through measurement errors. If one chooses to look at the worst news or find the worst interpretation of data, one can be in denial for a long period of time.

It is all about finding the best indicators in an objective fashion.

**What is Normal?**

In our summer quiz, we highlighted a question about data interpretation that few understand. We ran this article in June. We expect that our quiz questions will eventually be recognized as addressing the key market issues.

Here was Question #1.

We have an honest coin which we flip 20,000 times. We keep a running count of whether heads or tails is in the lead. How many lead changes would you expect? (A good ballpark answer could be the winner here).

In his excellent book *The Drunkard's Walk: How Randomness Rules Our Lives* (now added to our recommended list), Leonard Mlodinow explains how people -- smart people -- fail to understand data because they do not understand probability. His example involves a test of two movies, but the lesson applies equally well in many other settings. Two theories, two investment managers, two pundits.....

He writes as follows (page 14):

Because the coin has an equal chance of coming up either way, you might think that in this experimental box office war each film should be in the lead about half the time. But the mathematics of randomness says otherwise: the most probable number of changes in the lead is 0, and it is 88 times more probable that one of the two films will lead through all 20,000 customers than it is that, say, the lead continuously seesaws. The lesson is not that there is no difference between films, but that some films will do better than others even if all the films are identical.

Mlodinow is a physicist who has written for popular entertainment like Star Trek and MacGyver. He is excellent at making difficult concepts accessible to average intelligent readers. He provides many other examples of normal random data, consistent with the work of the behavioral economists.

The key here is that a concept in the lead (the weak economy) will seem to retain the lead.
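Mlodinow's claim is easy to probe with a quick simulation. Here is a minimal Python sketch (our own illustration; the function names are invented for this example). It counts lead changes in a sequence of fair coin flips, treating a tie as leaving the lead with whoever held it last:

```python
import random

def count_lead_changes(flips):
    """Count how many times the running heads/tails lead switches sides.
    flips: iterable of +1 (heads) / -1 (tails). A tie does not count as a
    change; the lead changes only when the trailing side goes ahead."""
    total = 0
    leader = 0          # +1 heads leading, -1 tails leading, 0 no leader yet
    changes = 0
    for f in flips:
        total += f
        if total > 0:
            new_leader = 1
        elif total < 0:
            new_leader = -1
        else:
            new_leader = leader   # tie: lead unchanged
        if leader != 0 and new_leader != leader:
            changes += 1
        leader = new_leader
    return changes

def experiment(n_flips=20_000, trials=1_000, seed=0):
    """Run many n_flips-long fair-coin experiments; return lead-change counts."""
    rng = random.Random(seed)
    return [count_lead_changes(rng.choice((1, -1)) for _ in range(n_flips))
            for _ in range(trials)]
```

Running `experiment()` shows the counts clustering at small numbers rather than the thousands intuition suggests; consistent with Mlodinow, zero is the single most likely count, even though any one particular count is individually rare.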

**Our Take**

We have attempted to highlight the important indicators -- ECRI, initial claims, and the ISM -- and we will work to highlight others. These important measures have been positive. Many other measures that we follow -- employment, Michigan sentiment -- are giving a negative signal.

For stocks, it is all about earnings. Bearish pundits dismiss the last round of earnings because the results were based upon cost-cutting rather than revenue growth. (What would they have said if earnings had disappointed?) We believe that revenue growth will soon follow with stimulus impact. Companies that have good cost control will show good earnings gains.

Meanwhile, those emphasizing trailing earnings can fixate on forced write-downs from last year and the post-Lehman cessation of normal lending. These pundits will miss the resumption of normal and sensible lending. It is time to look ahead for opportunity.

Or one can fixate on last year. Your choice.

Also, note that 2 times 5C2 is the same as 6C3. Therefore, for an odd number of flips n, the number of no-lead-change paths is the central binomial coefficient C(n+1, (n+1)/2).

Posted by: RB | August 23, 2009 at 11:55 AM

BTW,

You'd have to multiply by two since, for instance, a 1-flip has two possible outcomes with no lead changes.

Posted by: RB | August 23, 2009 at 11:17 AM

Gary,

"Central binomial coefficients: C(2n,n) = (2n)!/(n!)^2."

That's the same result as I had above.

Posted by: RB | August 23, 2009 at 11:14 AM

I found the sequence - the central binomial coefficient: http://www.research.att.com/~njas/sequences/A000984

It's the number of no-lead-change pathways for the odd-numbered trials. Link above has a lot of other examples of where that sequence is useful.

Posted by: Gary | August 23, 2009 at 09:49 AM

Last post on subject: the number of paths that don't have any lead changes for an odd number of flips is two times the combination (2n+1)C(n) i.e., 2*(2n+1)!/(n+1)!n!. For an even number of flips, the number of paths with no lead changes is two times the combination (2n)C(n), i.e., 2*(2n)!/(n!)(n!). Apologies for the bazillion posts.

Posted by: RB | August 21, 2009 at 11:12 PM
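RB's closed forms can be checked against brute force for small flip counts. A quick Python sketch (our own illustration; the helper names are invented here):

```python
from itertools import product
from math import comb

def no_lead_change_paths(m):
    """Brute-force count of m-flip sequences (+1 heads, -1 tails) in which
    the running lead never switches sides; ties leave the lead unchanged."""
    def changed(flips):
        total, leader = 0, 0
        for f in flips:
            total += f
            nl = 1 if total > 0 else (-1 if total < 0 else leader)
            if leader and nl != leader:
                return True
            leader = nl
        return False
    return sum(not changed(p) for p in product((1, -1), repeat=m))

def rb_closed_form(m):
    """RB's formulas: 2*C(2n+1, n) for m = 2n+1 flips, 2*C(2n, n) for m = 2n.
    Both cases reduce to 2*C(m, m//2)."""
    return 2 * comb(m, m // 2)
```

Exhaustive enumeration agrees with the closed form for every flip count up to a dozen, which is reassuring given the confusion in the thread.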

er, sorry for the triple-post. 13110 with no lead changes?

Posted by: RB | August 21, 2009 at 02:23 PM

er... i mean the number of paths with lead changes (p.s. after 15, is it 13110 out of 32768?)

Posted by: RB | August 21, 2009 at 02:22 PM

Gary,

Can you please post the number of lead changes after 11, 13 and 15 flips? Thanks..

Posted by: RB | August 21, 2009 at 02:19 PM

And by the time you get to 15 flips, you're at 60% of endpoints having at least one lead change somewhere on the path. As you go forward, the probability of developing a big lead increases while the chances of having a lead change occur over the next n trials goes down. But even at 15 flips, almost 40% of the paths leave you in an ahead-by-one situation, ready for a lead change.

I just wish I could get my head to do the closed-form solution of this (after n flips, the probability of a lead change). Alas, no.

Posted by: Gary | August 21, 2009 at 11:57 AM

I don't trust my closed-form solution capabilities, which is why I run simulations to help my intuition. Simulation gave me a lot of small numbers of lead changes, not so many zero lead changes.

VennData - while only 25% of the _endpoints_ may have a lead change, far more than 25% of the _paths_ will have a lead change somewhere on your tree. All the lead changes happen on odd flips. Flip the 5th time and you end up with 12/32 of the endpoints having a lead change somewhere. So we're already to 37.5% having at least one lead change. By the time you get to nine flips, you're at 260 lead changes out of the 512 pathways (just over 50%).

Do this enough times (20,000 for instance) and you will get lead changes on the vast majority of paths. I think the point was that you wouldn't get thousands of lead changes in the course of this many flips, just tens.

Posted by: Gary | August 21, 2009 at 11:30 AM
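Gary's hand counts (12 of 32 paths at five flips, just over 50% at nine) can be reproduced by exhaustive enumeration; a short Python sketch (illustrative only, function names our own):

```python
from itertools import product

def lead_changed(flips):
    """True if the heads/tails lead ever switches sides (+1 heads, -1 tails).
    A tie leaves the lead with whoever held it last."""
    total, leader = 0, 0
    for f in flips:
        total += f
        nl = 1 if total > 0 else (-1 if total < 0 else leader)
        if leader and nl != leader:
            return True
        leader = nl
    return False

def count_changed(n):
    """How many of the 2**n equally likely n-flip paths contain a lead change."""
    return sum(lead_changed(p) for p in product((1, -1), repeat=n))
```

This reproduces the figures in the thread: 2 of 8 paths at three flips, 12 of 32 at five, and 260 of 512 at nine.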

RB -- Sorry to be MIA for a couple of days. One of my quarterly trips to Seattle for a board meeting.

I also have some email on this topic, from good sources. I'll follow up ASAP.

The main idea is that no one knows what normal looks like, so our comparisons are bad.

Meanwhile, Patrick is on the money.

I may have to expand the range of "correct" answers on this question.

Thanks to all.

Jeff

Posted by: oldprof | August 20, 2009 at 09:44 PM

"How many lead changes would you expect?"

That sounds to me like the "expected value" not the mode. Perhaps the good prof can clarify. Cheers!

Posted by: RB | August 20, 2009 at 09:02 PM

But that's not the question. It's not "What is the expected value?" (in the binary tree case, the mean average of all the endpoints.)

The prof's question is about the mode: "What is most likely number to expect?" Which is zero.

Posted by: VennData | August 20, 2009 at 08:24 PM

I mean "the number of lead changes may be close to zero, but is still positive"

Posted by: RB | August 20, 2009 at 11:46 AM

Venn,

Even with your illustration, the number of mean changes may be close to zero, but is still positive. The mean E(x) is zero but the mean translation distance E(|x|) is on the order of sqrt(20,000). This number is much closer to zero than 20,000 and therefore would imply a non-zero expected number of lead changes. It would be a fairly good guess that the number of lead changes also grows as sqrt(N).

Posted by: RB | August 20, 2009 at 11:45 AM

Draw an inverted "tree" diagram representing coin flips. The upside-down tree structure shows after one flip you have zero lead changes. Keep drawing.

After two flips you've got four outcomes (either 2-0 or even: HH, HT, TH, TT). So after two flips you haven't had a "lead change" ...and you've got a two-flip "lead" in half of your expected outcome branches.

Keep going... after the third flip you've got eight outcomes with six of them with zero lead changes. Six out of eight! HHH, HHT, HTH, HTT, THH, THT, TTH, TTT.

Now you can make it simple by realizing that "half" of your tree is the mirror of the other. So after four flips you've got sixteen outcomes and only four endpoints with "change" outcomes. That's 75% with zero lead changes.

You can draw out all 20,000 if you want, or see that you will be close to that 75% number of zero lead changes for all end points (aka recursion.) Go ahead, keep going if you need to, but you should "see" the solution.

Commenters, your intuition is wrong or you're not using the correct mathematical tool to make your evaluation of the problem.

Zero lead changes is what you should expect. No other number comes close.

Posted by: VennData | August 20, 2009 at 10:23 AM

I find it fascinating how much technical analysis has come to dominate CNBC and Bloomberg. I don't know if it has taken over industry wide, but certainly, if you are viewing these channels you are exposed to analyst after analyst (or astrologer if you don't believe you can predict the future by drawing graphs lol) talking about TA.

But what is even more fascinating is how the TA guru will dismiss his own chart pattern. He/she will say, well, this bullish move is only because of this specific thing (i.e. cost cutting). That it is, "different this time." Never do they go back on the charts and cross off areas because this was a one time event. You could pull out a chart of the S&P 500 to 1926, and pick out some news of the day for each bump. But they never invalidate charts for that reason. This actually happens more with economic data than stock data (i.e. copper is only rising because of Chinese stimulus...well ok, but are we doing TA or FA and why did copper rise in 1992? I'm sure there was a "reason")

They have a pre-conceived notion and even dismiss their own charts when it doesn't fit. On national TV no less.

Posted by: Patrick | August 19, 2009 at 08:27 PM

My guess is that the probability that you will have 'k' lead changes has a sqrt(1/k) relationship to the probability that you will have '0' lead changes. The expected value is of course the summation of p(k)*k (from k=1 to k=9999). Still thinking of an approach.

Posted by: RB | August 19, 2009 at 12:56 PM

Zero might be the most common outcome, but not represent the "most" compared to the cases with changes. So the expected (mean) number of lead changes would be greater than the modal number (zero).

And after the first toss establishes the lead, it's true that the next two tosses give only a 1/4 probability of a lead change. But they also give a 1/2 shot of putting you back in the same ahead-by-one condition, so you have another 1/4 shot of a lead change, 1/4 of a big lead, and 1/2 of another ahead-by-one, recursively, etc. Keep multiplying .75 * .5 and eventually the probability of having zero lead changes after 20,000 flips gets pretty darn small (1% in my simulation).

Posted by: Gary | August 19, 2009 at 12:48 PM

It's zero. After the first flip, one item is ahead. You need two flips in a row to go the opposite way to get a lead change. That's a one in four shot. Applying that recursively (to build a huge inverted tree) means the most likely number of lead changes is zero.

Think of the end points of the recursively created inverted tree (with the nodes totaling the lead changes to that point). Most would be 'zero' changes.

Posted by: VennData | August 18, 2009 at 07:59 PM