October 23, 2007


Hedge Fund Managers

I'm confused about what your conclusion really means. I work in the hedge fund industry, and there are definitely traders and professionals who have an opinion to begin with and those who lead the markets with their actions.

- Richard
Hedge Fund Managers Blog


While you claim your responses to your analytic framework were primarily of the 'too many notes' variety, my null hypothesis is that when you use a baseball stat analogy for a financial point on a blog you will get a statistically insignificant number of baseball comments. Which is false, ergo proven!

Bill aka NO DooDahs!

I admit I'm rusty here, but you see about 44,000 innings pitched in a season (30 teams, 162 games each, 9 innings per team per game). Taking a SWAG at how many start with either a leadoff tater or walk, let's say one out of every four IP; maybe a fan can step in with a stat or reality check of that assumption. Assume further that each situation is equally likely, and the average for both is 27.8%. Regardless of the number of IP with either situation, if my hypothesis is that they are equally likely to create multi-run innings given that they have already happened, the null situation is that the weighted average probability should hold in each case.

So with SQRT[0.278*(1-0.278)/12,000] = 0.004 being one standard deviation, both the observed stats are about 1.5 Z from the combined mean, which is not significant. ASS U ME ing a normal distribution, or close enough for "government" work.

If we have fewer than 12,000 IP with the combined situations, the STDEV would be higher, meaning that the difference of either percentage from the combined mean would have to be larger in order to be credible.
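The arithmetic above can be sketched in a few lines of Python. This is only an illustration under the same assumptions: the 12,000 figure is the guessed count of qualifying innings, not a measured one, and the function names `binomial_se` and `z_from_mean` are made up for this example.

```python
import math

def binomial_se(p, n):
    """Standard deviation of an observed proportion p over n trials
    (normal approximation to the binomial)."""
    return math.sqrt(p * (1 - p) / n)

def z_from_mean(observed, mean, n):
    """How many standard deviations `observed` sits from `mean`."""
    return (observed - mean) / binomial_se(mean, n)

combined = 0.278  # weighted average of the 28.4% and 27.2% figures

# 12,000 is the SWAG from the comment; the smaller counts are hypothetical
# and show how the standard deviation grows as the sample shrinks.
for n in (12_000, 6_000, 3_000):
    se = binomial_se(combined, n)
    z = z_from_mean(0.284, combined, n)
    print(f"n={n:>6}: SE={se:.4f}, z={z:.2f}")
```

With fewer innings the SE column rises and the Z score for the same 0.6-point gap falls, which is the point about credibility.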

In my professional opinion, 27.2% is indistinguishable from 28.4%. Unfortunately, I have dealt a lot with actuaries who mistake precision for accuracy – they would carry those percentages to 6 significant digits, and then management would want to vary rates based on the 4.4% difference in frequencies (28.4/27.2), etc.

If we saw the same difference over multiple seasons, that would add credibility, on top of the ability to combine multiple seasons (ASS U ME ing that they were comparable, perhaps not a robust assumption) to lower the STDEV of the binomial approximation.

However, if the percentages were widely different across several seasons (in the low 20s one year, the low 30s another, with a different situation coming out higher in different years, etc.) I would have to say that not only are the odds indistinguishable, but meaningless.

A good question would be if the odds of getting one or more runs in an inning, given that the first batter either got a home run or a walk, are statistically different from the odds of getting one or more runs in an inning given that no batters had been faced yet in that inning. Think about it.


Bill - My guess is that the number of cases used is large enough to establish "statistical significance" if one wanted to do that. As to substantive significance - I just think that McCarver expected a big difference and one percent or so either way is not what he thought.

Meanwhile, there are some sabermetric studies that show that once a ball is put in play, the skill of the pitcher has nothing to do with whether or not it is a hit. Pitchers are good because of high strikeouts and low walks, creating a dominance ratio. It would be fun to see whether your suggestion about overall pitcher skill affected this. We would need a good measure for "skill."


Bill aka NO DooDahs!

The last Fantasy Baseball league I participated in had a category for WHIP - (Walks + Hits) divided by Innings Pitched. Combined with SO per Inning, you can learn a lot about pitching performance.

That said, I'm not a fan, I just participated for the love of modeling. I'm a geek.

I would be interested in the denominators for the 28.4% and 27.2% percentages, and a two-tailed hypothesis test that they were different (and at what confidence level).
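For anyone who does have the denominators, a two-tailed test of that kind could look roughly like the sketch below. It is a standard pooled two-proportion z-test using the normal approximation; the function name `two_proportion_z_test` and the 6,000-innings denominators in the example call are placeholders, since the real counts are not given here.

```python
import math

def two_proportion_z_test(p1, n1, p2, n2):
    """Two-tailed z-test that two observed proportions differ.

    p1, p2 are observed proportions; n1, n2 are their denominators.
    Returns (z, two-tailed p-value) under the normal approximation.
    """
    # Pooled proportion under the null hypothesis that both are equal.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-tailed p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical denominators: 6,000 innings in each situation.
z, p = two_proportion_z_test(0.284, 6_000, 0.272, 6_000)
print(f"z = {z:.2f}, two-tailed p = {p:.3f}")
```

The difference would be called significant at a given confidence level only if the p-value falls below one minus that level (for example, below 0.05 for 95%).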


Bill -

Actually, I did not offer an opinion about causation. I want people to understand that unsystematic observations can be very misleading, so I am trying to stick to that point. My own interpretation is that it does not make much difference, which I find surprising in the opposite direction from McCarver.

McCarver has a causal model. I am not sure about the bloggers cited by the Numbers Guy. Your suggestion for a statistical control could be tested easily enough. I think there is some sabermetric research suggesting that good pitchers have more control over walks than they do over home runs.

Thanks for an interesting comment. I wish I had more time to do baseball research!


Bill aka NO DooDahs!

"But the numbers show that it’s easier to get one more run with the bases empty, than getting two runs when starting with a runner on first base. "

I think that both he and you are confusing causation with correlation!

Multi-run innings probably occur more frequently in innings that start with leadoff home runs because the quality of pitching relative to hitting is poorer, on average, in those games. The causal factor is most likely this mismatch of talent, and NOT the situation.
