So what does the home field advantage in baseball look like? In my database of almost exactly 200,000 baseball games which didn’t end in ties, the home team won 54.7 percent. But road teams did really badly in the early days of baseball. Before 1900, the home team won 59.1 percent. Post WWII, only one year had an advantage over 57 percent: 1978.

For simplicity, I’m going to assume baseball started in 1966. For many of us, that’s the year baseball actually started. The cutoff is arbitrary, of course, and I’m not saying that the data couldn’t be mined to make interesting cross-era comparisons. But since I have a prior supposition that travel is important, I want to look at 162-game seasons played on both coasts and north and south, with travel largely by airplane. If I go much farther back, all of those are in doubt.

That still leaves us with almost 100,000 games (97,632 to be precise). I should note that while the 2012 season is now available, it wasn’t when I made this database, but I’m going to fix that before the next posting. So for now I’m covering 1966-2011. The aggregate home winning percentage in this period: 53.9 percent.

The following figure shows the home advantage:

Now, a word about variance. In a typical year there are around 2,400 games. Suppose the home team has a 54 percent chance of winning each of them. This is like flipping a coin with a slight bias towards heads 2,400 times and seeing how often it comes up heads. We certainly wouldn’t expect to see 0.54*2400=1,296 every time. Sometimes we would observe more, sometimes less. That doesn’t mean that the true underlying tendency is really moving around in any way, just that the randomness of the process yields different results.
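To put rough numbers on that coin-flip picture, here is a minimal sketch that treats a season as 2,400 independent games with a constant 54 percent home win probability. The independence assumption, the normal approximation behind the 1.96 multiplier, and the function name are mine, not something taken from the database.

```python
import math

def season_win_spread(games: int, p_home: float) -> tuple[float, float]:
    """Expected home wins and their standard deviation under a binomial model."""
    expected = games * p_home
    sd = math.sqrt(games * p_home * (1 - p_home))
    return expected, sd

GAMES = 2400   # "around 2,400 games" in a typical year
P_HOME = 0.54  # assumed constant home win probability

expected, sd = season_win_spread(GAMES, P_HOME)
# Normal approximation: about 95% of seasons land within 1.96 standard deviations.
low, high = expected - 1.96 * sd, expected + 1.96 * sd

print(f"expected home wins: {expected:.0f}, standard deviation: {sd:.1f}")
print(f"approximate 95% range: {low:.0f} to {high:.0f} wins "
      f"({low / GAMES:.1%} to {high / GAMES:.1%})")
```

Under these assumptions a single season can drift about two percentage points either side of 54 percent without anything real having changed.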

In fact, the post-1966 splits, season by season, look exactly how you’d expect them to look if the underlying probability were constant at about 54 percent. There are two outliers that are a little surprising (1978 is quite a bit too high and 1969 is a little too low), but you’d expect a couple of surprising results in 46 seasons of looking. So I’m going to argue that everything we see at an aggregate level away from 54 percent is just noise. Despite the natural human tendency to see a trend (in 2004-2010, for example), it almost surely isn’t really there.
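One way to get a feel for what "just noise" looks like is to simulate it. The sketch below is not run on the real data: it invents 46 seasons (1966 through 2011) of 2,400 games each with a fixed 54 percent home win probability, then counts how many simulated seasons fall outside a 95 percent band. The season length, the seed, and the function name are illustrative assumptions.

```python
import math
import random

def simulate_seasons(n_seasons: int, games: int, p_home: float, seed: int) -> list[float]:
    """Simulate home winning percentages for seasons with a constant true probability."""
    rng = random.Random(seed)
    shares = []
    for _ in range(n_seasons):
        wins = sum(rng.random() < p_home for _ in range(games))
        shares.append(wins / games)
    return shares

N_SEASONS, GAMES, P_HOME = 46, 2400, 0.54  # illustrative values, not the database

shares = simulate_seasons(N_SEASONS, GAMES, P_HOME, seed=1966)
half_width = 1.96 * math.sqrt(P_HOME * (1 - P_HOME) / GAMES)
outside = sum(abs(s - P_HOME) > half_width for s in shares)

print(f"lowest season: {min(shares):.1%}, highest season: {max(shares):.1%}")
print(f"seasons outside the 95% band: {outside} "
      f"(about {0.05 * N_SEASONS:.1f} expected by chance)")
```

With a 95 percent band and 46 seasons, roughly two seasons should land outside the band by chance alone, which is in line with seeing a couple of outliers.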

In the charts above, I’ve put annual 95 percent confidence limits in to show just how volatile this data can be even over a season. The shrinking of the limits reflects the fact that more and more games are played, and the two notches reflect the lower number of games played in the strike years.
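The link between games played and the width of those limits can be sketched as follows. The season sizes are my own illustrative inputs: 1,620 and 2,430 are what full 162-game schedules give for 20 and 30 teams, and 1,400 is a hypothetical shortened season, not a count from the database.

```python
import math

def ci_half_width(games: int, p_home: float = 0.54, z: float = 1.96) -> float:
    """Half-width of an approximate 95% interval for a season's home winning share."""
    return z * math.sqrt(p_home * (1 - p_home) / games)

# Illustrative season sizes (assumptions, not taken from the author's data).
season_sizes = {
    "20 teams, full schedule": 20 * 162 // 2,
    "30 teams, full schedule": 30 * 162 // 2,
    "hypothetical shortened season": 1400,
}

for label, games in season_sizes.items():
    print(f"{label}: {games} games, +/- {ci_half_width(games):.1%}")
```

Because the half-width scales with one over the square root of the number of games, adding teams narrows the band only slowly, while a shortened season widens it visibly.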

This puts some real limits on what we’re able to get from this data. For example, one obvious way to get at the question of how big the home field advantage really is would be to look at games on neutral fields. In that case, the only home field advantage (presumably) results from the last-at-bat strategy. The problem with doing this is that there aren’t many such games.

We had a few played in Tokyo, a few in Puerto Rico, a few where weather caused the team to bat first even though they were in their home stadium (those games would be really interesting if there were more than a handful of them) and a few other exhibition sorts of games. That said, without thousands of such games, you have no idea whether what you’re seeing is a real effect of “home-ness” or just variance, some lucky coin flips. Any proposed factor that doesn’t show a pronounced effect is always just going to look like noise.
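A rough power calculation shows why a handful of neutral-site games can't settle anything. This sketch uses the standard normal-approximation sample size formula for testing one proportion against 50 percent; the 5 percent significance level, the 80 percent power target, the candidate effect sizes, and the function name are all my assumptions.

```python
import math
from statistics import NormalDist

def games_needed(p_true: float, p_null: float = 0.5,
                 alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate games needed to distinguish p_true from p_null (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(p_null * (1 - p_null))
                 + z_power * math.sqrt(p_true * (1 - p_true)))
    return math.ceil((numerator / abs(p_true - p_null)) ** 2)

# Hypothetical sizes for a "home" edge that survives on a neutral field.
for p in (0.54, 0.52, 0.51):
    print(f"true home win probability {p:.0%}: about {games_needed(p)} games")
```

The required count grows with the inverse square of the effect, so if the last-at-bat edge is only part of the full advantage, the number of games needed climbs into the thousands quickly.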

With that in mind, I can show you all the first test I did. Many have speculated that travel causes the home field advantage. Since in every other sport, the visiting team usually travels to a home team who is already home, the enervating effects of travel should affect road teams more than home teams in other sports, leading to a home field advantage.

This portion of the home field advantage will be reduced in baseball because of the series effect. Most baseball games are not played by teams that have traveled the day before. This leads me to suspect that the home field advantage should be lower in baseball because the probability of a traveling team being the road team is lower.

Note that of course both teams are in general tired all the time from travel. That shouldn’t affect this hypothesis at all. We only have to assume that: (1) travel yesterday lowers your chance of winning and that (2) it may be countered by the fact, if true, that the other team also traveled yesterday.

So I looked at all the games in which a team had played its previous game in another city and compared them to those games in which it had played in the same city. I have begun to look at some wrinkles on this, like how many days it was since the last game and how far the team traveled, but let’s ignore this for now.

For home teams, we have 14,000 travel days and 83,632 non-travel days. We find that the probability of winning on a travel day is 54.3 percent, against 53.9 percent on non-travel days.

This difference is nowhere near large enough to rule out simple variance as the explanation. And even so, the difference isn’t in the direction we had hoped. Now of course a home team in a new city has traveled home, so there could be a degree of elation just in ending a road trip which contributes, in however small a way, to a heightened probability of winning.

For road teams, we have 32,223 travel days and 65,409 non-travel days (as you’d expect if most series are three games long). Here, at least, the results are in the right direction. On the first game of a series, road teams won 45.9 percent of the time, while in subsequent games, they won 46.1 percent of the time. This could be due to variance as well, but at least the difference is in the direction specified by the hypothesis.
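The two travel comparisons above can be checked against variance with a pooled two-proportion z-test. This is a minimal sketch using the rounded percentages and game counts quoted in the text, so the results are only approximate; the choice of test and the function name are mine.

```python
import math
from statistics import NormalDist

def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z statistic and two-sided p-value."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Home teams: travel days vs. non-travel days (rounded figures from the text).
home_z, home_p = two_proportion_z(0.543, 14_000, 0.539, 83_632)
# Road teams: first game of a series vs. subsequent games.
road_z, road_p = two_proportion_z(0.459, 32_223, 0.461, 65_409)

print(f"home teams: z = {home_z:.2f}, two-sided p = {home_p:.2f}")
print(f"road teams: z = {road_z:.2f}, two-sided p = {road_p:.2f}")
```

Both z statistics come out below one in absolute value, well short of the roughly 1.96 needed for significance at the 5 percent level, so neither gap can be told apart from coin-flip noise.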

What about the hypothesis that both teams traveling eliminates the travel advantage? We get the following percentages of the home team winning:

Both Teams Traveling: Home teams win 54.3 percent (14,000 games)

Road Team Only Traveling: Home teams win 53 percent (18,228 games)

Disappointingly, this once again worked in the wrong direction. Visiting teams who travel in to face a home team who has been sitting at home waiting for them actually do very slightly better than when both teams last played elsewhere. But more importantly, the difference is really tiny against the background of the underlying variance.

Now there are a lot more things you can do with this, and we might. Some have already proposed looking at the advantage for good teams and bad teams, or by distance traveled, and we can take this exploration anywhere people convince me that it would be interesting to go.

So I wanted to show that home field advantage in baseball is attenuated by an attenuated travel cost. My hypothesis appears to be incorrect. That doesn’t mean that travel doesn’t cause the home field advantage in any sport, but the effect surely isn’t one that diminishes over a day of non-travel. That’s what I know so far. Your questions, comments and criticisms are invited.