So this latest foray into the mathematics of baseball began when I started thinking about the new ABS challenge system. I have some insights about that which I’ll begin to discuss at some lull in the season (maybe when the Braves are up by 40 games in July), but when I started to lay out the strategic questions about when to challenge, I immediately ran into a problem that has bothered me for quite a while: what is the probable outcome when Batter A faces Pitcher B?

I think it’s incredibly odd that more effort isn’t put into this question; if you could answer it, you’d be able to predict the outcome of games much more accurately than I think people can now. In a game in which so much is about getting the matchups you want, you’d think more time would be put into determining how you know which matchups you want. We assemble all these statistics about batters’ abilities and all these statistics about pitchers’ abilities, but when Batter A faces Pitcher B, we fall back on two stories, neither of which is at all adequate: we either simply restate the pitcher’s and batter’s stats, or we discuss the actual statistics in the times they’ve faced each other. I don’t think there’s a perfect solution… but I have one I can prove is better than what we’re using now.

The Batter-Pitcher Matchup

When a batter faces a pitcher, there are many possible outcomes, but let’s start with something simple: the batter either walks or he doesn’t. We know each pitcher has an underlying propensity to walk batters: some pitchers walk very few batters and some walk a lot of guys. We also know each batter has a propensity to draw a walk: some guys walk a lot and some walk rarely. But if probability means anything, when Batter A faces Pitcher B, there can only be one probability that Batter A walks. If the pitcher walks one out of every 10 batters he faces and the batter walks once in every 15 plate appearances, the probability of a walk in the current matchup can’t be both 10% and 6.7%. And, as I’ll show, it doesn’t even have to lie between those two values.
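One way to see how a single matchup probability could land outside the two individual rates is the textbook odds-ratio method, a close relative of the log5 formula. The sketch below is only an illustration, not the model described in this post: the function names and the league walk rate of 8.5% are assumptions made for the example.

```python
def odds(p):
    """Convert a probability into odds (p against 1 - p)."""
    return p / (1.0 - p)


def matchup_probability(batter_rate, pitcher_rate, league_rate):
    """Combine a batter's rate and a pitcher's rate with the odds-ratio rule.

    Assumption: this is the textbook odds-ratio (log5-style) formula, used
    here only for illustration; it is not the fitted model from the post.
    """
    combined = odds(batter_rate) * odds(pitcher_rate) / odds(league_rate)
    return combined / (1.0 + combined)


# Hypothetical league-wide walk rate; an assumption, not a number from the post.
LEAGUE_WALK_RATE = 0.085

# The example from the text: pitcher walks 1 in 10, batter walks 1 in 15.
p_text = matchup_probability(1 / 15, 1 / 10, LEAGUE_WALK_RATE)

# Hypothetical pairing in which both players walk more than the league does.
p_both_high = matchup_probability(0.12, 0.10, LEAGUE_WALK_RATE)

print(f"text example:       {p_text:.3f}")
print(f"both above average: {p_both_high:.3f}")
```

Under these assumptions the text example lands between the two rates, but the second pairing comes out higher than both 12% and 10%, because each player pushes the walk chance up relative to an average opponent. The formula is also symmetric: swapping the batter and pitcher arguments gives the same number, which fits the idea that the matchup has only one probability.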

And of course what goes for walks goes for the other outcomes: strikeouts, singles, doubles, triples, homers, hit-by-pitch, and fielding outs. (There are several other possibilities which I won’t model separately, e.g. reached on error and catcher’s interference, both of which I will treat as fielding outs.) But let’s keep the focus on walks for the moment.

There are two ways people talk about matchups:

  • Citing the matchups head-to-head. Head-to-head matchups are actually the best possible evidence, but the fundamental problem is that there are almost never enough of them to have any real meaning. If told “Joe has faced Sam three times and has a double and a single,” do you take that as evidence that Joe is good at hitting Sam? It is evidence, but it’s not great evidence. Do you think Sam has no chance of walking Joe? That Joe has no chance of hitting a homer? Very few head-to-head matchups have more than a few dozen instances. You wouldn’t judge a hitter based on his first two dozen at-bats, so why would you judge the matchup on similarly skimpy data?
  • Citing each player’s statistics against everyone. Here we have the opposite problem. The aggregate statistics for a pitcher and batter give good evidence about each of them individually (if based on enough appearances) but they say almost nothing about the matchup. A batter’s average is an average across all the pitchers he has faced. What you’d like to do is adjust it for the quality of this particular pitcher. And you’d like to make the same kind of adjustment, for batter quality, to the pitcher’s statistics as well.
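To put a rough number on how skimpy a head-to-head sample is, here is a small sketch using the Wilson score interval for a binomial proportion. Joe’s two hits in three tries come from the example above; the larger samples are hypothetical and keep the same observed rate, so only the sample size changes. The function name is an assumption for illustration.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    center = (p_hat + z2 / (2.0 * trials)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / trials
                         + z2 / (4.0 * trials ** 2)) / denom
    return center - half, center + half


# (hits, at-bats): the first pair is Joe vs. Sam from the text; the other
# two are hypothetical samples with the same observed rate.
for hits, at_bats in [(2, 3), (16, 24), (400, 600)]:
    low, high = wilson_interval(hits, at_bats)
    print(f"{hits:>3} for {at_bats:>3}: plausible range {low:.3f} to {high:.3f}")
```

With three at-bats the plausible range covers most of the interval from 0 to 1, which is another way of saying the sample tells you very little. Even at two dozen at-bats the range is still wide.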

The observation that there is only one probability of some particular outcome in a matchup between Batter A and Pitcher B has within it a fundamental insight, though. The batter’s statistics, adjusted for pitcher quality, have to be the same as the pitcher’s statistics, adjusted for batter quality. And this is true no matter how many other things we adjust for: game situation, home/road, ball/strike count, etc.

So those who know of my fondness for Bradley-Terry models (that’s neither Milton Bradley nor Terry Mulholland, unfortunately) will be unsurprised that I have found yet another nail for this hammer. In short, Bradley-Terry models are like a fancy melding of all the observed outcomes of batters facing pitchers, not just the times Batter A has faced Pitcher B. There are a bunch of technical details that I’m not going to discuss here, but what we end up with, for every outcome, are two numbers, one for the batter and one for the pitcher, and a simple function that turns them into a probability. To make this simple, I have combined walks and hit-by-pitch as a single outcome, I have combined doubles and triples into one outcome, and I have a catchall category, outs, that includes reached on error and some stuff like catcher’s interference. I have adjusted for platoon advantages, but not for home-road, count, or any of the other stuff that matters.
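The post doesn’t spell out the “simple function,” so the following is only a guess at its general shape: a softmax over the sum of a batter score and a pitcher score for each outcome, which is one common way to extend a Bradley-Terry-style model to several outcomes. The outcome categories follow the groupings described above; every score, name, and the choice of softmax itself is an assumption made for illustration.

```python
import math

# Outcome categories as grouped in the text.
OUTCOMES = ["walk_or_hbp", "strikeout", "single",
            "double_or_triple", "home_run", "out"]


def matchup_probabilities(batter_scores, pitcher_scores):
    """Turn per-outcome batter and pitcher scores into probabilities.

    Assumption: a softmax over summed scores. The real app may use a
    different function; this only shows the general shape.
    """
    totals = [batter_scores[o] + pitcher_scores[o] for o in OUTCOMES]
    biggest = max(totals)  # subtract the max for numerical stability
    weights = [math.exp(t - biggest) for t in totals]
    norm = sum(weights)
    return {o: w / norm for o, w in zip(OUTCOMES, weights)}


# Made-up scores for a hypothetical batter and pitcher ("out" is the baseline).
example_batter = {"walk_or_hbp": -2.3, "strikeout": -1.4, "single": -1.8,
                  "double_or_triple": -3.0, "home_run": -3.2, "out": 0.0}
example_pitcher = {"walk_or_hbp": 0.1, "strikeout": 0.3, "single": -0.1,
                   "double_or_triple": 0.0, "home_run": -0.2, "out": 0.0}

probs = matchup_probabilities(example_batter, example_pitcher)
for outcome in OUTCOMES:
    print(f"{outcome:<17} {probs[outcome]:.3f}")
```

Whatever the actual function is, a setup like this guarantees the probabilities across outcomes add up to one, and it produces an answer for any batter-pitcher pair, including pairs that have never faced each other.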

Anyway, the model is done, but rather than produce a lot of stuff about the implications of the model, I have made an app which will allow you to play with it yourself, even during a game, to add to the numbers you’re bombarded with… because why not some more numbers? The program is at mlb-matchups.anvil.app and I’ve made it available as a link at the top of this page. (Press MATCHUPS… when you close the app, you return to Braves Journal.) Play with it if you are so inclined. Let me know if there’s anything interesting you’ve found. Note that the players are limited to those who appeared in the 2023-2025 seasons, so a rookie who debuts next month will not be available. But you can create matchups that have never happened. If you want to know how Spencer Strider could handle Austin Riley, that’s available.