Time for predictions: Math Edition

Let the crapshoot begin

So those of you who have nothing better to do than remember everything I’ve written here will of course recall my periodic forays into Bradley-Terry Ratings for baseball.  It’s time to use math to predict the playoffs.  Why not?  It can’t be worse than anyone else’s predictions, can it?  It may not be better, but no one will ever be able to prove that it’s worse.

A brief recap: to make Bradley-Terry Ratings, you need a dataset in which everyone has played everyone else a number of times. You then create a single number to represent the strength of each team, chosen so that you come closest to predicting that team’s actual record in expectation over the season, when every game between two teams is of the form:

Pr(Road Team Win) = RoadRating/(HomeRating x HomeFieldAdvantageFactor + RoadRating)
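As a rough illustration of the formula above, here is a minimal Python sketch. The function names, the made-up three-team schedule, and the default home-field factor of 1.1 are all assumptions for the example, not the code or data behind the ratings in this post. The fit uses a standard iterative (minorization-maximization) update for Bradley-Terry models; at its fixed point, each team's expected wins equal its actual wins.

```python
from collections import defaultdict


def road_win_prob(road_rating, home_rating, hfa=1.1):
    """Pr(road team wins) = Road / (Home * HFA + Road)."""
    return road_rating / (home_rating * hfa + road_rating)


def fit_ratings(games, hfa=1.1, iterations=1000):
    """Fit one rating per team from (road_team, home_team, road_won) tuples.

    Assumes every team has at least one win and one loss.
    """
    teams = sorted({team for road, home, _ in games for team in (road, home)})
    wins = defaultdict(int)
    for road, home, road_won in games:
        wins[road if road_won else home] += 1

    rating = {team: 1.0 for team in teams}
    for _ in range(iterations):
        denom = defaultdict(float)
        for road, home, _ in games:
            total = hfa * rating[home] + rating[road]
            denom[home] += hfa / total
            denom[road] += 1.0 / total
        updated = {team: wins[team] / denom[team] for team in teams}
        # Ratings only matter as ratios, so rescale to an average of 1.
        mean = sum(updated.values()) / len(updated)
        rating = {team: value / mean for team, value in updated.items()}
    return rating


def expected_wins(games, rating, hfa=1.1):
    """Sum each team's win probability over every game it played."""
    totals = defaultdict(float)
    for road, home, _ in games:
        p = road_win_prob(rating[road], rating[home], hfa)
        totals[road] += p
        totals[home] += 1.0 - p
    return dict(totals)


def make_games(road, home, n_games, road_wins):
    """Hypothetical helper: n_games at one park, road team wins road_wins."""
    return [(road, home, i < road_wins) for i in range(n_games)]


# Made-up schedule: A finishes 28-12, B 18-22, C 14-26.
toy_games = (
    make_games("A", "B", 10, 6) + make_games("B", "A", 10, 3)
    + make_games("A", "C", 10, 7) + make_games("C", "A", 10, 2)
    + make_games("B", "C", 10, 5) + make_games("C", "B", 10, 4)
)
toy_ratings = fit_ratings(toy_games)
toy_expected = expected_wins(toy_games, toy_ratings)
```

Comparing `toy_expected` with the actual win totals is a quick sanity check that the fit has settled.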

Let’s start with the rankings for the playoff teams.

There are a number of things you have to decide to make Bradley-Terry ratings.  The first is the period over which they are calculated.  The standard method is to use that season’s games.  Under that criterion, here are the ratings:

Team | BT Rating | Wins

But of course using the whole season’s games can be misleading.  Teams can change dramatically over the course of the year, as a certain team which retooled at the deadline last year might remind you.  What if we just ranked teams based on July-October?  That would yield the following rankings:

Team | BT Rating | Wins

You have to drop all the way down to 16th place to get all 12 playoff teams.  And the Braves rise to 2nd. (This proves that the whole season matters. Without those first two months, the Yankees would be fishing now.)

One more entirely biased method is to go from June 1 to the end of the year.  That yields this:

Team | BT Rating | Wins

Ahh… that’s more like it.  Now the right team is the best team.

There’s an important lesson here even before we begin playoff simulations.  Anyone who claims to be able to rank teams must have in mind some definition of what the team is and what timeframe they are measuring over.  None of that is a guarantee of performance in the playoffs, but you won’t get any better predictions out than the strength of the team you assumed going in.

So we can start with the method that is most pessimistic about the Braves: the full year estimate, which is standard. I’ll go through the playoffs in some detail, and then circle back to show the change in probabilities from only using post-May data.

The next step is to assign a home field advantage.  I am really skeptical of trying to measure a playoff home field advantage, so I’m just going to take the sort of numbers I’ve seen in the literature and use a 1.1 multiplier.  This really only has a big effect in the Wild Card round, where the better team gets up to three home games.  The following table gives the Wild Card round estimates:

Wild Card Round

For each team, this gives their probabilities of winning a game in the series and their chances of winning two out of three. So the Padres, for example, have a 42.9% chance of winning a game at Citi Field, which gives them a 36.1% chance of winning two out of three. The last two columns are just the complement of columns (2) and (3). Note, by the way, how little a two-out-of-three format changes the probabilities. That’s the basis for the insight that if you want to upset someone, your best chance is a short series.
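The step from a per-game probability to a series probability can be sketched in a few lines of Python. This is only an illustration: the function name and the 40% per-game figure are made up for the example, and games are assumed to be independent. Passing a list of per-game probabilities lets the home park change from game to game, as it does in the longer series.

```python
def series_win_prob(game_probs):
    """Chance of winning a majority of the scheduled games.

    game_probs holds one team's win probability for each scheduled game.
    Playing out every game does not change who gets to a majority first,
    so it is enough to track the distribution of total wins.
    """
    needed = len(game_probs) // 2 + 1
    dist = [1.0]  # dist[k] = probability of exactly k wins so far
    for p in game_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        dist = nxt
    return sum(dist[needed:])


underdog = 0.40  # made-up per-game probability for the weaker team
best_of_3 = series_win_prob([underdog] * 3)  # about 0.352
best_of_7 = series_win_prob([underdog] * 7)  # about 0.290
```

With these made-up numbers the underdog's chances shrink as the series lengthens, which is the short-series point in miniature.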

Now we get to the LDS round.  The lack of reseeding under the new schedule makes this pretty easy. When we look at all the possibilities, we get this:

Game 1 Visitor | Game 1 Home | Matchup Probability | PR Vis | PR Home

I have sorted these by the probability that the matchup occurs. Since the Mets have the highest probability of winning their Wild Card Series, their matchup gets the first row. I list the Visitor in Game 1 and the Home Team in Game 1, the probability that matchup occurs, and then the probability that the Visiting Team (in Game 1) survives, followed by the complement, the probability that the other team survives. All the home teams are substantial favorites, which you’d expect.

Next comes the LCS.  Following the same structure as the previous table (though now we are playing 7-game series) we get the following:


The probabilities here require both teams to survive to this round, which is just the multiplication of each team’s chances of surviving the second round.

Finally, we come to the 36 possible World Series matchups, sorted by their probability of occurrence, and in the same format as the previous two tables:


One thing you can do with this table is calculate fair odds.  If I wanted to bet that Seattle beats the Padres in the World Series, I ought to get around 1,000:1 odds.
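For anyone who wants to move between probabilities and odds, here is a small Python sketch. The function names are made up for the example, and "X:1" is read as X-to-1 against.

```python
def fair_odds_against(probability):
    """Fair 'X:1 against' odds for an event with the given probability."""
    return (1.0 - probability) / probability


def probability_from_odds(odds_against):
    """Probability implied by fair 'X:1 against' odds."""
    return 1.0 / (odds_against + 1.0)


# Fair 1,000:1 odds correspond to roughly a one-in-a-thousand outcome.
one_in_a_thousand = probability_from_odds(1000)
```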

We can then cumulate by team across this table and get the full crapshoot probabilities.  The chances of winning the World Series are:
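The cumulation step can be sketched in Python as follows. Everything here is hypothetical: the team labels, the pennant probabilities, and the matchup win probabilities are placeholders, not figures from the tables in this post. The sketch assumes the two leagues' brackets are independent, so a matchup's probability is the product of the two pennant probabilities.

```python
def championship_probs(al_pennant, nl_pennant, ws_win_prob):
    """Sum each team's title chances over every World Series matchup.

    al_pennant / nl_pennant map team -> chance of reaching the Series.
    ws_win_prob(al_team, nl_team) is the chance the AL team wins it.
    """
    title = {}
    for al_team, p_al in al_pennant.items():
        for nl_team, p_nl in nl_pennant.items():
            matchup = p_al * p_nl  # chance this matchup occurs
            p = ws_win_prob(al_team, nl_team)
            title[al_team] = title.get(al_team, 0.0) + matchup * p
            title[nl_team] = title.get(nl_team, 0.0) + matchup * (1.0 - p)
    return title


# Placeholder inputs with two teams per league.
al = {"AL1": 0.6, "AL2": 0.4}
nl = {"NL1": 0.7, "NL2": 0.3}
series = {
    ("AL1", "NL1"): 0.5, ("AL1", "NL2"): 0.6,
    ("AL2", "NL1"): 0.4, ("AL2", "NL2"): 0.5,
}
titles = championship_probs(al, nl, lambda a, n: series[(a, n)])
```

As long as each league's pennant probabilities sum to one, the championship probabilities do too.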

Team | BT Rating | Championship Probability

How robust are these probabilities?  Well, suppose we use team rankings since June 1st.  The revised probabilities are:

Team | BT Rating | Championship Probability

That’s more like it, although the Braves are barely better than 4:1 odds. But it’s hard to refute these probabilities. They don’t say the Padres won’t win, only that it’s about 150-1 against. Current Vegas odds are around 30-1 on the Padres. That’s a terrible bet.

On the other hand, the current Dodger odds of about 3.5:1, while not exactly fair, are not strikingly unfair either. But the Braves’ odds of 6:1 are really good if you think the last four months are who the Braves are.
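One way to put a number on "terrible bet" and "really good" is the expected profit per dollar staked. The Python sketch below is illustrative only. It assumes "150-1 against" means a win probability of 1/151, and it rounds the Braves' "barely better than 4:1" to a probability of 1/5; the function name is made up.

```python
def expected_profit(win_probability, payout_odds, stake=1.0):
    """Expected profit on a bet paying payout_odds:1, per the given stake."""
    win = win_probability * payout_odds
    lose = 1.0 - win_probability
    return stake * (win - lose)


# Padres: model says about 150-1 against, the book pays about 30-1.
padres = expected_profit(1.0 / 151.0, 30.0)  # about -0.79 per dollar

# Braves: model says roughly 4:1 against, the book pays 6:1.
braves = expected_profit(1.0 / 5.0, 6.0)  # about +0.40 per dollar
```

Under those assumptions the Padres bet loses about 79 cents per dollar on average, while the Braves bet returns about 40 cents.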

Author: JonathanF

Alive since 1956. Braves fan since 1966. The first ten years were pretty much wasted. Exiled to Yankees/Mets territory in 1974 --- bearable only with TBS followed by MLB.TV.

81 thoughts on “Time for predictions: Math Edition”

  1. One problem with Math is that these teams are not the same even series-to-series. Are the Braves the team that swept the Mets or lost 2/3 to the Marlins? That answer may be pretty obvious but it does point to how lineups, starters, match-ups, home field, exhaustion, and other factors (even weather) make it even more of a crapshoot. Just so many variables.

  2. @1: Yep. The hope, as with every mathematical model in history, is that the stuff that you leave out cancels out on average. Every team has better than average and worse than average days… That’s why the “better” team doesn’t beat the “worse” team every time!
    Philosophically, picking a number to represent a team and using that number to derive a probability of winning, rather than a prediction of who will in fact win, is supposed to finesse the problem through averaging. Does it? Maybe!

  3. I did not realize that Atlanta broke .500 for the 1st time this year on April 8, going 2 and 1. That was obviously much quicker than last year. However, we were never over .500 again until June 5.

  4. JonathanF, thank you for this!

    In terms of an order of magnitude, how much would the Braves’ probability of winning be altered if Spencer Strider is/is not available for the LDS/LCS/WS? On the one hand, he’s one of 25; on the other hand, he’s one of the best pitchers in baseball, and his contributions are a major component of the Braves’ overall probabilities.

    Similarly, how much are the Mets’ probabilities altered by the possible unavailability of Starling Marte?

    I’m generally guessing that the assumption that “the stuff that you leave out cancels out on average” no longer holds when you have a late-season injury to a 5-win player. Or, is all of this so fiddly that the additional noise affecting the precision vastly outweighs the slight potential accuracy boost?

  5. Hey crew! Just letting you all know that I’ve been working behind the scenes with a few people to get some much needed updating going on. Sometime before the first Braves playoff game, I’ll be introducing a new writer. If you haven’t noticed, we haven’t had any females around in a while and that is about to change.

    Also, the updating of the site is going to be costly and will likely happen early in the offseason. I’m not asking for charity, but if there are any of you that would be willing to help with the cost of updating, I wouldn’t turn it away. We are planning big things, but nothing that would take the soul of this place away.

    If you’d like to help, reach out to me at cothrjr at gmail dot com

    A little from everyone would go a long way toward the goal.

  6. We can only pray that CB Bucknor finds a retirement home a few houses away from Gibson. Let’s hope he joins him really soon

  7. @5: Strider’s a good example. There’s nothing that stops anyone from messing around with the rating, adjusting the BT ratings any way they wish and deriving different conclusions. But at some point, you’ve made so many adjustments that what’s left is your opinion, and whatever objectivity is in the method has melted away. That’s the point I was trying to make with the three different starting points for the data. There’s no theory that tells you one is better than another. To the extent that the Braves are the best team in baseball only when they have Strider as a starter (his first start wasn’t until May 30th, which is also roughly when MHII showed up and just when the team won 14 in a row), then you have to decide what to do with the rating if he isn’t around as a starter. Using the full year gives non-Striderosity some weight.
    I have been a statistician for a long time. One of the tricks of the game is convincing your interlocutors that you’re just the messenger… the data has spoken, not you. And you may have tried to be as objective as possible. But the results aren’t just data… they’re data plus choices. We now take one step back and try to convince the skeptical interlocutor that our choices are reasonable, or objective, or at least not simply disguising an opinion under the cloak of a blizzard of numbers.

  8. Thanks, JonathanF!

    At least a couple of the left-out things – which I’m sure you’ve considered – may not always balance each other out:

    It doesn’t look like your team rankings take into account strength of schedule, only team records. This is most pronounced when looking at teams from different divisions (the Central & AL West teams played weaker schedules), but even teams in the same division don’t play exactly the same schedules. The Mets played the Yankees a few games as their traditional rival while the Braves played the Red Sox, and I think the Mets also got an extra game against Houston while the Braves got one against Oakland. For an extreme example of how this played out, in 2020 when Central teams only played Central teams, the seven Central teams that made the postseason, with a combined winning percentage of .550 compared to .581 for their postseason opponents, went 2-14 in the postseason, suggesting pretty strongly that they really weren’t as good as their collective record indicated.

    It seems like run differentials, separate from W-L records, would also have some predictive value in evaluating team quality. The Mets won two more games than the Yankees, but the Yankees had a +240 run differential compared to +166 for the Mets. That suggests to me that (if all else is the same) the Yankees might actually be the better team and perhaps should be favored if the unthinkable happens and we get a Subway Series.

    The model also doesn’t take into account the concentration of postseason innings into a team’s top pitchers, which, e.g., made the Nationals in 2019 a somewhat better bet than their record alone would’ve indicated. With fewer postseason off days this year than last because of the lockout-related compressed postseason schedule, this may not be as big a deal as it has been in other years.

    I’ll weaken my point by noting that, in comparing the Braves & Mets, these things do somewhat balance each other out. The Mets’ regular season schedule is slightly stronger, but the Braves’ run differential is better, and, while the Mets did concentrate innings in their best pitchers last weekend, it didn’t seem to help them much.

  9. @10, is CB Bucknor still as bad as he used to be (or used to be considered)? Seems like I’ve heard a lot more about Angel Hernandez the last couple of years and less about Bucknor, compared to hearing about both of them about equally a few years ago.

  10. @12: Actually, the BT method does include strength of schedule, since you are compared with the teams you played. When I did this last time, I used that fact to actually rate the divisions themselves and compared the teams’ ratings to what they would have been had they played a balanced schedule.

    If you look back at the comments to my first BT post back in 2013, you’ll see I had a discussion about using run differentials instead of wins. I’m hoping to do another analysis using run distributions scored and run distributions allowed to characterize teams. But we have another method that already does that: Pythagorean ratings, which we know can diverge substantially from actual results. But if I get that second analysis done, I’ll bore Satchel P with that as well.

    Your last point, though, is very well taken. If 40-50 games a year are started by guys who would never see a playoff game, then that’s a fair example of something that’s an issue. However, the easy argument would be that every team gets better by focusing starts on their best pitchers, so the only relative change (which is the only change that matters) is those teams which have disproportionate value in their 4th and 5th starters. It’s not that that doesn’t happen, but it’s just one of those many things you hope average out.

    And while I have everyone but Satchel P’s attention, let me point out one other thing I forgot to mention in my comment to AAR. If you’re going to adjust the Braves for the presence/absence of Strider, you need to adjust LA for the presence/absence of Gonsolin. You need to fully Soto-ize the Padres, etc, etc. Partial ad hoc adjustments are almost always worse than systematic ones.

  11. @14, “CB Bucknor was the second worst umpire this year”


    @15, that totally makes sense! However, with respect to this — “You need to fully Soto-ize the Padres, etc, etc” — I understand that you are offering it as something that is incredibly difficult to implement. However, if you were going to do it, methodologically, how would you do it? By intuition, it feels like the right thing to do!

  12. I guess 6% Cherokee Helsley didn’t want to have to hear The Chop again next postseason round.

  13. Cards with a 9th-inning meltdown. Single, walk, walk, HBP, single through drawn-in infield, fielder’s choice runner safe at home, 3rd baseman ole-d a grounder, sac fly. Phils up 6-2 and still batting.

  14. Hopefully the Cards/Phils goes 3 games. Max Fried vs a 4th starter for Game 1 sounds good to me.

  15. @16: I really don’t know how you’d do this. Maybe you could draw a relationship between BT Ratings and aggregate WAR and then adjust the rating as known substantial WAR players come in and/or out of the lineup. But even as I write it that sounds really crude. One of the problems is that you’d now have to use the model risk from the BT-WAR relationship to further smear the expected probabilities. And if that makes little sense to you, I understand it, but am baffled as to how you’d do it.

  16. Look, neither the Cards nor the Phils are as good as we are, though of course they can win a short series. But superstition and familiarity make me want to see the Phillies in the NLDS, so I’m happy to see this result.

  17. Playoff series odds need to account for who’s actually going to pitch. I think it might take a lot more manual labor than I or anyone wants to do, but you basically should eliminate all game outcomes during the season pitched by the 4th, 5th, 6th and higher starters. Especially in the shorter series.

  18. I like to think that MLB has an unwritten “Three riots and you’re out” rule, in which case Holbrook is only one riot away.

  19. @26: I hear what you’re saying, but do you know (with evidence) that the fact that everyone leans heavily on the top starters actually matters… with teams good enough to make the playoffs playing each other? I can concoct examples where it happens (a team with five #3s that pounds the cover off the ball). But are there such teams? Have they failed because of poor top starter matchups (and not for some other reason)? I dunno…

  20. Scherzer has given up 4HRs and 7 runs. Wow. I guess the Mets waited until Oct to do their Mets-ing.

  21. I think Scherzer may be shot for another postseason. He may need to consider either retiring or adopting the end-of-career Roger Clemens schedule.

  22. One thing all those numbers do not account for are three players who do something they never did in their major league career.
    Or beating a team three games in a row when that team had the better starting pitcher in each game. And a great closer who had no effect.
    Those of a religious perspective would say two miracles. Do the Braves have a third?
    Knowledgeable people believe in these numbers. 333 3 hot starting pitchers. 3 hot relievers. 3 hot hitters. I think the Braves may have 3 of each. Maybe more. Many other teams also have the same. A miracle may be required. Simple as that.
    Loudon Wainwright was exceptional.

  23. I guess the NL results today show that it’s a crapshoot. Objectively I thought the Mets and Cards were pretty heavy favorites. I guess our own history should be enough to make me recognize that the better team loses quite often. I’m still not over the 96 team.

  24. Well, I certainly laughed at @22

    It would be nice for all games to go three, just for more drama on Sunday. If SD gets through, it’s definitely good for LA; I can’t see the Padres offering much resistance.

    Also, I really like this method of evaluating teams, JonathanF. The fewer adjustments the better, in my view, so a full year’s stats, or weight them by proximity to the post-season.

  25. I believe the Mets are barely good enough to make the playoffs if both Scherzer and de Grom are below average. They are in big trouble if those guys don’t turn it around.

  26. Scherzer says he’s healthy. If so, four homers and seven earned runs is kind of mind-blowing. He couldn’t deliver for the Dodgers last year either.

  27. @35: (Satchel/Ed/Thelonious): You are correct. That’s why it only produces probabilities, not predictions.

    PS: Loudon Wainwright is always outstanding.

  28. Jonathan F
    That is why they are a waste of my time. I deal with reality.
    But you and others enjoy them so continue.
    My predictions for today for the heck of it. Little time wasted.
    New York
    Tampa Bay
    Atlanta against Philadelphia or St.Louis
    So we will see if I wasted your time.

  29. Tampa Bay and Cleveland, two exceptional small-market organizations. Shame one has to lose. Two very good managers who are best friends.
    Outstanding pitching.

    Speaking of pitching, or the lack thereof. 16 teams had zero shutouts. Two had two. Complete games almost unheard of.
    Quality starts one third.
    Wright 21 wins 19 QS
    Morton 9 and 9
    Strider 10 and 11
    Fewer QS than wins
    The bullpen is more important than the starters. Chew on the sorry number. I spit it out.

  30. No Manfred Man in the playoffs??? CLE/TBR in 10th and no man on 2nd.

    P.S. No XBH in the entire game so far.

  31. Was good for 162 games, now no good. How MLB is that? Next year goofy will be the norm. They are going to play each game with the managers using the computer version of APBA.

  32. Sciambi and Glanville have done a very good job on the call, I think. I’m frankly astounded they’ve managed to get this far without saying “You can’t win if you don’t score.”

  33. Talent yes
    Baseball intelligence no
    I read an article yesterday in The New York Times or the WSJ where a college fired a professor because his students complained that the tests were too hard.
    Dumb down daily.

    15th. inning here we come.
    Toronto did score

  34. Oh man, just like Willie did it many, many years ago.
    Square pants does it.
    What a game.

  35. I have a friend who is a Toronto fan and I would not wish that loss on anyone, including the fielding misplay that cost a major injury and the tying three runs.

  36. Life is often cruel. Rays score one run in 24 innings.
    Seattle does the unbelievable.
    Diaz in the seventh. Buck learned.
    I admire the talent. I find the lack of baseball intelligence depressing. Horrible base running. Throwing to the wrong base. Missing the cutoff man. Failure to advance runners. Pitchers getting ahead of hitters then walking them while seeking the perfect third strike. Pitchers who can only throw one pitch effectively. Most hitters have not a clue. The best time for pitchers.
    I say pitchers not throwers.

  37. So Braves vs. Phillies NLDS.

    The lack of the dumb Manfred Man reminds me of… I think it was the 2008 World Series? Phillies were leading in the clinching game and it was pouring rain, and if they called it, they’d have won it all, and apparently Selig was going crazy about that happening, so they kept playing in said rain and immediately stopped the nanosecond the Rays tied it. MLB quickly changed the rule after that postseason so that games in the playoffs can’t be decided by rain the way regular season ones can be.

  38. No M M is the way the game should be played all season.
    MLB will continue to fuck up a great game. Other sports which have become more popular than baseball make few changes, generally for the better. Not baseball. They make moronic moves just to prove how stupid they are. Damn shame.

    Since I believe I was 3 of 4 I will go with Braves 4-1 against the Phillies. See you in Atlanta

  39. Not that I would ever want to disagree with you, but I think baseball changes the game far less than basketball, football or hockey…. not only that, but the changes they do make to the game are smaller, at least to me. That doesn’t mean that all those changes make the game better… not in any sport. But I think the changes in baseball feel bigger because the relative timelessness of baseball makes any change seem big.

  40. One would presume Wheeler and Nola are out until Games 3 and 4, unless they go on short rest, which doesn’t seem ideal if your goal is to go deep into October. Advantage Braves. We’re obviously the better team, but anything can turn in a best-of-five. I’ll take the Braves in 4.

  41. 61 — I think Wheeler would be on normal rest for Game 2 and Nola for Game 3. But, neither of them could make 2 starts unless it’s on short rest.

  42. Wheeler pitched Friday, so Tuesday would be 3 days rest? And Nola pitched on Saturday, so Wednesday would also be 3 days rest? Or do we call that 4 days?

    I think the ghost runner should be delayed until the 11th inning in the regular season, and maybe the 12th in the playoffs.

  43. No ghost runner in playoffs, period. They play all night in hockey, and they can play all night in baseball too. In the regular season, you could talk me into starting it in the 12th or 13th inning.

    I wholeheartedly agree with JF @60. For a long time, baseball made almost no changes whatsoever, even when they were needed. That makes the fact that they’re starting to pick around the edges a bit and make a few small changes seem like a cataclysm. It’s not. While it went the opposite way from what I wanted, the fact that they let the AL and NL have different rules re: the designated hitter for 20 or so years even after they made it so the leagues are no longer separate entities is patently absurd. The fact that they’ve just now started to come up with some minor solutions for the interminable length of games when that’s been a problem for a long time is also ridiculous.

  44. And for Game 3, Nola would be on 5-days’ rest… So, sweep them at home, then take at least one of 2 vs. their aces. Easy, right?

    I didn’t see the Cle/TB game yesterday, but I’m pretty sure I would’ve loved all that tension & desperation. I may be in the minority, but I’m in no hurry to get these games over. Post-season, extra-inning games (gimmick-free, of course) can be as riveting as anything in sports.

    These days, Tech beating anybody in football is a cause to whoop it up on The Flats. Good for them.

  45. I guess as long as the Dodgers are in the playoffs, we’ll never get a night game (unless we’re playing against them). Makes no sense to me to space the games out so much. Especially to put the defending champs’ first game at a time where virtually their entire fanbase will be at work for the entire game. Just start 2 in the late afternoon/early evening and the other 2 at the traditional primetime start. Most people are gonna watch their favorite team and then maybe some of the other games but who is trying to watch all 4 games in their entirety?

    PS now would be a great time to stop sucking at day games.

  46. It’s all about TV ratings, TV markets & big bucks. That’s it. Neither fairness nor respect enter into the equation.

    When LA & NYC have teams in the DS part of the tournament, those markets see the most-appealing start times. More eyeballs, more money. End of story.

    And… nice start, San Diego.

  47. The Mets are beclowning themselves. They may still win this game, but unless you have credible evidence of cheating, you should shut up.

    David Cone just called it “gamesmanship.” I have a shorter, more Anglo-Saxon name for it.

  48. Even the Mets radio crew called it a “vintage Buck Showalter” move. Just trying to get into Musgrove’s head.

    Now the crowd is yelling “cheater.”

    Get a grip.

  49. That was really weird and awkward. To his credit, Musgrove remained calm. Could you imagine how Scherzer would have reacted to that?

  50. The Mets are about to watch their season come to a richly-deserved end. After a full season of folks saying that they’re too good and they’re not gonna turn into a Mets-like farce…well, they’re the full Mets clown show now.

  51. Supposedly the spin rate on Musgrove’s curve is a lot higher than average, per the radio broadcast.

  52. @79

    I forgot about that from earlier in the game, but that’s a good point too. And there was a very sketchy beaning of Grisham earlier in the game, as well.

    Long story short, to paraphrase JF, the Mets are embarrassing themselves.

  53. Jonathan F
    Of course there are many more important sports than the big four. Not that Americans would know or appreciate. I say a sport that had two sets of rules for over 20 years is fucked up. Putting a runner on second base is fucked up. Bigger bases is fucked up. Forced to face three batters is fucked up.
    Forcing player positioning is fucked up. Soon pitchers will not be allowed to throw over 92 miles an hour. All in the effort to make a cerebral game a fantasy game.
    Some morons put catsup on steak. Those are the fans(?) the game wants.
    Disagree anytime you want, it is okay. Being incorrect is also okay.
    But I get the feeling you think you know all.

    See you at the game Tuesday and Wednesday.

    At least the Jets and Giants won.
