Regression to the Mean Yet Again

So as I sit here at noon, the Braves are 42-21, which very fancy middle school math suggests is a winning percentage of 0.667, with 99 games to go. That’s 99 problems; is regression to the mean the 100th?

Suppose every team in baseball was exactly 50% likely to win every game. What would such a team look like after 63 games?

I did this two ways: the first way uses the analytic mathematical distribution, and the second one simulates 63 games 500,000 times. As you can see, the red line lines up with the blue bars, so we’re just going to be simulating from now on.

As you’d expect, a 50-50 team after 63 games is probably around 31 or 32 wins. The dotted line is the Braves’ actual total of 42 wins. That certainly suggests that we aren’t a 50-50 team unless we’ve been really lucky.
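For anyone who wants to check the 50-50 picture at home, here is a minimal Python sketch of both approaches: the exact binomial distribution and a brute-force simulation. The function names (`binom_pmf`, `simulate_wins`) and the 20,000-trial default are illustrative assumptions, not the code behind the chart, which used 500,000 trials.

```python
import math
import random


def binom_pmf(k, n, p):
    """Exact probability of exactly k wins in n games at win probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def simulate_wins(n, p, trials=20_000, seed=0):
    """Play n independent games `trials` times; return the list of win totals."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) for _ in range(trials)]


GAMES, BRAVES_WINS = 63, 42

# Analytic version: the chance a true coin-flip team has 42 or more wins.
exact_tail = sum(binom_pmf(k, GAMES, 0.5) for k in range(BRAVES_WINS, GAMES + 1))

# Simulated version of the same quantity.
sims = simulate_wins(GAMES, 0.5)
sim_tail = sum(w >= BRAVES_WINS for w in sims) / len(sims)

print(f"exact P(42+ wins):      {exact_tail:.4f}")
print(f"simulated P(42+ wins):  {sim_tail:.4f}")
print(f"simulated average wins: {sum(sims) / len(sims):.1f}")
```

Both tail probabilities should come out under 1%, which is the sense in which 42-21 is not what a coin-flip team usually looks like.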

The old adage has it: “It’s better to be lucky than good.” That’s actually true in a short playoff series, unless the skill differences are huge, but it is absolutely not the case during a 162-game season, because skill predominates over luck over a long period, unless the luck is huge.

Ok. Now let’s come at it the other way. Suppose every team’s currently observed winning percentage is their actual skill at winning games. Atlanta has won 66.7 percent of their games, so let’s fix that as their skill and look at how many wins they would have gotten this season so far.

If you don’t look at the x-axis, those graphs don’t look that different. But there’s an x-axis. Now 42 wins is pretty much what you’d expect, with the variance around that as luck. This is the sort of graph the Mets are counting on. They only have 27 wins, but maybe they’re just a really, really unlucky 67% team up to now. Let’s put the two graphs together in one chart:

It’s very clear that Atlanta looks a lot more like a 67% team than a 50% team. And 67% teams average 108 wins a year.
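One hedged way to put a number on "looks a lot more like" is to compare how likely an exact 42-21 start is under each assumption. This is a sketch of a simple likelihood comparison, not the method behind the chart, and the variable names are made up for illustration.

```python
import math


def binom_pmf(k, n, p):
    """Exact probability of exactly k wins in n games at win probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


GAMES, WINS, SEASON = 63, 42, 162

like_coin_flip = binom_pmf(WINS, GAMES, 0.5)
like_two_thirds = binom_pmf(WINS, GAMES, 2 / 3)
ratio = like_two_thirds / like_coin_flip

print(f"P(exactly 42-21 | 50% team): {like_coin_flip:.4f}")
print(f"P(exactly 42-21 | 67% team): {like_two_thirds:.4f}")
print(f"likelihood ratio: {ratio:.0f} to 1")
print(f"full-season average for a 2/3 team: {2 / 3 * SEASON:.0f} wins")
```

The ratio works out to something like 35 to 1 in favor of the 67% team, though that comparison only pits these two exact skill levels against each other and ignores everything in between.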

But now it’s time to think about the dreaded regression to the mean. All that means is that the yellow graph assumes that the Braves have had an average amount of luck up to now. That’s…unlikely. A moment’s reflection should tell you that when you look at the spreads around the means, a team with a great record has probably been somewhat lucky. 108-win teams are really rare, but teams that win 67% of their games over a 63-game stretch are much more common.

In fact, let’s just consider the last 25 seasons (2000-2025, ignoring 2020). In that time, with 30 teams and 25 seasons, exactly 3 teams have reached 108 wins or more: Seattle 2001, Boston 2018 and LA 2022. That’s 3/(25*30), or 1 out of every 250 teams. However, in their first 63 games, 19 teams have gotten 42 wins or more. That’s 1 out of every 39.5 teams. Why does this happen? Because a lot of those 42-21 (or better) teams were both good and lucky. Of course, a few of them might have been hugely good and unlucky, but we know there aren’t that many hugely good teams. Had there been, we would have seen a lot more 108+ win teams: roughly half of all teams with an underlying skill of 0.667 win 108 games or more.
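The arithmetic in that paragraph is easy to check. Here is a small sketch that takes the counts quoted above (3 teams and 19 teams, out of 25 seasons of 30 teams) as given and computes the share of true 0.667 teams that would reach 108 wins. The helper name `binom_tail` is an assumption for illustration.

```python
import math


def binom_tail(k_min, n, p):
    """Probability of at least k_min wins in n independent games."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1)
    )


TEAM_SEASONS = 25 * 30   # 25 seasons, 30 teams
TEAMS_108_PLUS = 3       # count quoted in the text
TEAMS_42_IN_63 = 19      # count quoted in the text

rate_108 = TEAM_SEASONS / TEAMS_108_PLUS
rate_hot_start = TEAM_SEASONS / TEAMS_42_IN_63
share_reaching_108 = binom_tail(108, 162, 2 / 3)

print(f"108+ win seasons: 1 in {rate_108:.0f}")
print(f"42+ wins in first 63 games: 1 in {rate_hot_start:.1f}")
print(f"share of true 2/3 teams reaching 108 wins: {share_reaching_108:.2f}")
```

If hot starts were all genuine 0.667 teams, about half of those 19 would have finished at 108 or more, not 3.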

The same thing happens at the bottom of the distribution (looking at you, Mets). Very few teams are as bad over a whole season as the Mets’ 0.435. That’s because most teams with a record like that at this point were both bad and unlucky. So regression to the mean is nothing more than the observation that teams with good records probably aren’t as good as they look, and teams with bad records aren’t as bad as they look. All teams are expected to regress in their remaining games towards the mean, which is 0.500.

So now let’s look at the last 99 games. If the Braves were really a 66.7% team, they’d be expected to win 66 of them, just another way of arriving at 108 wins. But we need to make an adjustment to reflect the fact that they’ve probably been really lucky. As it happens, the way you make that adjustment (I discussed this earlier in the season) is to add 74 games of 0.500 ball to their record to estimate their real skill: 42 + 37 wins in 63 + 74 games. This moves them from a 66.7% team to a 57.7% team in expectation. They have about a 50-50 chance over the rest of the season of finishing better than that (more than 99 wins) or worse than that (fewer than 99 wins). This is a reasonably precise estimate of just how much information is currently embodied in their 42-21 record.
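The adjustment is short enough to write down. Here is a minimal sketch, assuming the 74-game figure quoted above; the function name `regressed_skill` and its parameter names are made up for illustration.

```python
def regressed_skill(wins, games, prior_games=74, prior_pct=0.5):
    """Blend the observed record with `prior_games` of league-average ball."""
    return (wins + prior_pct * prior_games) / (games + prior_games)


WINS, GAMES, SEASON = 42, 63, 162

skill = regressed_skill(WINS, GAMES)          # (42 + 37) / (63 + 74)
remaining = SEASON - GAMES                    # 99 games left
projected_total = WINS + skill * remaining    # banked wins plus expected wins

print(f"observed winning pct:  {WINS / GAMES:.3f}")
print(f"regressed skill:       {skill:.3f}")
print(f"projected season wins: {projected_total:.1f}")
```

Note that the 42 wins already in the bank are not regressed; only the remaining 99 games are projected at the lower rate.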

So what do the remaining 99 games look like? I’ve made a few changes to the simulation, which now estimates every team’s skill and then simulates the season forward 500,000 times to show the distribution of Atlanta wins in the next 99 games. Here’s the result:

Note that that’s an expected 56 wins, not the very gaudy 66 implied by 42-21. But look at the graph… 66 remaining wins is still a possibility… around a 1% possibility.
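For a rough feel of what such a simulation does, here is a stripped-down, one-team sketch. It assumes a single fixed win probability of 79/137 (the regressed 57.7% figure) for every remaining game and ignores opponents entirely, so it is not the model behind the chart and its numbers will not match it exactly. The function name and the 20,000-trial default are illustrative.

```python
import random


def simulate_remaining(p, games=99, trials=20_000, seed=0):
    """Simulate `games` independent games `trials` times; return win totals."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(games)) for _ in range(trials)]


P_REGRESSED = 79 / 137  # 42-21 plus 74 games of 0.500 ball

results = simulate_remaining(P_REGRESSED)
mean_wins = sum(results) / len(results)
share_exactly_66 = sum(w == 66 for w in results) / len(results)
share_66_plus = sum(w >= 66 for w in results) / len(results)

print(f"average wins in the last 99 games: {mean_wins:.1f}")
print(f"share of seasons with exactly 66:  {share_exactly_66:.3f}")
print(f"share of seasons with 66 or more:  {share_66_plus:.3f}")
```

Because this version holds skill fixed and skips the other 29 teams, treat its output as a ballpark check on the shape of the distribution only.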

So what are the probabilities of winning the East at this point? That answer will come tomorrow. That’s enough math for one day and we can turn to baseball.

The All-Star Game

[RANT]

By now everyone knows how indifferent I am to the concept of the All-Star game and how much I loathe the exhortations to vote for your favorites and waste their mid-season rest period. In 2023 the Braves set the record for All-Stars, and when they won the World Series that year, it proved worthwh… Oh, what? They tuckered out in the playoffs? How could that be?

Now in the ’70s and mid-’80s when we were terrible, one could take some interest in which of your players either earned an appearance or was gifted one as a consolation prize for not having any good players. And of course before interleague play there was some genuine interest in watching players who didn’t play each other play. And when people couldn’t see stars from other teams very often I guess there was some argument for the game. But the game serves no point now. Vote for All-Stars and then give them each $50,000. Hold the Home Run Derby if you like… maybe a Futures Game of Minor League up-and-comers…. Maybe do like Pro Football (which has substituted flag football for football) and have them compete in an MLB: The Show competition with everyone in his own living room. I’m trying to be constructive here, but I assume my unalloyed contempt manages to sneak through.

So the voting season is upon us. It won’t do any good to vote for Phillies, since they won’t have to travel to the game anyway. Vote for Dodgers. All of them.

[/RANT]

The Game

It was Hawaiian Beach night at the ballpark, and most of the fans got lei’d.

Coming into tonight, Pittsburgh had scored only four fewer runs than the Braves, but they’d given up 68 more. This makes them a better version of the Nationals. The reason they’re so much better than the Nationals is that they have starting pitchers like Paul Skenes, who won’t face the Braves in this series. Tonight’s lanzador is the longest-suffering current Pirate, Mitch Keller. Keller is a journeyman who hasn’t journeyed past the banks of the Allegheny. He’s been a Pirate for 8 years, and he’s never played for a 0.500 team. Two of those teams lost over 100 games and two others lost over 90 games. But they’re over 0.500 now and the Pirates have a reasonable chance of actually winning more than they lose this year.

The Braves took the lead on a two-out single from Ronald Acuña Jr. in the 2nd, but the Pirates reversed things on a sac fly, an infield single and a bloop single from Marcell Ozuna off Martin Perez in the 3rd to make it 3-1. The Pirates have scored a lot of runs despite Ozuna’s batting average south of the Mendoza line. (Swapping Ozuna for Dom Smith has proved a major upgrade.) Still, it was good to see him again and he got a nice ovation from the assembled multitudes.

Mauricio Dubón’s third homer in 3 days tied it up at 3 in the bottom of the inning. He came back up in the fifth and gave the Braves the lead with a single misplayed into a double. Dom Smith brought home another run with a sac fly. Austin Riley then doubled to bring the lead to 3. Keller had 99 problems… umm…. pitches and left having given up 6 runs.

Didier Fuentes took over in the 6th and threw a nine-pitch inning. There is presumably some theory why he was not brought back out in the 7th, but I’m not sure I know what it is. Instead, Walt Weiss followed the “we’re ahead” playbook, going to Driggy (for those not following the comments, that’s Dylan-Robert-Iggy) for the last three innings. Lee used 12 pitches. Suarez needed 10 pitches. Iggy started walk/single (to Ozuna) but struck out Oneil Cruz and got Ryan O’Hearn to rap into a slickly fielded double play to end the game. 19 pitches. He’s clearly losing it.

Former Attorney General Braxton Ashcraft against Former King of Arnor and Gondor Spencer Strider tomorrow at the 4:15 thunderstorm hour.