log5 is a method of win estimation that takes two teams’ (A and B) win probabilities against .500 teams and returns the probability of A beating B. Those inputs can be estimated in many ways; the one I have seen used is Pythagorean Record, but I’m sure something else would work. The formula is built on odds ratios.
A note: I am going to be quite inconsistent with verbiage in this article, but when I say the words “true talent” in relation to team winning percentages, I am almost certainly referring to a team’s true talent against a .500 team. This definition is somewhat circular (because “.500 team” is defined by their true talent against a .500 team) but it doesn’t really matter.
Formula
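Here are two equivalent forms of the formula, where $w(A)$ is the winning percentage of A in a large sample (where opponent quality averages out to .500), $w(B)$ is the same for team B, and $P(A > B)$ is the probability team A beats team B:

$$P(A > B) = \frac{w(A)\,(1 - w(B))}{w(A)\,(1 - w(B)) + w(B)\,(1 - w(A))}$$

$$P(A > B) = \frac{w(A) - w(A)\,w(B)}{w(A) + w(B) - 2\,w(A)\,w(B)}$$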
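As a quick illustration, here is a minimal Python sketch of the formula. The function names `log5` and `log5_odds` are my own, and the second function assumes both inputs are strictly between 0 and 1 so that the odds are defined.

```python
def log5(w_a: float, w_b: float) -> float:
    """Probability that A beats B, given each team's win rate against .500 opposition."""
    numerator = w_a * (1 - w_b)
    denominator = numerator + w_b * (1 - w_a)
    if denominator == 0:
        raise ValueError("undefined when both teams sit at the same extreme (0 or 1)")
    return numerator / denominator


def log5_odds(w_a: float, w_b: float) -> float:
    """Same quantity via odds ratios: odds(A beats B) = odds(A) / odds(B)."""
    odds = (w_a / (1 - w_a)) / (w_b / (1 - w_b))
    return odds / (1 + odds)


print(round(log5(0.6, 0.4), 4))       # 0.6923
print(round(log5_odds(0.6, 0.4), 4))  # 0.6923
```

Both functions return the same number; the odds-ratio version just makes the "odds ratios" framing explicit.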
Derivation
There are two ways of deriving this formula: using general probability theory, and using the Bradley–Terry Model. For all I know, these two methods are equivalent, but they’re two different ways of looking at the formula.
To start, we can define some key terminology which exists in both methods:
- w(X): the winning percentage of team X in a sufficiently large (ideally infinite) sample where the True Talent of the teams they face equals a .500 winning percentage.
- P(X > Y): the probability of team X winning against another team Y.
Another thing which is used in both methods (but in a different way) is the idea of a proxy matchup, where team A and team B play against an imaginary team C. This imaginary team C, in both methods, is a team with a True Talent of a .500 winning percentage.
Method 1: Probability Theory
This method uses the proxy series to its fullest extent in order to estimate P(A > B). I do not claim to have invented this method. While eventually I figured this method out on my own, my understanding of it was helped by this article from John Richards for SABR.
Essentially, this methodology is simply making this assumption: If A wins against C and B loses against C, A would win against B. Then, from there, if you have the probabilities P(A > C) and P(B > C), you can calculate the probability of A winning and B losing, which we can assume (with adjustment for only including the valid outcomes in the sample space) is equivalent to P(A > B). This is actually quite simple:
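Assuming (some limitations to this assumption are provided below in the limitations section) we have a $w(A)$ and $w(B)$ measured over a sufficiently large sample and against average quality of competition, $P(A > C) = w(A)$ and $P(B > C) = w(B)$, giving us

$$P(A > B) = \frac{w(A)\,(1 - w(B))}{w(A)\,(1 - w(B)) + (1 - w(A))\,w(B)}$$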
However, in this formula, we have made an assumption: that the probability of some Team X beating Team C (with a true talent of a .500 winning percent) is equal to the percent of games Team X wins in a sufficiently large sample against many teams who average out to having a True Talent of a .500 win percentage. This assumption largely works, but the limitations section describes some edge cases where it does not. Hopefully, merely intuitively, you can see why you need to be careful here and make sure that a group of teams averaging to being average produces the same result as a single average team.
Essentially, what this is doing is dividing our desired outcome (A wins and B loses, whose probabilities can simply be multiplied because the two events are independent) by the total sample space (either A wins and B loses, or A loses and B wins). If an outcome falls outside the sample space (e.g., both A and B win, or both lose), you can simply discard it, as if the teams were forced to play another round to break the tie. And that gives us our formula for log5!
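The proxy-series argument can be checked with a small simulation. This is only a sketch under the same assumption as the derivation, namely that each team beats the .500 proxy team C with probability equal to its own winning percentage; the function name, the sample size, and the input values 0.6 and 0.45 are illustrative choices of mine.

```python
import random


def simulate_proxy(w_a: float, w_b: float, rounds: int = 200_000, seed: int = 0) -> float:
    """Estimate P(A > B) by having A and B each play a .500 proxy team C."""
    rng = random.Random(seed)
    a_only = 0  # rounds where A beats C and B loses to C
    b_only = 0  # rounds where B beats C and A loses to C
    for _ in range(rounds):
        a_wins = rng.random() < w_a  # assumes P(A > C) = w(A)
        b_wins = rng.random() < w_b  # assumes P(B > C) = w(B)
        if a_wins and not b_wins:
            a_only += 1
        elif b_wins and not a_wins:
            b_only += 1
        # both win or both lose: a "tie", discarded as if the round were replayed
    return a_only / (a_only + b_only)


w_a, w_b = 0.6, 0.45  # illustrative inputs
closed_form = w_a * (1 - w_b) / (w_a * (1 - w_b) + (1 - w_a) * w_b)
print(simulate_proxy(w_a, w_b), closed_form)
```

With enough rounds, the simulated share of decisive rounds won by A should settle close to the closed-form value.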
The way to evaluate whether the assumption of the proxy series being equivalent to a head to head series is valid is relatively straightforward but tedious. If the assumption holds true, the formula will hold true and match real world results. And if the assumption does not hold true, the formula will not hold true. You can Google matrices of true win probabilities between teams in various brackets and compare the results to the formula (or read the article I linked above from SABR).
Method 2: Bradley-Terry Model
Another model that can be used to prove the log5 formula is the Bradley-Terry Model. The Bradley-Terry Model says this:

$$P(A > B) = \frac{s_A}{s_A + s_B}$$

Where $s_A$ is the “skill” of team A and $s_B$ is the skill of team B. In order to use this formula, we need to complete the following steps:
- Determine the skill of team A given their large sample winning percentage
- Determine the skill of team B given their large sample winning percentage
- Finally, put these two skills into the Bradley-Terry formula to determine $P(A > B)$
In order to complete the first two steps, we can actually use the Bradley-Terry formula itself. First, we formulate an imaginary team, C, with $w(C) = 0.5$. That team has a skill of $s_C$, an arbitrary value which is the baseline skill for the league. Next we have to make the same simplifying assumption as in method 1: that $P(A > C) = w(A)$. From this assumption, we get the following equation through the Bradley-Terry Model formula for teams A and C:

$$w(A) = \frac{s_A}{s_A + s_C}$$

Rearranging for $s_A$, our unknown value that we care about:

$$s_A = \frac{w(A)\,s_C}{1 - w(A)}$$

A similar formula can be found for $s_B$, which solves step 2. Finally, putting everything into the Bradley-Terry Model formula a second time:

$$P(A > B) = \frac{\dfrac{w(A)\,s_C}{1 - w(A)}}{\dfrac{w(A)\,s_C}{1 - w(A)} + \dfrac{w(B)\,s_C}{1 - w(B)}}$$

Then, after canceling the arbitrary constant $s_C$ and multiplying through by $(1 - w(A))(1 - w(B))$, the following formula is found:

$$P(A > B) = \frac{w(A)\,(1 - w(B))}{w(A)\,(1 - w(B)) + w(B)\,(1 - w(A))}$$
This formula is equivalent to the formula found in the first derivation method.
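The three steps of the Bradley-Terry route can be sketched numerically. The names `skill_from_win_pct`, `bradley_terry`, and `baseline` (the skill of the imaginary team C) are hypothetical labels of mine, and the inputs are illustrative; the point is only that the baseline cancels out.

```python
def skill_from_win_pct(w: float, baseline: float = 1.0) -> float:
    """Invert w = s / (s + baseline) to recover the skill s; requires 0 <= w < 1."""
    return w * baseline / (1 - w)


def bradley_terry(s_a: float, s_b: float) -> float:
    """Bradley-Terry probability that the team with skill s_a beats the one with s_b."""
    return s_a / (s_a + s_b)


def log5(w_a: float, w_b: float) -> float:
    return w_a * (1 - w_b) / (w_a * (1 - w_b) + w_b * (1 - w_a))


w_a, w_b = 0.6, 0.45  # illustrative inputs
for baseline in (1.0, 37.5):  # any positive baseline skill gives the same answer
    s_a = skill_from_win_pct(w_a, baseline)
    s_b = skill_from_win_pct(w_b, baseline)
    print(baseline, bradley_terry(s_a, s_b), log5(w_a, w_b))
```

Each row should show the same probability in both columns, whichever baseline is used.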
Limitations and assumptions
Due to the difference between performance against a sample of teams who average to .500 True Talent and performance against a .500 team (as discussed in Pythagorean Record and also by Phil Birnbaum), this formula may not work perfectly. The specific example discussed by Birnbaum is a game in which winning is deterministic based on competitor height: the taller competitor wins. Assuming an infinite sample size of matches played, a player with a .600 win percentage will beat a player with a .400 win percentage 100% of the time, whereas log5 would predict only about 69%. So, this formula relies on randomness in the outcome of each game in order for the simplifying assumption that $P(X > C) = w(X)$ to work. That is, playing against a .500 team must be roughly equivalent to playing against a bunch of teams averaging to .500.
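The height game is easy to reproduce. In this sketch the population of heights is made up (a seeded normal sample in centimetres, purely for illustration), and the two players are taken at the 60th and 40th percentiles so that their winning percentages against the field come out at roughly .600 and .400.

```python
import random

rng = random.Random(0)
# Hypothetical population of competitor heights in cm (values are illustrative).
heights = sorted(rng.gauss(175, 10) for _ in range(10_001))


def win_pct(height: float) -> float:
    """Share of the other competitors this height beats (the taller player always wins)."""
    return sum(height > other for other in heights) / (len(heights) - 1)


def log5(w_a: float, w_b: float) -> float:
    return w_a * (1 - w_b) / (w_a * (1 - w_b) + w_b * (1 - w_a))


tall, short = heights[6000], heights[4000]  # 60th and 40th percentile players
w_tall, w_short = win_pct(tall), win_pct(short)
actual = 1.0 if tall > short else 0.0  # deterministic head-to-head result
print(w_tall, w_short, actual, round(log5(w_tall, w_short), 3))
```

The head-to-head result is certain, while the log5 estimate built from the two winning percentages stays well short of 1.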
A similar example where the formula breaks down, even in a sport like baseball where the large-sample assumption generally works, is when teams are close to the extreme winning percentages (near 1 or 0). If Team A plays an equal number of games against two teams, B and C, whose true talents against a .500 team average to .500, you might assume that the percentage of games Team A wins in this sample will be indicative of its true talent against a .500 team. However, this is not the case.
In addition to that, it underestimates the win probability of a team that is much better than its opponent. TangoTiger performed a similar (and more rigorous) analysis which had similar conclusions.
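The two-opponent scenario can be illustrated with a few lines of Python. This sketch assumes log5 itself describes each head-to-head matchup, which is an assumption of mine for the sake of the example; the value of 0.6 for Team A and the three opponent pairs are likewise illustrative.

```python
def log5(w_a: float, w_b: float) -> float:
    return w_a * (1 - w_b) / (w_a * (1 - w_b) + w_b * (1 - w_a))


w_a = 0.6  # Team A's true talent against a .500 team (illustrative)

# Each pair of opponents averages to .500, but with a growing spread.
for w_b, w_c in [(0.5, 0.5), (0.6, 0.4), (0.9, 0.1)]:
    observed = (log5(w_a, w_b) + log5(w_a, w_c)) / 2  # equal games against B and C
    print(f"opponents {w_b:.1f}/{w_c:.1f}: A wins {observed:.4f} of its games")
```

Under these assumptions, Team A's observed winning percentage matches .600 only when both opponents are exactly .500; it drops to about .596 for the .600/.400 pair and to about .537 for the .900/.100 pair.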
Extensions
There is an extension of the Log5 formula (which I have not yet fully read through) which attempts to extend Log5 to batter–pitcher matchups: https://sabr.org/journal/article/matchup-probabilities-in-major-league-baseball/
This formula can also be extended to account for home field advantage.
Furthermore, it can be modified to account for the fact that better teams face a slightly lower quality of competition on average (because they’re hogging all the wins) and vice versa.
Verifying the results of log5
In addition to an empirical verification of the results of log5, we can actually figure out whether the formula works in theory. Tom Tango estimated the standard deviation of team winning talent in baseball (though of course this is from 2006); call it $\sigma$. From this, we can build a normal distribution for the true talent level of teams in MLB: $w \sim \mathcal{N}(0.5, \sigma^2)$, with density $f(x)$. We can also create a function $g(x)$ which finds the probability that team A beats a team B with a winning percentage of $x$. Finally, if we hold $w(A)$ constant and calculate the integral $\int_0^1 f(x)\,g(x)\,dx$, that would be the predicted winning percentage of team A by the log5 formula, which should land close to $w(A)$ itself if the formula is self-consistent. For example, for the value of $w(A)$ I tested, the integral outputs an expected win percentage of 0.797.
As it turns out, Phil Birnbaum has found that the formula gets more accurate as the spread of talent in a league gets lower. Baseball has a small spread of team winning talent, which is why this formula works incredibly well with baseball.
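The integral described above can be approximated without any special libraries. This is a rough sketch, not the original calculation: the talent standard deviation of 0.06 and the fixed value of 0.8 for team A are assumptions of mine, as are the function names and the choice of a midpoint rule on (0, 1).

```python
import math


def log5(w_a: float, w_b: float) -> float:
    return w_a * (1 - w_b) / (w_a * (1 - w_b) + w_b * (1 - w_a))


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def expected_win_pct(w_a: float, sigma: float, steps: int = 10_000) -> float:
    """Midpoint-rule average of log5(w_a, x) over opponent talent x ~ N(0.5, sigma^2)."""
    dx = 1.0 / steps
    total = 0.0
    weight = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx  # midpoints avoid the endpoints 0 and 1
        density = normal_pdf(x, 0.5, sigma)
        total += density * log5(w_a, x) * dx
        weight += density * dx
    return total / weight  # renormalise for the tiny mass outside (0, 1)


# Assumed inputs (not taken from the text): sigma = 0.06, w(A) = 0.8.
print(round(expected_win_pct(0.8, 0.06), 3))
```

With these assumed inputs the result comes out near 0.797, slightly below the 0.8 that was fed in. Widening `sigma` in the sketch makes that gap grow, which is consistent with the formula being more accurate when the spread of talent is small.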