Using correlation to find good baseball bets

Advantage gamblers are a lot like magicians, in that our livelihoods depend on our secrets staying secret. To celebrate the launch of The Hammer, I’m going to break the unwritten rule and show you something that I can personally vouch for as having been quite successful.

When evaluating a betting strategy, you generally want to have two pieces: a narrative that makes logical sense to explain why your bets would be good, and the hard data to back that narrative up. Narratives without data are intuitively appealing but almost always losers. Data without narratives results in over-fit nonsense like “the Yankees have won 17 of their last 19 home night games played on weekends under a full moon” with no predictive value at all. Narrative + data = potential.

We’re going to be dealing with parlays of money lines and totals for the same baseball game. Different books have different rules for these types of baseball parlays:

  • Some books do not allow them at all; 
  • Some books allow them but price them using their same game parlay product – STAY AWAY from these; 
  • Some books allow them and price them as regular parlays – these are the ones we want to find and use for this angle. They’re not in the majority but they do exist, including some of the largest American, Canadian and European books.

Our narrative is as follows:

When the visiting team wins, 9 full innings are played. When the home team wins, most of the time only 8.5 innings are played because there’s no bottom of the 9th. So there’s more baseball played, and hence, more opportunity for runs to be scored when the visiting team wins. Therefore, Visitor + Over and Home + Under are correlated parlays.

Before we continue, let’s take a brief time-out to explain what correlated parlays are and why we like them. The payout odds of a regular 2-leg parlay are calculated as follows:

Parlay odds (in decimal format) = Leg 1 odds (in decimal format) x Leg 2 odds (in decimal format)

This makes sense because, if the two legs are independent of each other, the win probability of the parlay follows a similar equation:

Parlay win probability (if legs are independent) = Leg 1 win probability x Leg 2 win probability

But what if the two legs aren’t independent of each other? What if they’re correlated, meaning that one leg winning increases the likelihood of the other leg winning, like the Penguins beating the Senators and Sidney Crosby scoring more points than Daniel Alfredsson? Then,

Parlay win probability (if legs are correlated) > Leg 1 win probability x Leg 2 win probability

So, a correlated parlay has a win probability that is greater than the one that’s assumed by the payout odds you’re getting. It’s a bonus that boosts your win probability for free. Do we have a legitimate one here? Let’s go to the data and find out.
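The payout formula above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the two -110 legs are an assumed standard price, not a quote from any particular book.

```python
def american_to_decimal(american: int) -> float:
    """Convert American odds (e.g. -110, +120) to decimal odds."""
    if american > 0:
        return 1 + american / 100
    return 1 + 100 / abs(american)


def parlay_decimal_odds(*legs: int) -> float:
    """Regular parlay pricing: multiply the decimal odds of every leg."""
    odds = 1.0
    for leg in legs:
        odds *= american_to_decimal(leg)
    return odds


def independent_win_probability(*probs: float) -> float:
    """Parlay win probability if (and only if) the legs are independent."""
    result = 1.0
    for p in probs:
        result *= p
    return result


# Assumed example: two legs, each priced at -110.
payout = parlay_decimal_odds(-110, -110)
breakeven = 1 / payout  # win probability needed just to break even

print(f"Parlay payout (decimal): {payout:.4f}")
print(f"Break-even win probability: {breakeven:.4f}")
print(f"Two independent coin flips win: {independent_win_probability(0.5, 0.5):.4f}")
```

A correlated parlay is paid at the same `payout`, but wins more often than `independent_win_probability` suggests, which is where the extra value comes from.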

Our data set contains every Major League Baseball regular season game played from 2010 to 2021, a total of 27,625 games. (Baseball is great for analytics because there are so many games, it doesn’t take long to build up a large sample size!)

Visiting teams on the money line: 12,814 wins, 14,811 losses for a 46.39% win percentage.

Overs (excluding pushes): 12,947 wins, 13,315 losses for a 49.30% win percentage.

Now let’s check for correlation. If there’s independence, then the win percentage for the parlay of visiting teams AND overs should be 46.39% x 49.30% = 22.87%. A win percentage higher than 22.87% would indicate correlation.

Visitor + over parlay (excluding pushes): 6,268 wins, 19,994 losses for a 23.87% win percentage. 

We can calculate a correlation factor as the ratio 23.87% / 22.87% = 1.0437. This means the parlay wins 4.37% more often, in relative terms, than independence would predict, so the value provided by the correlation is worth a bonus of roughly 4.37% of our bet amount.
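For readers who want to reproduce the arithmetic, here is a minimal sketch using the win/loss counts quoted above. The function names are illustrative. Because it works from the raw counts rather than the rounded percentages, the last decimal place can differ slightly from a hand calculation.

```python
def win_pct(wins: int, losses: int) -> float:
    """Win percentage as a fraction between 0 and 1."""
    return wins / (wins + losses)


def correlation_factor(parlay_wins: int, parlay_losses: int,
                       leg1_pct: float, leg2_pct: float) -> float:
    """Observed parlay win rate divided by the rate expected under independence."""
    observed = win_pct(parlay_wins, parlay_losses)
    expected = leg1_pct * leg2_pct
    return observed / expected


# Counts quoted in the article (2010 to 2021 regular seasons).
visitor_pct = win_pct(12_814, 14_811)   # visiting teams on the money line
over_pct = win_pct(12_947, 13_315)      # overs, pushes excluded

factor = correlation_factor(6_268, 19_994, visitor_pct, over_pct)
print(f"Visitor + over correlation factor: {factor:.4f}")  # about 1.0437
```

The same two functions work for home + under by swapping in those counts.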

Home teams on the money line: 14,811 wins, 12,814 losses for a 53.61% win percentage.

Unders (excluding pushes): 13,315 wins, 12,947 losses for a 50.70% win percentage.

If there’s independence, then the win percentage for the parlay of home teams AND unders should be 53.61% x 50.70% = 27.18%. A win percentage higher than 27.18% would indicate correlation.

Home + under parlay (excluding pushes):  7,402 wins, 18,860 losses for a 28.19% win percentage.

Correlation factor = 28.19% / 27.18% = 1.0372. This means that the value provided by the correlation is worth a bonus of roughly 3.72% of our bet amount.
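One rough way to gauge whether lifts of this size could just be noise is a one-sample z-score: how many standard errors the observed parlay win rate sits above the rate expected under independence. This check is an addition for illustration, not something the article's analysis describes, and it simplifies by treating the independence benchmark as a fixed number rather than an estimate.

```python
import math


def lift_z_score(parlay_wins: int, parlay_losses: int, expected_pct: float) -> float:
    """Standard errors between the observed win rate and the independence benchmark.

    Simplifying assumption: expected_pct is treated as a known constant.
    """
    n = parlay_wins + parlay_losses
    observed = parlay_wins / n
    std_err = math.sqrt(expected_pct * (1 - expected_pct) / n)
    return (observed - expected_pct) / std_err


# Counts and benchmarks quoted in the article.
z_visitor_over = lift_z_score(6_268, 19_994, 0.2287)
z_home_under = lift_z_score(7_402, 18_860, 0.2718)

print(f"Visitor + over: {z_visitor_over:.2f} standard errors above independence")
print(f"Home + under:   {z_home_under:.2f} standard errors above independence")
```

Both values come out above 3, which is hard to explain by chance alone in a sample of more than 26,000 games.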

With these large sample sizes, we can conclude that we’ve found significant correlation in both visitor + over and home + under parlays. So does that mean we can bet them blindly and rake in the cash? Unfortunately not. To understand why not, we have to think about our starting point before the correlation bonus gets applied. The house’s theoretical hold on a standard -110 bet is 4.55%. On a parlay, that hold compounds with each leg – a 2 leg parlay holds 8.88%. If you blindly bet 2 leg parlays, your long-run expectation would be a loss of 8.88% of your bet amount. With the correlation bonus added on top of that, you’re losing less but you’re still losing.
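The hold figures above can be checked with a short sketch. The function names are illustrative, and the last step, multiplying the expected return by the correlation factor, is one simple way to model the "bonus" rather than a formula given in the article.

```python
def implied_probability(american: int) -> float:
    """Break-even probability implied by an American price."""
    if american > 0:
        return 100 / (american + 100)
    return abs(american) / (abs(american) + 100)


def market_hold(side_a: int, side_b: int) -> float:
    """Theoretical hold on a two-way market, from both sides' prices."""
    overround = implied_probability(side_a) + implied_probability(side_b)
    return 1 - 1 / overround


def parlay_hold(*leg_holds: float) -> float:
    """Hold compounds: the bettor keeps (1 - hold) of expected return on each leg."""
    keep = 1.0
    for hold in leg_holds:
        keep *= 1 - hold
    return 1 - keep


single = market_hold(-110, -110)
double = parlay_hold(single, single)

# Assumption: the correlation factor scales the expected return of the parlay.
net_visitor_over = (1 - double) * 1.0437 - 1

print(f"Hold on one -110 bet:      {single:.2%}")   # about 4.55%
print(f"Hold on a 2-leg parlay:    {double:.2%}")   # about 8.88%
print(f"Blind V+O expectation:     {net_visitor_over:.2%}")
```

Under that assumption the blind visitor + over parlay still has a negative expectation, which matches the conclusion above.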

To summarize where we are at this point, we’ve found a nice correlation but it’s not enough to overcome the house’s hold. To truly make this thing click, we need to enlarge the correlation or we need to shrink the hold…or both.

The correlation factors calculated above, 1.0437 for V+O and 1.0372 for H+U, are averages for the entire set of games. If we can segment the games by finding a rule that splits them into some that show higher correlation and some that show lower correlation, we can bet the higher ones and pass on the lower ones. 

Let’s look at it by the home team money line odds:

Home team money line odds    V + O Correlation Factor    H + U Correlation Factor
Lower than -200              0.9319                      0.9716
-200 to -151                 1.0091                      1.0056
-150 to -101                 1.0467                      1.0393
+100 to +149                 1.0787                      1.0942
Higher than +150             1.0673                      1.1259
Total                        1.0437                      1.0368
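A segmentation like the one in this table could be computed along the following lines. Everything here is a sketch: the `Game` record, the function names and the eight sample rows are invented purely to show the mechanics and are not the article's data. It is also an assumption that a line of exactly +150 belongs in the top bucket and that a -100/+100 pick'em belongs with +100 to +149.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Game(NamedTuple):
    home_ml: int                 # home team's money line, American odds
    home_won: bool
    went_under: Optional[bool]   # None when the total pushed


def home_line_bucket(home_ml: int) -> str:
    """Map a home money line to the bucket labels used in the table."""
    if home_ml < -200:
        return "Lower than -200"
    if home_ml <= -151:
        return "-200 to -151"
    if home_ml <= -101:
        return "-150 to -101"
    if home_ml < 150:            # assumption: includes a -100/+100 pick'em
        return "+100 to +149"
    return "Higher than +150"    # assumption: +150 itself lands here


def home_under_factor_by_bucket(games: list) -> dict:
    """Home + under correlation factor for each home money line bucket."""
    grouped = defaultdict(list)
    for game in games:
        grouped[home_line_bucket(game.home_ml)].append(game)

    factors = {}
    for bucket, rows in grouped.items():
        decided = [g for g in rows if g.went_under is not None]  # drop pushes
        if not decided:
            continue
        home_pct = sum(g.home_won for g in rows) / len(rows)
        under_pct = sum(g.went_under for g in decided) / len(decided)
        both_pct = sum(g.home_won and g.went_under for g in decided) / len(decided)
        expected = home_pct * under_pct
        if expected > 0:
            factors[bucket] = both_pct / expected
    return factors


# Invented toy rows, far too few to mean anything; real use needs thousands of games.
toy_games = [
    Game(120, True, True), Game(120, True, True),
    Game(130, False, False), Game(110, False, False),
    Game(-160, True, False), Game(-170, False, True),
    Game(-180, True, True), Game(-155, False, False),
]

for bucket, factor in home_under_factor_by_bucket(toy_games).items():
    print(f"{bucket:<18} {factor:.4f}")
```

Keeping the bucket boundaries in one small function makes it easy to try other cut points, though every extra slice shrinks the sample in each bucket.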

Now we’re getting somewhere. Games with home dogs have much stronger correlation, especially on H + U. This makes intuitive sense, because a game with the visitor favoured would have a much higher likelihood of a bottom of the 9th priced into its total, making the under that much better when there isn’t one. By narrowing our focus to H + U on games with a home dog, we’ve supercharged our correlation bonus.

Now let’s work on that pesky 8.88% hold. The easiest way to lower the hold is to use multiple books and shop for the best lines. It’s not quite that simple here because there’s only a limited selection of books that allow this parlay, so you usually won’t be able to find a hold that’s zero or negative…but thanks to the correlation bonus, you don’t have to! Even knocking that 8.88% down by two or three percentage points is helpful. 

For example, Pinnacle is a sharp book that charges reduced vig – instead of -110, they price their MLB money lines to -105 and their MLB totals to -106. This means that the hold on a money line + total parlay at Pinnacle is only 5.14% instead of the standard 8.88%. While Pinnacle itself doesn’t allow money line + total parlays on the same game, it’s not uncommon to find a situation like this:

Pinnacle: Yankees -130 @ Red Sox +120

Your Book: Yankees -140 @ Red Sox +120

Pinnacle: Total 8.5 Over -106 / Under -106

Your Book: Total 8.5 Over -114 / Under -106

By taking bets at your book that have the same price as Pinnacle, you can be confident that you’re paying 5.14% hold instead of 8.88%. Even if your prices are a cent or two worse than Pinnacle, you’ll end up with a hold around 6% that is comfortably overcome by the correlation bonus of 9.42% (from 1.0942 correlation factor in the above table) for H+U on home dogs between +100 and +149. You may even be able, with enough searching, to sometimes find prices at your book that are a cent or two better than Pinnacle on some of these!
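To make the example concrete, here is a hedged worked calculation for Red Sox +120 parlayed with Under 8.5 at -106, using the Pinnacle prices above as the reference market. The function names are illustrative. Two things are assumptions rather than facts from the article: that Pinnacle's prices with the vig removed approximate the true probabilities, and that the 1.0942 segment average applies to this particular game.

```python
def american_to_decimal(american: int) -> float:
    """Convert American odds to decimal odds."""
    if american > 0:
        return 1 + american / 100
    return 1 + 100 / abs(american)


def implied_probability(american: int) -> float:
    """Break-even probability implied by an American price."""
    return 1 / american_to_decimal(american)


def no_vig_probability(side: int, other_side: int) -> float:
    """Strip the vig by normalising the two implied probabilities to sum to 1."""
    a = implied_probability(side)
    b = implied_probability(other_side)
    return a / (a + b)


def market_hold(side_a: int, side_b: int) -> float:
    """Theoretical hold on a two-way market."""
    return 1 - 1 / (implied_probability(side_a) + implied_probability(side_b))


# Hold on a parlay of a -105/-105 money line and a -106/-106 total.
reduced_vig_hold = 1 - (1 - market_hold(-105, -105)) * (1 - market_hold(-106, -106))

# Reference probabilities from the Pinnacle prices in the example.
home_fair = no_vig_probability(120, -130)     # Red Sox
under_fair = no_vig_probability(-106, -106)   # Under 8.5

# Payout at "your book": Red Sox +120 parlayed with Under -106.
payout = american_to_decimal(120) * american_to_decimal(-106)

ev_independent = home_fair * under_fair * payout - 1
ev_correlated = home_fair * under_fair * 1.0942 * payout - 1  # assumed factor

print(f"Reduced-vig parlay hold: {reduced_vig_hold:.2%}")   # about 5.14%
print(f"EV if legs independent:  {ev_independent:.2%}")
print(f"EV with 1.0942 factor:   {ev_correlated:.2%}")
```

On these particular prices the parlay comes out a little under 5% negative before the correlation factor and roughly 4% positive after it, which is the whole point of combining line shopping with the home dog segment.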

One last note…the title of this article promises “good bets”, not “winning bets”. I define good bets as bets with positive long-term expected value (in shorthand, +EV). Bets with +EV are not guaranteed winners; we’ve seen from the data above that these parlays actually lose around ¾ of the time. +EV means that over a large number of bets, you are likely to have your dollars won exceed your dollars lost…but on any given day/week/month/year, randomness can be cruel and there are no guarantees. So please, bet wisely but also bet responsibly and don’t risk more than you can afford to lose.

Good luck…and hopefully we have a correlation to last throughout the years!