Ensemble Models for Predicting Binary Events
I've been following the development of prediction markets with considerable interest over the past few years. These are markets in which the opinions of many engaged experts are combined, the notion being that their combined opinion will be a better predictor of a future outcome than the opinion of any one of them. It's a notion that has proved right on many occasions.
One relatively simple circumstance in which it's possible to provide a mathematical proof of the 'wisdom of crowds' is where we have a number of judges, each making a binary prediction - for example, that team A wins or loses - and we combine those judges' individual votes to come up with a final prediction. The Condorcet Jury Theorem tells us that if each judge has a probability of being correct that is greater than 1/2 and if every judge's vote is statistically independent of every other judge's, then the more judges we have the greater is the probability that their combined vote will be correct. In fact, as the number of judges increases, the probability of their combined prediction being correct tends monotonically to 1.
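In symbols: if each of $n$ judges (with $n$ odd) is independently correct with probability $p > 1/2$, the majority vote is correct with probability

$$ P_n = \sum_{k=(n+1)/2}^{n} \binom{n}{k}\, p^k (1-p)^{n-k}, $$

which increases with $n$ and tends to 1 as $n \to \infty$.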
Now let's get practical and think about how we might apply this result to the challenge of predicting line winners.
Finding a large - preferably infinite - number of footy 'judges', each of whom is guaranteed to tip line winners at a rate in excess of 50%, is problematic. In practice, the real question is: how many judges do we need such that, when combined, their predictions are correct more than about 53% of the time, which is the success rate that we need to overcome the vig on the $1.90 market price in line betting?
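To be precise about that 53%: at a price of $1.90, a winning unit bet returns a profit of $0.90 and a losing one costs $1, so the required accuracy $r$ satisfies

$$ 0.90\,r - (1 - r) \ge 0 \quad\Longleftrightarrow\quad r \ge \frac{1}{1.90} \approx 52.6\%. $$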
To answer this, it turns out, we need to know two things about the judges:
- How accurate is each of them individually? (I'll assume they all have the same accuracy rate.)
- How often does any pair of them tip correctly on the same game? (I'll also assume that this figure is the same for all pairs of judges.)
Armed with those assumptions, it's not hard to simulate the combined performance of the ensemble of judges, which is what I've done to produce the following chart.
Each panel in the chart corresponds to a particular number of judges and a particular accuracy rate for each judge. So, for example, the top left panel, which is labelled (3, 0.510), is the chart for 3 judges, each of whom tips at a 51% accuracy rate. Within each panel, the rate at which any pair of tipsters co-predict winners runs from left to right (it's the axis labelled 'Both') and the overall accuracy of the ensemble runs from top to bottom (it's the axis labelled 'Rate'). Every point in each panel is based on 200,000 simulated predictions.
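For anyone who'd like to tinker, here's a minimal Python sketch of one way such a simulation could be set up. The Beta-mixture construction below is just one convenient way of inducing a given pairwise co-prediction rate between otherwise exchangeable judges - it's not necessarily how the chart above was produced - and the function name and parameters are illustrative only.

```python
import numpy as np

def majority_accuracy(n_judges, p, q, n_sims=200_000, rng=None):
    """Estimate the accuracy of the majority vote of exchangeable judges.

    Each judge is individually correct with probability p, and any pair of
    judges is simultaneously correct with probability q (the 'Both' rate).
    Correlation is induced via a Beta mixture: for each prediction a latent
    success probability theta ~ Beta(alpha, beta) is drawn, with alpha and
    beta chosen so that E[theta] = p and E[theta^2] = q; judges are then
    correct independently given theta. Requires p**2 < q < p (q = p**2
    corresponds to fully independent judges).
    """
    if rng is None:
        rng = np.random.default_rng()
    s = (p - q) / (q - p**2)            # alpha + beta, from moment matching
    alpha, beta = p * s, (1 - p) * s
    theta = rng.beta(alpha, beta, size=n_sims)       # one draw per prediction
    correct = rng.random((n_sims, n_judges)) < theta[:, None]
    majority = correct.sum(axis=1) > n_judges / 2    # odd n_judges avoids ties
    return majority.mean()
```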
In general, an ensemble is more accurate if:
- it contains more judges
- each judge is more accurate
- any pair of judges is less likely to co-predict a winner (statistically, this means that the less correlated are the predictions of any pair of judges, the better is the ensemble. It's the power of asset diversification in prediction form.)
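Using the sketch above, all three effects can be seen directly (the numbers printed are simulation estimates and, of course, depend on the assumed correlation structure):

```python
print(majority_accuracy(3,  0.51, 0.30))    # baseline: few, weak, correlated judges
print(majority_accuracy(11, 0.51, 0.30))    # more judges helps
print(majority_accuracy(11, 0.53, 0.30))    # ... as does greater individual accuracy
print(majority_accuracy(11, 0.53, 0.285))   # ... and a lower co-prediction rate
```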
It is possible to build a profitable ensemble of line-result predicting models with only 3 underlying models, but they'd each need to be correct 52.5% of the time or more, and the rate at which they co-predicted winners would need to be less than 35%.
If we could find as many as 11 models with the requisite characteristics, we could profit with an accuracy rate for each model of 'just' 52% and a winner co-prediction rate of less than 35%.
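Under the same (assumed) Beta-mixture model, those thresholds can be sanity-checked against the break-even rate, though how closely they're reproduced will depend on the correlation structure used:

```python
breakeven = 1 / 1.90    # approximately 0.526
print(majority_accuracy(3,  0.525, 0.34) > breakeven)
print(majority_accuracy(11, 0.520, 0.34) > breakeven)
```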
How might one go about creating such an ensemble of models? One approach would be to come up with as many variables as you can think of that could be predictive of line betting results and then build a number of models, each using only a subset of those variables. The subsetting should reduce the winner co-prediction rate across models, though ensuring that each resultant model has an acceptably high accuracy rate is likely to be a challenge.
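In modern terms this is essentially the 'random subspace' method. Here's a sketch of how it might look in scikit-learn, assuming a feature matrix X of candidate variables and a 0/1 vector y of line results (X and y are placeholders; the parameter is named base_estimator in older scikit-learn versions):

```python
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression

# Each of 11 base models is fit on a random half of the candidate variables;
# varying the variables (rather than the games) is what should push down the
# co-prediction rate across models.
ensemble = BaggingClassifier(
    estimator=LogisticRegression(max_iter=1000),
    n_estimators=11,
    max_features=0.5,    # random 50% subset of variables per model
    bootstrap=False,     # keep every game; subset only the variables
)
# ensemble.fit(X, y)                      # X: games x variables, y: 0/1 results
# predictions = ensemble.predict(X_new)   # combined vote of the 11 models
```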
Nonetheless, it's an interesting approach ... maybe next year.
Reader Comments (2)
Not sure why you need models when you could just use the list of tipsters in The Age!
Also, I think we missed an opportunity: http://www.theage.com.au/afl/afl-news/bookies-banned-from-betting-on-first-coaching-casualty-20110123-1a1ah.html
If they're banning it it must have been profitable!
I suspect that the tipsters in The Age, as a group, fail on two counts:
(1) Insufficiently high accuracy rate
(2) Excessive correlation between their tips
But it does suggest an interesting exercise for the forthcoming season: use the first (say) 6 weeks to identify the most accurate and least correlated set of N tipsters (for N odd), then use their majority vote to tip line results for the remainder of the season. Consider it done.
I did notice the story re the lack of a market for first coaching casualty. As I understand it, no market had been framed (at least formally), so I consider it more of a pre-emptive strike by the AFL. It's a textbook case of a market where there could be well-informed insiders in a position to profit from unannounced changes. Also, it'd be terribly destabilising for a team if its coach were, say, $1.10 for the chop in the next 2 weeks.