And the Best and Worst VFL/AFL Teams of All-Time Are ...

I've promised for a while now to create MARS Ratings for the entire history of the VFL/AFL and, with some time away from home over Easter, I'm now able to deliver on that promise.
Sometimes it's interesting to see where an idea takes you. This blog started out as a wet weekend's musing: how predictive might a margin predictor be if it was based on a single variable?
It's unarguable that the winner of some games will be harder to predict than the winner of others. When genuine equal-favourites meet, for example, you've only a 50:50 chance of picking the winner, but you can give yourself a 90% chance of being right when a team with a 90% probability of victory meets a team with only a 10% chance. The nearer the two teams are to equal-favouritism, the more difficult the winner is to predict; the further from it they are, the easier the game is to predict.
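The arithmetic behind that claim is simple enough to sketch. The snippet below assumes you always tip the favourite, in which case your chance of being right is just the larger of the two teams' victory probabilities. The function name `pick_accuracy` and the sample probabilities are illustrative only.

```python
def pick_accuracy(p_home: float) -> float:
    """Chance of tipping the winner if you always tip the favourite.

    p_home is the (assumed known) probability that the home team wins.
    """
    if not 0.0 <= p_home <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # The favourite wins with the larger of the two probabilities.
    return max(p_home, 1.0 - p_home)


# Illustrative probabilities only, not taken from any real game.
for p in (0.50, 0.60, 0.75, 0.90):
    print(f"P(home win) = {p:.2f} -> chance of a correct tip = {pick_accuracy(p):.2f}")
```

The function is symmetric around 0.5, so a 10% home underdog is exactly as easy to tip against as a 90% home favourite is to tip for.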
In the previous blog I used a clustering algorithm - Partitioning Around Medoids (PAM) as it happens - to group games that were similar in terms of pre-game TAB Bookmaker odds, the teams' MARS Ratings, and whether or not the game was an Interstate clash. There it turned out that, even though I'd clustered using only pre-game data, the resulting clusters were highly differentiated with respect to the line betting success rates of the Home teams in each cluster.
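For readers unfamiliar with PAM, here is a minimal sketch of the idea in plain Python. It is not the code used for the blog: it starts from a naive choice of medoids (the first k games) rather than PAM's usual BUILD step, then applies the swap step until no swap lowers the total distance. The six "games" are made-up, pre-scaled feature triples (implied home probability, scaled Ratings difference, Interstate flag), and all names are hypothetical.

```python
from itertools import product


def distance(a, b):
    """Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_cost(points, medoids):
    """Sum of each point's distance to its nearest medoid."""
    return sum(min(distance(p, points[m]) for m in medoids) for p in points)


def pam(points, k):
    """Swap-based k-medoids: returns (medoid indices, cluster label per point)."""
    medoids = list(range(k))  # naive start (assumption): the first k points
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        # Try replacing each medoid with each non-medoid point.
        for slot, candidate in product(range(k), range(len(points))):
            if candidate in medoids:
                continue
            trial = medoids[:slot] + [candidate] + medoids[slot + 1:]
            cost = total_cost(points, trial)
            if cost < best - 1e-12:  # keep only strictly better swaps
                medoids, best, improved = trial, cost, True
    labels = [
        min(range(k), key=lambda j: distance(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels


# Made-up games: (implied home probability, scaled Ratings difference, Interstate flag)
games = [
    (0.80, 0.9, 0.0),
    (0.78, 0.8, 0.0),
    (0.82, 1.0, 0.0),
    (0.30, -0.7, 1.0),
    (0.28, -0.9, 1.0),
    (0.33, -0.8, 1.0),
]
medoids, labels = pam(games, 2)
print("medoid games:", medoids)
print("cluster labels:", labels)
```

Unlike k-means, each cluster centre here is an actual game, which makes the clusters easier to describe: you can point to a representative game for each one.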
For today's blog I'll be creating a game clustering that uses as input only the information that we might reasonably know pre-game - for example, the pre-game team MARS Ratings, Bookmaker prices (or some metric derived from them), and information about the game venue.