I’m always looking to get as much data as possible. Currently I’m collecting data from around 20 men’s tours, with plans to include many mini tours and top amateur events. In the past I’ve also collected data from women’s professional tours, but currently the focus is just on the men’s game.
I’ve got tournament scores going back to the late 1980s, but in the present database I’ve started from the year 2000. After all, who needs to know what Lee Westwood shot at the One 2 One British Masters in 1998?
There are a few tournaments in the database from before 2000, but they’re not relevant to producing my output. What a player scored on a course 15+ years ago is of no use to me when compiling odds.
Types of Data
The main piece of data is the scores – everything else is secondary to this. It’s the scores that tell you how well a player has played overall, not individual metrics like Strokes Gained Off the Tee.
Don’t get me wrong, I am all in favour of more detailed statistics that you can dive into to get a better understanding of how a player has compiled his score. However, it’s the overall score each competitor signs for that I use as the input to my model, so that’s what I concentrate on.
So I do record statistics from the main tours, but they’re used as a reference along with my output rather than as an input into my model.
The output I provide is what I find useful myself. I’m always thinking of ways to create better reports and look at the data in new (and hopefully revealing) ways.
Anything I find useful I’ll add into the Excel download I create for each tournament.
As touched on above, I’ll be looking to add top amateur and mini tour events to get a better view of players when they join the higher-quality tours.