As the world hurtles forward, we’ve uncovered more information about almost everything. We know more about what people are, what they do, what they say and even what they think than ever before. And not just by a little bit – by orders of magnitude. We are well and truly into the realms of Big Data.
The football world (and fantasy in particular) has dived headfirst into this. The NFL has produced advanced stats, Pro Football Focus has turned into a juggernaut, and the whole fantasy world can spout numbers like yards per carry, sack rate, yards created and air yards on demand.
There is more data and more information than ever before.
And this can only be a good thing, surely? Football is complex. There are 22 players on the field all doing different things, coaches are involved on a play-by-play basis, and deception and misdirection are major parts of the game. Complex problems need complex solutions, which is why we as a community have busily developed advanced metrics to better describe and predict what is happening.
Which brings us to heuristics: simple rules of thumb for making decisions. The study of heuristics suggests that complex data is often not the best answer to a complex problem. In fact, sometimes simplicity gives a better and more useful answer.
An example is a returner fielding a punt. Calculating trajectories of a moving ball is hard. You need to account for the speed it’s travelling, the angle, the spin, the wind conditions and several other factors. Mathematically it’s a nightmare. But every week we see returners unerringly underneath the ball. They’re not out there doing difficult math as the ball falls. They use a simple heuristic.
If you run towards a falling ball and adjust your speed so that the angle at which you look up at the ball stays the same, you’ll arrive at the same time as the ball. We learn to do this unconsciously as kids and do it automatically. Heuristics enable us to do complicated things simply.
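For the curious, the constant-angle rule can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not a model of a real punt: the ball moves in a flat two-dimensional plane under gravity only (no wind, spin or drag), the player is simply placed wherever keeps the angle constant (no limit on running speed), and the function name `track_ball` and all starting numbers are made up for the example.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def track_ball(ball_x, ball_y, vel_x, vel_y, player_x, dt=0.01):
    """Hold the gaze angle constant until the ball is about to land.

    Returns (player_x, ball_x, ball_y) at the last step before landing.
    """
    # Gaze angle at the moment the player starts tracking the ball.
    gaze = math.atan2(ball_y, player_x - ball_x)
    while True:
        # Advance the ball one small time step (gravity only: an assumption).
        vel_y -= G * dt
        next_x = ball_x + vel_x * dt
        next_y = ball_y + vel_y * dt
        if next_y <= 0:
            break  # the ball would reach the ground during this step
        ball_x, ball_y = next_x, next_y
        # The only rule the player follows: stand where the ball
        # still appears at the same angle. No speed, spin or wind needed.
        player_x = ball_x + ball_y / math.tan(gaze)
    return player_x, ball_x, ball_y


# Hypothetical start: ball 30 m up and already dropping, player 20 m downfield.
player_x, ball_x, ball_y = track_ball(0.0, 30.0, 10.0, -10.0, 20.0)
gap = abs(player_x - ball_x)
print(f"ball height {ball_y:.2f} m, horizontal gap {gap:.2f} m")
```

The physics lives only in the part that moves the ball. The player's rule never touches gravity or velocity, yet as the ball's height shrinks to nothing, so does the gap between player and ball.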
Simple rules often hold up better than complex models for a number of reasons, but one of the most common is overfitting. This is a phenomenon that arises when we try so hard to fit our model to the data we already have that it becomes less useful as a predictive tool. We build models that work on so many variables that they become unwieldy and overly complicated. They take a huge amount of time to build and maintain, and they lose their usefulness.
In this writer’s own favored field, this is definitely a problem. Modelling defensive players requires a few basic bits of information. You need to know how much a player will play, what position he plays, how he’s used (how much time he spends rushing the passer or dropping into coverage), what sort of scheme he plays in, etc. A model built on those basics does a good job and is useful in the off-season to give a good idea of which players are likely to be productive in the coming season.
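A toy sketch of overfitting in Python, using made-up numbers rather than real player data: six "seasons" of points that follow a steady upward trend plus some noise. One model is a plain straight line. The other is a polynomial forced through every single point, so it describes the past perfectly. The function names and the data are purely illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a predict(x) function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def fit_exact_polynomial(xs, ys):
    """Lagrange polynomial through every point (zero error on past data)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict


# Made-up data: a true trend of 10 + 2 * season, plus hand-picked noise.
seasons = [0, 1, 2, 3, 4, 5]
noise = [1, -2, 0, 2, -1, 1]
points = [10 + 2 * s + e for s, e in zip(seasons, noise)]
true_next = 10 + 2 * 6  # what the underlying trend gives for season 6

line_model = fit_line(seasons, points)
exact_model = fit_exact_polynomial(seasons, points)

print("trend:", true_next)
print("straight line:", round(line_model(6), 1))
print("exact-fit polynomial:", round(exact_model(6), 1))
```

With these numbers the straight line predicts about 22.7 for the next season against a trend value of 22, while the polynomial that matched every past season exactly predicts about 70. The more faithfully the model chased the noise, the worse it did on the one number that mattered.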
But that’s not quite enough.