As we enter the fantasy football off-season, the dynasty community shifts its attention to the upcoming NFL Combine at the end of February and the draft in April. People who are particularly into scouting and the numbers side of things may find themselves rather awkwardly spending the regular season looking forward to this time. All the football, none of the weekly disappointment! The running backs on your roster might even go a week without injuring their knees and hamstrings.
While the Combine provides us with a wealth of new data, anyone who studies player metrics will inevitably face a troubling fact: they don’t seem very good at predicting success. This is at odds with the common-sense idea that physical skills obviously matter for being a good football player (Christine Michael notwithstanding).
Given that, I’m going to try to explain how it’s possible that physical metrics DO matter even though the data doesn’t really show it. I’ll need to use a bit of math and statistics to do so, but I’ll try to keep it simple and minimize the jargon.
Models and Omitted Variable Bias
Most data analysis involves creating an underlying model, that is, an assumption about the way we think the data works, which we then use to estimate some results. To use a common example, if you create a simple graph with a metric on the horizontal axis (e.g. SPARQ, BAS3, or a single measure like 40 time) and a measure of NFL success on the vertical axis (e.g. total fantasy points, ADP, number of years in the top ten of their position), then plot a best fit line, the model that you’re implicitly assuming is:
<success> = a + (b*<metric>)
Those of you who recall high school algebra might recognize it as the slope-intercept form of a line, often written something like y = mx+b or y = a + bx, where b is the slope of the best fit line and a is where it intercepts the y-axis.
Or to put the math in simple English: “If I change my metric by one, how much does success change by?”
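To make the best fit line concrete, here is a minimal Python sketch of the least-squares fit for that one-metric model. The function name fit_line and the toy numbers are my own illustration, not real Combine data; the inputs are built to follow success = 1 + 2 * metric exactly, so the fit should hand those two numbers straight back.

```python
def fit_line(metric, success):
    """Return (a, b) for the least-squares line success = a + b * metric."""
    n = len(metric)
    mean_x = sum(metric) / n
    mean_y = sum(success) / n
    # Spread of the metric, and how the metric and success move together.
    sxx = sum((x - mean_x) ** 2 for x in metric)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(metric, success))
    b = sxy / sxx            # slope: change in success per one unit of metric
    a = mean_y - b * mean_x  # intercept: where the line crosses the y-axis
    return a, b

# Made-up data lying exactly on the line success = 1 + 2 * metric.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # prints 1.0 2.0
```

Real player data will never sit on the line this neatly, which is exactly the problem discussed next.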
If you were to look at such a plot, you would almost certainly notice that the best fit line isn’t really a very good fit. It may even slope the wrong way, suggesting that an increase in your metric leads to a decrease in NFL success. A big part of the problem you’re running into is omitted variable bias.
Omitted variable bias comes about when we create a model like the one above, but are missing loads of explanatory variables. NFL success isn’t just a function of metrics; it’s a function of metrics, and skill, and work ethic, and off-field problems, and injuries, and scheme, and luck (with a lower-case l, not the Andrew variety), and Luck (the Andrew variety, aka quality teammates), and being in the NFC East, and many more. Many of them depend on each other, and many of them are very hard, or even impossible, to measure. To get around this messy and missing data, we’re going to create a simplified simulation and fill it with some randomly generated players.
To create our simulated world, we’ll follow a series of steps. First, we’ll decide the exact formula for success; second, we’ll create a batch of players by giving them random attributes; and third, we’ll use the first two steps to calculate their level of success.
Each of our players will have eight attributes, which we’ll call “a” through “h”. For simplicity there are no other attributes, no error, and nothing is unmeasured. All of those attributes together completely determine success, according to this formula I made up:
success = .3*a + .2*b - .1*c + .7*d + .4*e - .2*f + .8*g + .4*h
The numbers in there are called coefficients, but really they’re just telling us how much success changes when we change an attribute by one. Hopefully a quick look at it will show you that if you raise “e” by 1, success rises by .4. Raise “f” by 1, success falls by .2.
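For anyone who would rather follow along in code than in a spreadsheet, the formula can be sketched in Python like this. The names COEFFICIENTS, success, and average_player are my own choices for illustration; the numbers are copied from the formula above.

```python
# Coefficients copied from the made-up success formula above.
COEFFICIENTS = {"a": 0.3, "b": 0.2, "c": -0.1, "d": 0.7,
                "e": 0.4, "f": -0.2, "g": 0.8, "h": 0.4}

def success(player):
    """Weighted sum of a player's eight attributes (missing ones count as 0)."""
    return sum(coef * player.get(name, 0.0) for name, coef in COEFFICIENTS.items())

# A perfectly average player has every attribute at 0, so success is 0.
average_player = dict.fromkeys("abcdefgh", 0.0)

# Raise one attribute by 1 and see how far success moves.
print(success({**average_player, "e": 1.0}))  # rises by the coefficient on "e"
print(success({**average_player, "f": 1.0}))  # falls by the coefficient on "f"
```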
Now that our simulated world has a structure, we have to create players. To do so I’m going to generate some random values from a normal distribution, where the mean is 0 and the standard deviation is 1. To put it in simple terms, most of the players will be near the average in each value, with fewer and fewer players being very good or very bad. This type of number can be described in shorthand as N(0,1), which I’ll use below, and can be calculated in Excel (for those of you following along at home) by the formula: =NORM.INV(RAND(),0,1). Our world has eight attributes; I’m going to create the first six, “a” through “f”, this way.
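If you are following along in Python instead of Excel, a rough equivalent of that formula is the standard library’s random.gauss. This is only a sketch: the helper name draw_attribute and the seed value are my own, and the seed is there purely so that a run can be repeated.

```python
import random

def draw_attribute(rng):
    """One N(0,1) draw: mean 0, standard deviation 1."""
    return rng.gauss(0, 1)

rng = random.Random(42)  # fixed seed (arbitrary) so the run is repeatable

# The first six attributes of one simulated player.
player = {name: draw_attribute(rng) for name in "abcdef"}
print(player)
```

Most of the values printed should land between about -2 and 2, with the occasional outlier.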
The last two attributes, “g” and “h”, we will create as interactions of the other attributes. Imagine, for example, that “a” is physical metrics and “b” is work ethic. Both of those obviously matter, on their own, for NFL success. However, what also matters is the combination of the two, or how they feed back into each other. For example, if you’re physically gifted and have a high work ethic, that matters to success above and beyond each one alone. However, if you have a bad work ethic then being physically gifted actually makes things even worse. Such a player may have never had to work hard for success in their life, because their physical skills always made it easy. This is called an interaction term, which here we’ll simply model as a * b = g and c * d = h. You can probably imagine all sorts of real-world interaction terms.
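Here is a short Python sketch of the interaction idea. The function name make_player and the two hand-picked players are my own illustration: both are equally gifted (“a” is 2), and they differ only in work ethic (“b”).

```python
import random

def make_player(rng):
    """Draw "a" through "f" from N(0,1), then derive "g" and "h" as products."""
    p = {name: rng.gauss(0, 1) for name in "abcdef"}
    p["g"] = p["a"] * p["b"]  # e.g. physical metrics times work ethic
    p["h"] = p["c"] * p["d"]
    return p

# Two hypothetical gifted players (a = 2) who differ only in work ethic (b).
hard_worker_g = 2.0 * 1.0   # positive: the gift is amplified
slacker_g = 2.0 * -1.0      # negative: the gift backfires
print(hard_worker_g, slacker_g)

# One fully random player, interaction terms included.
print(make_player(random.Random(7)))
```

Because “g” carries a coefficient of .8 in the success formula, the sign of that product matters more than either “a” or “b” does on its own.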
We then plug the values for those attributes into our success formula to determine a value for how each player does. An example player, then, will look like this:
Keep in mind as we’re doing this that my numbers and structure are mostly arbitrary; there are countless ways this could be done that will change the results. My goal is to use a relatively simple design to illustrate omitted variable bias.
Now we’ll go ahead and create fifty such players this way and then pretend some simulated scientists are studying this world to figure out how to predict NFL player success.
For our first study, our scientists don’t know the coefficients that were used to create the world, because all they see for each player are their attributes, “a” through “h”, and their success. However, they DO know that success is a formula of only those eight things, and they have measurements for them.
We’re going to use a simple method called a regression to try to find the relationship between the attribute variables on the right and the success variable on the left. I won’t go into the details of what this entails, or what it assumes, but suffice it to say that with our simulated data it would exactly return all of the coefficients we used to create the players, the p-values (the measures of significance of our results) would all be effectively zero (lower means a more confident estimate), and the R-squared would be 1 (meaning the entire variance in success is captured; 0 means none of it is). In simple English, if you accurately model the entirety of the way the simulated world works, and accurately measure all the determining variables “a” through “h”, you can perfectly estimate success.
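That claim can be checked with a sketch in plain Python. To be clear about what is assumed here: the function names (make_player, solve, ols), the seed, and the choice to solve the regression through the normal equations with Gaussian elimination are all mine, picked to avoid any outside libraries. The sketch reports coefficients and R-squared only; it does not compute p-values.

```python
import random

COEFS = [0.3, 0.2, -0.1, 0.7, 0.4, -0.2, 0.8, 0.4]  # a..h, from the formula

def make_player(rng):
    """Return (attributes a..h, success) for one random player."""
    a, b, c, d, e, f = (rng.gauss(0, 1) for _ in range(6))
    attrs = [a, b, c, d, e, f, a * b, c * d]
    return attrs, sum(k * x for k, x in zip(COEFS, attrs))

def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def ols(X, y):
    """Least squares with an intercept; returns (coefficients, R-squared)."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    fitted = [sum(bk * xk for bk, xk in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot

rng = random.Random(2016)  # arbitrary seed
players = [make_player(rng) for _ in range(50)]
X = [attrs for attrs, _ in players]
y = [s for _, s in players]

beta, r_squared = ols(X, y)
print("intercept:", round(beta[0], 6))
for name, est, true in zip("abcdefgh", beta[1:], COEFS):
    print(name, round(est, 6), "(true value", true, ")")
print("R-squared:", round(r_squared, 6))
```

Up to floating-point rounding, the estimates should match the coefficients used to build the world, with an intercept of zero.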
But that’s not very realistic, so now let’s handicap our scientists. Much like us when we make a simple plot with success on one axis and physical metrics on the other, our scientists will only have measurements for success and for attribute “a”. They either don’t know that “b” through “h” matter, or they don’t have a way to measure them. Can our simulated scientists still figure out how “a” matters to success? Let’s take a look at a plot for the data I created:
Well that looks terrible. What might our scientists conclude from this about the relationship between attribute “a” and success? The obvious answer is that attribute “a” is essentially irrelevant. In fact, if they run a regression using this model they would find the p-value to be insignificant and the R-squared to be very, very close to zero. But we know the underlying truth of this simulated world because we created it, and we know that attribute “a” is a determining factor for success.
Our simulated scientists are suffering from omitted variable bias, just like we do in the real world. Because they don’t know all the other factors that go into success, they can’t come very close to estimating the effect of the one they do know.
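The handicapped version of the study can be sketched the same way. The function name simulate_a_only and the seeds are my own, and the numbers it prints come from my random draws, so they will not match the plot from my original data.

```python
import random

COEFS = [0.3, 0.2, -0.1, 0.7, 0.4, -0.2, 0.8, 0.4]  # a..h, from the formula

def simulate_a_only(n_players, seed):
    """Build players, then regress success on "a" alone. Returns (slope, r2)."""
    rng = random.Random(seed)
    a_vals, succ = [], []
    for _ in range(n_players):
        a, b, c, d, e, f = (rng.gauss(0, 1) for _ in range(6))
        attrs = [a, b, c, d, e, f, a * b, c * d]
        a_vals.append(a)
        succ.append(sum(k * x for k, x in zip(COEFS, attrs)))
    mean_a = sum(a_vals) / n_players
    mean_s = sum(succ) / n_players
    sxx = sum((x - mean_a) ** 2 for x in a_vals)
    syy = sum((y - mean_s) ** 2 for y in succ)
    sxy = sum((x - mean_a) * (y - mean_s) for x, y in zip(a_vals, succ))
    slope = sxy / sxx
    # With a single predictor, R-squared is the squared correlation.
    r2 = sxy ** 2 / (sxx * syy)
    return slope, r2

# Five different fifty-player worlds, each studied with "a" only.
for seed in range(5):
    slope, r2 = simulate_a_only(50, seed)
    print(f"seed {seed}: slope on a = {slope:+.2f}, R-squared = {r2:.2f}")
```

Try a handful of seeds: the true coefficient on “a” is .3 in every one of these worlds, yet the estimated slope moves around from seed to seed and the R-squared typically stays low.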
The Moral of the Story
My little simulation was created specifically to illustrate omitted variable bias. You can try changing it around, and you’ll find very different results. You may even find a relationship between success and a single attribute, if the simulation you create is very simple or that attribute determines a large enough part of success relative to the others. If attribute “a” determines 90% of success and attributes “b” through “h” determine the remaining 10%, then we can very likely get close to the truth even without the others. The degree to which a plot may be deceptive will generally be proportional to what’s missing from the explanation.
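One way to try the “90%” case is sketched below. I am reading “determines 90% of success” as “accounts for about 90% of the variance in success”, which is my own interpretation. Under that reading, the other seven terms contribute a variance of about 1.54 with the coefficients above, so a coefficient of roughly 3.7 on “a” (3.7 squared is about 13.7) gets close to a 90% share. The function name and the sample size are likewise my own choices.

```python
import random

OTHER_COEFS = [0.2, -0.1, 0.7, 0.4, -0.2, 0.8, 0.4]  # b..h, unchanged

def a_only_fit(coef_a, n_players=5000, seed=0):
    """Regress success on "a" alone for a world where "a" has weight coef_a."""
    rng = random.Random(seed)
    a_vals, succ = [], []
    for _ in range(n_players):
        a, b, c, d, e, f = (rng.gauss(0, 1) for _ in range(6))
        others = [b, c, d, e, f, a * b, c * d]
        a_vals.append(a)
        succ.append(coef_a * a + sum(k * x for k, x in zip(OTHER_COEFS, others)))
    mean_a = sum(a_vals) / n_players
    mean_s = sum(succ) / n_players
    sxx = sum((x - mean_a) ** 2 for x in a_vals)
    syy = sum((y - mean_s) ** 2 for y in succ)
    sxy = sum((x - mean_a) * (y - mean_s) for x, y in zip(a_vals, succ))
    return sxy / sxx, sxy ** 2 / (sxx * syy)  # (slope, R-squared)

for coef_a in (0.3, 3.7):
    slope, r2 = a_only_fit(coef_a)
    print(f"true weight on a = {coef_a}: slope = {slope:.2f}, R-squared = {r2:.2f}")
```

With the original weight of .3 the single-attribute plot explains very little, while with the inflated weight the same two-variable plot suddenly looks trustworthy, even though nothing about the method changed.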
This is important, of course, because in the real world the number of things that determine an NFL player’s success is enormous. Not only do we not know what all of them are, many of the ones we suspect to matter either aren’t measurable or aren’t comparable between players. Physical metrics are easy to both measure and compare, which is likely why we are constantly talking about them. That is, we can measure bench press and can clearly say 20 reps are better than 15 reps, and by exactly how much. But what about work ethic, or the drive to win, or the strength of a player’s ACL? Even if you could estimate them somehow (e.g. “this player’s work ethic is high”), how would you put them to numbers and compare them (e.g. “this player’s work ethic is 20, and this other player’s is 15.”)?
The fact that we can’t come up with a full measure of success means that unraveling the relationship between physical metrics and success is a difficult task. I think it’s fairly obvious that being faster or more agile or stronger is better, and given the attention the Combine receives I think most would agree. We just have to be very careful when we’re attempting to quantify that advantage.
As with our simulated scientists, simple two-variable graphs are one of the very common ways we deceive ourselves in this sort of analysis. If you take away only one thing from this article, let it be that the first thing that comes to mind the next time you see a graph like that is suspicion, followed by “what is missing from this?”
- The Five Rules of Dynasty Trading - August 30, 2016
- Not So Fast There, Average - February 29, 2016
- Why Don’t Combine Metrics Predict Success? - January 21, 2016
I think you might have lost a few people with this; although it does help to validate that economics is more theory than science.
That aside, I have found more success by inserting “O” into the equation, so much so that it is the only factor I use. It stands for Opportunity, and it helped me pick up players like Allen Robinson and AS-J when nobody wanted to draft them. If a rookie doesn’t get drafted onto a team because of need, then he likely won’t get the opportunity to be successful. I must say my success with this approach has been better than most of the other managers’.
Thanks for the article.
Hi, thanks for the reply. It’s definitely a bit of a technical article; simplifying these things is difficult and risky. Hopefully the message comes across even if it’s not all clear.
I’m not sure what you mean by “more theory than science” though. Theory is a core part of the scientific method, and every scientific field relies upon it.
Outstanding stuff. I appreciate that you trust in the intelligence of the DLF readership to be able to follow along.
There is no sport as specialized and regimented as football…I played it and love it, but it’s a strange beast. Measuring a player’s talent and potential for success is something that flummoxes high paid professionals – all we can do as fantasy owners is take on as much information as we can and cross fingers.
Thanks, glad you liked it!
One point I would stress from the article isn’t that answering questions like this is impossible. There are loads of ways to cope with it, including more advanced statistical methods, or even simply trying to include more things than just metrics in your analysis (in terms of the article, if you could include “a” and “b” instead of just “a”, you wouldn’t eliminate the problem but you would get better results).
Any chance that you at DLF would attempt to create such a multi-variable equation by assigning variables and coefficients based on historical data? Many people disagree, but I’m a believer that even an imperfect model can progress our knowledge. Or put another way, even in failing you may educate some.
I will be the first to admit that I have a social science and law background and not math or economics, but with that being said, I have no idea what I just read. I think this is a fun exercise and (I think?) the ultimate conclusion is that you cannot graph and plot your way to picking who is going to have success in the NFL. It is why not only are the player personnel very smart guys, but also guys who have been in football all their life. These players are some of the most skilled athletes on the planet who have honed in on a specific task and its nuances over their lifetimes. I think if anything, the combine and measurables are something that is cross referenced with what the players have already shown on the field to say when you put these two together, their tape plus physical measurables, how can we project where that will put them in comparison to current players.
I think many fans and fantasy diehards latch hard onto measurables and things like work ethic and “off-field issues” because they are easy to identify and sum up and synthesize to an outsider. As someone who played football my entire life I will say that in the public we do not have the expertise nor the access to the correct film (team utilized endzone cams showing the players’ every movement up close on every play and every practice rep) to break down their actual play. So in an effort to break down what we have, we fall into the trap of overvaluing the metrics and news we do have. This is totally understandable but I think it leads to the overvaluation of a lot of mediocre role players and the undervaluation of some damn good football players. All we can try to do is sort it all out and be right more often than wrong. I respect you for trying to measure it in such a complex manner; I just don’t know if I see the utility here other than (to me) to illustrate you cannot graph or make an equation with all these differing factors such as size, speed, agility, team opportunity, hands, intelligence, play speed, football intelligence, work ethic, and character. Those factors are all related but also so disparate in nature in some ways. Further, with the opportunity piece, some coaches will get much more out of a player than if they had landed somewhere else. Sometimes this can be the difference between a valuable player and a guy who doesn’t turn out. Ultimately, trying our own systems such as these on the most complex team sport there is, is what makes fantasy football and dynasty so much fun.
Okay, I just read it again and fully got to your conclusion. I will admit the equations kind of tuned me out the first time. Interesting way to state this point and not one I had seen or considered in terms of a mathematical interaction before. Life is always balanced in a mathematical way of course if you can measure it, but I guess with football players I do this type of calculation intuitively and never considered thinking about it in this way. More receptive to the thesis and points made on the second read through!
I created a forum thread for discussion, and I tried a few simple examples to illustrate the point a bit more clearly.
Great Article Jeff!
I have a bit of background in math, statistics, and visualization so I understood what was being presented.
As a suggestion, perhaps you could break this up into a series of articles that breaks things down into smaller steps.
For example, it could be a series of articles on model building using some actual combine data starting with multi-variable (linear) regression, interaction, incorporation of non-linear terms, examining collinearity, etc. These are complex things but if the focus is on visualization and drawing some simple inferences, I think you could get the point across. Otherwise, an “analytics corner” could help enrich the content of the site.
I myself would like to do some of these things, the trouble I have is finding data and time!
Glad you liked the article! I’ve considered several times making a series of articles on analysis, but no one wants to turn this into a statistics class, and I haven’t come up with a way to make it without doing that. It’s like a rabbit hole: once I explain one thing, it inevitably begs for another thing to be explained, and another, until we’re back to it being just a stats class.
I do have a few more articles in this vein in mind that I’ll hopefully work on soon though.