Not So Fast There, Average

Jeffrey Levy

The 2016 NFL Combine is over and the season of metrics has begun. Who shocks us with their ability to jump? Which player runs so fast, we collectively forget they aren’t very good at football? Whose stock is about to shoot up? What can we learn?!

In my January article I discussed why combine metrics matter, even if it’s hard to tease out their relationship to success. I also cautioned readers about the dangers of the ubiquitous two-axis graph. In a similar vein, today I have an appeal for fantasy football writers and another caution for fantasy football readers: be careful with that average value!

“Be serious, Jeff,” you might be saying. “Averages are so simple, and we use them all the time. What could go wrong?”

Averages are indeed simple and useful, but an average doesn’t tell the entire story of a set of data. What’s missing is the story of variation.

Confidence Intervals

What we probably want to know is just how confident we are that the average being measured is correct. If we’re just trying to describe a sample, there isn’t any need to worry about this. That is, if we’re limiting our research to saying “I have data on five years of QB scoring, and this is the average in that span,” then we can state that with complete confidence. More often, however, when we do this we’re trying to extend our results into the realm of prediction in a broader population. That is, “I have data on five years of QB scoring, and this average can be extended to make assumptions about some sixth year I don’t have data on.” In that case, the degree of confidence in our prediction is a crucial thing to report.

To illustrate this, let’s take a look at some real data on rookies’ average draft position (ADP) versus their first-season fantasy points. I used DLF ADP data on rookies from 2013, 2014 and 2015, then combined that with the number of PPR points they scored according to Pro Football Reference. If we simply combine these years and plot them, we get this:

[am4show have=’g1;’ guest_error=’sub_message’ user_error=’sub_message’ ]

average_chart

This is basically what we would expect. First round picks (defined as the first 12 players) score an average of around 30 more points per season than second round picks, who outscore third round picks by about the same margin. Third round picks outscore fourth round picks by about 25. So… are we done?

Not quite. Imagine there’s a jar with a whole bunch of numbers in it (the population), and we want to know what the average value of all of them is. If we randomly pick out just a few (the sample) and find those few have an average value of 42, how confident are we that the average value for ALL the numbers in the jar is 42? The answer will depend on two things: how many we have picked out to look at, and how widely spread the values are on them.

  • If we randomly pick out 10 numbers and every single one of them has a value of 42, we might be pretty confident that 42 is a true average for the entire jar.
  • If we pick out 10 numbers and they range in value from 0 to 200, we’ll be a lot less confident that 42 is the true average for the jar.
  • If we pick out 100 numbers and they still range in value from 0 to 200 but the average doesn’t change, we’ll be more confident than we were that 42 is the correct average.

Fortunately this can be quantified and used nicely alongside average values to illustrate a story more accurately. It’s called a confidence interval, and it’s essentially asking the question: “Given this sample, what is the range in which I can be <X> percent confident that the true population average lies inside it?”
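The three jar scenarios above can be put into rough numbers. The sketch below assumes the common normal-approximation formula, where the margin on either side of the average is z × s / √n (z is the critical value for the chosen confidence level, s is the sample standard deviation, n is the sample size). The function names and the jar draws are made up for illustration; they are not real data.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def margin_of_error(alpha, std, size):
    """Margin on either side of the average, using the normal approximation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    return z * std / sqrt(size)


def interval(sample, level=0.90):
    """Return (low, high) around the sample average at the given level."""
    avg = mean(sample)
    margin = margin_of_error(1 - level, stdev(sample), len(sample))
    return avg - margin, avg + margin


# HYPOTHETICAL jar draws, all averaging 42 (illustration only).
all_same = [42] * 10                                  # no spread at all
spread_10 = [0, 5, 10, 15, 20, 25, 30, 45, 70, 200]   # wide spread, 10 draws
spread_100 = spread_10 * 10                           # same spread, 100 draws

for name, sample in [("10 identical", all_same),
                     ("10 spread out", spread_10),
                     ("100 spread out", spread_100)]:
    low, high = interval(sample)
    print(f"{name}: average {mean(sample):.1f}, 90% interval {low:.1f} to {high:.1f}")
```

With these made-up draws, the identical sample gives an interval of zero width, the ten spread-out draws give the widest interval, and the hundred spread-out draws give a noticeably narrower one around the same average.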

In Excel, adding confidence intervals is easy using the =CONFIDENCE.NORM(alpha, std, size) function. Alpha is just one minus your desired confidence level (so for 90% it’s: 1 – 0.9 = 0.1), std is the standard deviation of your data series, and size is the number of items in your data series. It returns a single value, the margin on either side of the average: add it to your average to get the top of the interval, and subtract it from your average to get the bottom. Let’s add a 90% confidence interval to the graph we made above and take a look at it in action:

 

confidence_chart

For the first round, our measured average in the sample is about 127.75. The calculated margin is about 21.4 on either side, giving us a minimum of 106.4 and a max of 149.2. This allows us to say “Based on this data, we are 90 percent confident that the true average value for the first round falls in between 106.4 and 149.2.” Note that the margin actually isn’t the same in each round; it just sort of looks that way because the change between rounds is too small to show up well in this graph.

The most important thing to notice is that the bottom of the interval for the first round is actually below the top of the interval for the second round, as illustrated by the dashed black line. This tells us that, at a 90 percent confidence level, we can’t be sure that first and second round players actually have different averages. That’s a fairly huge thing to omit from our analysis in the first graph!
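The comparison the dashed line makes by eye can also be written as a simple check. This is only a sketch: the first-round bounds are the ones quoted above, but the second-round bounds are hypothetical placeholders, since the exact values aren’t reported here, and the function name is my own.

```python
def intervals_overlap(a, b):
    """True when two (low, high) intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


# First-round bounds as reported in the article.
round_one = (106.4, 149.2)

# HYPOTHETICAL second-round bounds, made up for illustration only;
# the article does not report them.
round_two = (80.0, 115.0)

if intervals_overlap(round_one, round_two):
    print("Intervals overlap: the two averages may not really differ.")
else:
    print("Intervals do not overlap at this confidence level.")
```

Swapping in the real bounds for each round would reproduce the comparison shown in the chart.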

There is no “right” confidence level. Often you will want to try more than one value and show all the results, though 90 or 95 percent are fairly common choices. The more you lower your confidence level, the narrower the bands around your measured average will be; conversely, the more you raise your confidence level, the wider they will spread. This should make sense: I can be almost 100 percent confident that you’re currently on planet Earth, although there’s a slim chance you became an astronaut without telling me. Maybe I can be 90 percent confident you’re in your home state, and 80 percent confident you’re in your home city, and so on down as we narrow the field. Let’s take a look at our data with a lower confidence level, then:

80_confidence_chart

The black line shows that the intervals for rounds one and two no longer overlap. This seems to tell us that first round players do indeed score higher than second round players, but that comes with a very important caveat: 80% is a very low level of confidence. Published work, for example, almost always uses 95%. I’ll spare everyone the details of hypothesis testing and the assumptions that are needed for this, but it’s easy to look up if you want to know more. And even if we accept a level this low (which we shouldn’t), you can see the overlap problem still persists between rounds two and three, and rounds three and four.
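To get a feel for how much the bands move with the confidence level, here is a minimal sketch. It assumes the margin comes from the normal-approximation formula, so it scales directly with the z value for each level; the 21.4 is the first-round margin at 90 percent quoted earlier, and the function names are my own.

```python
from statistics import NormalDist


def z_for_level(level):
    """Two-sided normal critical value for a confidence level such as 0.90."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


def rescale_margin(margin, from_level, to_level):
    """Convert a margin computed at one confidence level to another level."""
    return margin * z_for_level(to_level) / z_for_level(from_level)


# 21.4 is the first-round margin at 90 percent reported in the article.
for level in (0.80, 0.90, 0.95):
    margin = rescale_margin(21.4, 0.90, level)
    print(f"{level:.0%} confidence: +/- {margin:.1f} points")
```

Under that assumption, the first-round margin shrinks to roughly 16.7 points at 80 percent and grows to roughly 25.5 points at 95 percent, which is why the overlap can disappear at one level and come back at another.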

Conclusion

Hopefully it’s clear that adding confidence intervals gives us some very important information that we were otherwise lacking when only a measured average is reported. It tells us not only what the average is for the sample we’re working with, but also how confident we can be that the average holds outside of this sample. I would encourage anyone writing a fantasy football article that implies projection to include them, and readers should definitely think about them when looking at the widespread use of average-only graphs in the fantasy community.

I should warn writers that including confidence intervals will often be an unpleasant exercise. A lot of what we would like to measure suffers from small sample sizes, particularly when you allow for the way the game changes over time, since we can only combine years when we’re assuming the fundamentals don’t change between them. However, in my opinion it’s much better to include confidence intervals that weaken my results, and then explain what they mean to my readers, than to leave them out and make my article look better but also be less true.

Of course there are at least three major things missing from this analysis that make it overly simplistic. The first is deciding how to account for injuries (Kevin White, DeAndre Smelter, Marcus Lattimore), the second is position (QBs tend to score more than others at every level), and the third is the fact that in dynasty we obviously care about more than just their rookie season. But that’s okay, because my goal here was to illustrate confidence intervals, and not necessarily to treat this question as thoroughly as possible.

[/am4show]
