Exploring the Relationship Between Size and On-field Output
A Statistical Analysis By: Josh Scott
It was not too long ago that physical traits were the most coveted traits when discussing wide receivers. More specifically, size and speed were considered incredibly important. I seem to recall an era when I was younger when my Madden create-a-players were goliaths that could somehow run a 4.3. In that era, the only input that I would classify as mental would be “awareness”. Flash forward, and I think the remnants of this line of thinking haven’t fully been put to rest. As much as we try to ignore it, more than a few of us drool at the thought of the big WR with a solid speed score playing for our favorite team. I even found an article, which I shared with Matt, that someone else wrote during RP’s infancy. Part of the title reads “Size, not Speed, Matters Most for Wide Receivers”.
So why am I rambling about this? While this line of thinking has faded considerably, Matt and I think we should truly explore these notions. We have learned a great deal about WR play from Matt and Reception Perception, but these remnants of the past consensus can still linger and bias our opinions of smaller wide receivers. In particular, we want to investigate whether size affects performance for NFL wide receivers, and whether we can quantify a relationship between size and our outcomes of interest.
This question is more complex than it appears on the surface. Clearly size matters to some extent. There is a reason guys like myself (my closest player comp would be Darren Sproles or Deuce Vaughn) are not commonly employed as NFL wide receivers. Once a player clears the necessary size threshold and is employed as an NFL wide receiver, however, to what extent do height and/or weight affect a player’s success rates versus coverage and true fantasy points per game? If there is no relationship, does size affect some other metric?
So let’s get to it, shall we?
2. Data
2.1 Description of Data Used
Many of the variables of interest come from the Reception Perception NFL Database. The RP NFL database currently features over 360 individual seasons from wide receivers of all ages and types. The database features success rates versus various coverage types, success rates by route, alignment percentages, route percentages, target data and “in-space” data for each featured player for that respective year. A key feature of this data set is the repeated observations of many players over the life of the data set, which allows for cleaner analysis of relationships (assuming there is within-player variation). Three outcomes of interest that come from this database are success rate versus man coverage, success rate versus zone coverage, and contested catch percentage. Most controls also come from RP.
Another outcome of interest is what we have coined “True/consistent fantasy points per game”. This outcome was calculated using the statistical outputs of the wide receivers that are observed in the RP NFL database. The annual statistics come from Pro Football Reference. The metric was calculated for a PPR format. True fantasy points per game is designed to reflect the common observation that touchdowns and fumbles are not necessarily “sticky” on a year-to-year basis. Thus, true points per game is calculated without those two methods of fantasy scoring. It can be viewed as a composite of yards per game and receptions per game.
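As a rough illustration of the metric described above, here is a minimal Python sketch. The scoring weights are an assumption (the common PPR convention of 1 point per reception and 0.1 points per receiving yard); the article does not state the exact weights or whether rushing yards are included, and the function name and the example season line are hypothetical.

```python
# Assumed PPR weights (not stated in the article): 1 point per reception
# and 0.1 points per receiving yard. Touchdowns and fumbles are left out
# on purpose, as the article describes.
POINTS_PER_RECEPTION = 1.0
POINTS_PER_YARD = 0.1


def true_fantasy_points_per_game(receptions: int, yards: int, games: int) -> float:
    """Return PPR points per game built from receptions and yards only."""
    if games <= 0:
        raise ValueError("games must be positive")
    total = receptions * POINTS_PER_RECEPTION + yards * POINTS_PER_YARD
    return total / games


# Hypothetical season line, for illustration only:
# 80 catches and 1,000 yards over 16 games.
example = true_fantasy_points_per_game(receptions=80, yards=1000, games=16)
print(round(example, 2))  # 11.25
```

Because only receptions and yards enter the total, the result is literally a weighted sum of receptions per game and yards per game, which matches the "composite" description.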
Height and weight data comes from NFLReadR. I matched the data from NFL rosters during the 2014-2021 seasons with the RP database.
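The matching step can be pictured as a join on player and season. The sketch below is only an illustration in Python: the player names, the field names (`height_in`, `weight_lb`), and the choice of (player, season) as the matching key are all assumptions, not the actual layout of the roster data or the RP database.

```python
# Hypothetical roster lookup: (player, season) -> (height in inches, weight in pounds).
roster = {
    ("Player A", 2019): (73, 203),
    ("Player B", 2019): (70, 185),
}

# Hypothetical RP player-seasons.
rp_seasons = [
    {"player": "Player A", "season": 2019},
    {"player": "Player B", "season": 2019},
    {"player": "Player C", "season": 2019},  # no roster entry on purpose
]


def attach_size(rp_rows, roster_lookup):
    """Attach height and weight to each RP row; collect keys with no match."""
    matched, unmatched = [], []
    for row in rp_rows:
        key = (row["player"], row["season"])
        if key in roster_lookup:
            height, weight = roster_lookup[key]
            matched.append({**row, "height_in": height, "weight_lb": weight})
        else:
            unmatched.append(key)
    return matched, unmatched


matched, unmatched = attach_size(rp_seasons, roster)
print(len(matched), unmatched)
```

Keeping the unmatched keys around makes it easy to spot name mismatches (suffixes, nicknames) that would otherwise silently drop player-seasons from the sample.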
2.2 Summary Statistics
For descriptive statistics from the iteration of the RP database used here, please see the joint article by Matt Harmon and myself.
The following table displays summary statistics for three variables: height, weight, and pounds per inch.
Here, we see that the average height is approximately 73 inches, which is about 6’1”. This also happens to be the median height, which is a nice result. The average weight is about 203 pounds. The range in height is from 5’7” to 6’5”, while the range in weight is from 165 to 245 pounds. One interesting note here is that I also split the sample in half, and the averages and medians were almost identical to each other (and to the full sample statistics). The main difference is in the extreme tails, with a slightly smaller minimum weight in the second half of the sample. This is important to keep in mind for the results section.
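For readers who want to reproduce this kind of summary on their own data, here is a small Python sketch. The three (height, weight) pairs are made up for illustration and are not rows from the RP database; the helper names are hypothetical, and pounds per inch is assumed to be weight divided by height.

```python
from statistics import mean, median


def feet_inches(total_inches: int) -> str:
    """Convert a height in inches to a feet'inches" label."""
    feet, inches = divmod(total_inches, 12)
    return f"{feet}'{inches}\""


def summarize(values):
    """Mean, median, minimum and maximum of a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


# Made-up (height in inches, weight in pounds) pairs, for illustration only.
sample = [(67, 165), (73, 203), (77, 245)]

heights = [h for h, _ in sample]
weights = [w for _, w in sample]
# Assumed definition of the third variable: weight divided by height.
pounds_per_inch = [w / h for h, w in sample]

print(feet_inches(73))  # 6'1"
print(summarize(heights))
print(summarize(weights))
print(summarize(pounds_per_inch))
```

Running `summarize` separately on each half of the seasons is one way to repeat the split-sample comparison mentioned above.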
2.3 Simple Assessment of Two-Way Graphs
As a preliminary assessment, we can take a look at simple two-way scatter plots to see if we can discern any visible trends between the two variables. It is important to remember that this type of graph helps us see what the data looks like, along with simple trends. These graphs do not control for any factors, and conclusions should not be drawn from them. However, this can point us toward aspects to keep in mind when we ultimately move to the main portion of this investigation.
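The trend line on a two-way scatter plot is typically a simple least-squares fit with no controls. A minimal Python sketch of that fit follows; the heights and success rates are made-up numbers used only to show the mechanics, and the article does not say which software drew its trend lines.

```python
def trend_line(x, y):
    """Least-squares slope and intercept for a two-variable scatter plot."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = co-movement of x and y divided by the variation in x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data, for illustration only: height in inches and a success rate in percent.
heights = [70, 72, 74, 76]
success_rates = [72.0, 70.0, 69.0, 67.0]

slope, intercept = trend_line(heights, success_rates)
print(round(slope, 3))  # -0.8
```

The sign of the slope is all a graph like this can tell us. It says nothing about age, role, or alignment, which is why the regressions later add controls.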
2.3.1 True Fppg and Size
With this first figure, I do not see a clear trend between height and true/consistent fantasy points per game. I am also not convinced there is a clear trend with weight, although the trend line is slightly positive. This at least gives us a starting point to see if these trends stick or if controlling for more factors uncovers a clearer relationship.
2.3.2 Success Rate Vs Man and Size
Again, I am not going to take too much away from this due to a lack of context and controls. However, I find it interesting that the trend lines are both downward sloping. This is the opposite of the expected effect, since the “size matters” narrative indicates the slope should be positive for both. We see the same pattern with success rate versus zone and size, shown below. This is interesting because my previous joint article with Matt showed that success rates have a statistically significant and positive correlation with consistent points per game, yards per game, receptions per game, and targets per game.
2.3.3 Success Rate Vs Zone and Size
2.3.4 Contested Catch Rate and Size
Here, we see an indication that size may be correlated with a particular outcome of interest. While early evidence indicates height and weight may not help you gain consistent fantasy points, size may help in contested situations. I will further explore all of these relationships with regression analysis, while also checking to see if trends changed over the course of the sample.
3. Model
Due to the lack of variation in NFL height and weight data for a particular player over time, I am going to use a pooled regression model and control for player traits and time using indicator variables. While panel data usually allows for cleaner correlations through something called “fixed effects”, using player fixed effects requires variation in these variables for a particular player. Indicator variables for both year and player will still catch many of the unobserved effects and will be sufficient for this exercise. This model aims to answer the question that matters most to us: Once a player is employed by the NFL, how much does size affect the outcomes of interest? The outcomes of interest are true points per game, success rate versus man coverage, success rate versus zone coverage, and contested catch rate. The explanatory variables of interest are height and weight, while the main controls are the remaining success rates, age, age squared, routes, percent on the line of scrimmage, percent of routes on the outside, and indicators for year and player. One interesting caveat to this problem is the inherent selection bias that occurs. There is clearly some threshold up to which size matters, even if size does not matter much once a player is employed by the NFL. However, the true correlation between size and these outcomes is most likely not of interest to us, because we want to know the relationship amongst relevant (employed) receivers.
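To see why a lack of within-player variation is a problem for a standard fixed effects approach, consider what the "within" transformation does: it subtracts each player's own average from every one of his observations. The Python sketch below uses a made-up four-row panel (hypothetical players, seasons, and values) and is not the estimation code behind this article.

```python
from collections import defaultdict


def within_player_demean(rows, column):
    """Subtract each player's own mean of `column` from his observations."""
    by_player = defaultdict(list)
    for row in rows:
        by_player[row["player"]].append(row[column])
    means = {player: sum(vals) / len(vals) for player, vals in by_player.items()}
    return [row[column] - means[row["player"]] for row in rows]


# Made-up panel, for illustration only: two players observed in two seasons each.
panel = [
    {"player": "A", "season": 2019, "height_in": 73, "success_vs_man": 68.0},
    {"player": "A", "season": 2020, "height_in": 73, "success_vs_man": 71.0},
    {"player": "B", "season": 2019, "height_in": 70, "success_vs_man": 74.0},
    {"player": "B", "season": 2020, "height_in": 70, "success_vs_man": 72.0},
]

height_within = within_player_demean(panel, "height_in")
outcome_within = within_player_demean(panel, "success_vs_man")

print(height_within)   # [0.0, 0.0, 0.0, 0.0]
print(outcome_within)  # [-1.5, 1.5, 1.0, -1.0]
```

The outcome still moves from season to season, but a listed height that never changes becomes a column of zeros, so there is nothing left for the transformed regression to relate the outcome to.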
In a future installment (if there is demand for it), I will try a Heckman selection model to see if I can account for this fact. The Heckman selection model is a statistical technique used to address sample selection bias in econometrics. Sample selection bias occurs when the sample used in an analysis is not random, which can lead to biased estimates. In this case, players may have initially been selected into the NFL based on meeting a size threshold, which we will try to account for in this model. This model would mostly be run to satisfy any curiosities and to show that we did not ignore the selection issue (even though it is less important in this instance).
4. Results
Here, I discuss the main results of the model. I first look at the results for the full sample of years. To acknowledge that the game of football and its various strategies change over time, I then rerun the pooled regression model using only the most recent half of the data (2018-2021). If trends have changed and size matters more or less in more recent years, we want to make sure we take this into account.
4.1 Pooled Regression Model: Full Sample
In this table, I omitted the coefficients of most controls, as we do not need to interpret them. If you are curious about how those controls correlate with receiver outcomes, you can view the previous article co-authored with Matt Harmon. Here, our main interest is the relationship between the two size variables and the outcomes of interest. The four outcomes of interest are log true points per game, success rate versus man coverage, success rate versus zone coverage, and contested catch rate. Each regression for a particular outcome is represented by a column. Standard errors are in parentheses below the coefficients.
Before even giving an interpretation of the coefficients, let us discuss our main finding here: none of the coefficients are statistically significant. An asterisk next to a coefficient would indicate significance. A lack of statistical significance means that the estimate is imprecise: its standard error is large relative to the size of the coefficient itself. This means we cannot reject the idea that any correlation is just a fluke. Additionally, we actually see a few negative coefficients for height and weight. Even if they were statistically significant, this is the opposite of the effect we would expect if size mattered. However, splitting up the sample may give us more information regarding these results.
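For readers who want to check significance themselves from a coefficient and the standard error printed beneath it, the usual rule of thumb is to divide one by the other. The Python sketch below assumes a two-sided test at the 5 percent level with the large-sample critical value of 1.96; the article does not state which significance levels its asterisks represent, and the numbers in the example are hypothetical rather than taken from the tables.

```python
def t_statistic(coefficient: float, standard_error: float) -> float:
    """Ratio of a coefficient to its standard error."""
    return coefficient / standard_error


def is_significant(coefficient: float, standard_error: float,
                   critical_value: float = 1.96) -> bool:
    """True if |t| exceeds the critical value.

    1.96 is the large-sample two-sided 5 percent cutoff (an assumption here).
    """
    return abs(t_statistic(coefficient, standard_error)) > critical_value


# Hypothetical numbers, for illustration only.
print(is_significant(0.5, 0.4))  # False: t = 1.25, too noisy to rule out a fluke
print(is_significant(0.5, 0.2))  # True: t = 2.5
```

The same coefficient can be significant or not depending entirely on how precisely it is estimated, which is why the standard errors matter as much as the point estimates.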
4.2 Pooled Regression Model: 2018-2021
Again, we see that none of the coefficients for the explanatory variables are statistically significant, and we see that some coefficients are still negative. Although they are not significant, I still want to translate the coefficients for those who haven’t read my prior work. In column 1, a one pound increase in weight (or a one inch increase in height) is correlated with a 0.3 percent increase (the coefficient multiplied by 100) in true/consistent fantasy points per game. For column 2, a one pound increase in weight is correlated with a 0.266 percentage point decrease in success rate versus man coverage. This same interpretation applies to columns 3 and 4, since the outcomes of interest are percentages.
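The "multiply by 100" rule for a logged outcome is an approximation that works well when the coefficient is small. A short Python sketch comparing it with the exact conversion is below; the function names are hypothetical, and the 0.003 input is simply the coefficient implied by the 0.3 percent figure in the text.

```python
import math


def approx_percent_change(coefficient: float) -> float:
    """Rule-of-thumb reading of a coefficient on a logged outcome."""
    return 100 * coefficient


def exact_percent_change(coefficient: float) -> float:
    """Exact percent change in the outcome for a one-unit increase."""
    return 100 * (math.exp(coefficient) - 1)


coefficient = 0.003  # implied by the 0.3 percent interpretation in the text
print(round(approx_percent_change(coefficient), 4))
print(round(exact_percent_change(coefficient), 4))
```

At this size the two readings differ only in the fourth decimal place. For the success rate and contested catch columns no conversion is needed, because the outcome is already measured in percentage points and the coefficient is read directly.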
5. Discussion
While we know that height and weight are important to some degree in order to be employed in the NFL, the question that mostly concerns us is whether we should worry about size once a wide receiver is already employed and has met the minimum requirements. The main two tables show that there is not a clear, discernible correlation between size and player outcomes. If we think about the mechanism for what truly makes a wide receiver good (getting open), this result makes a lot of sense. There are many techniques film analysts like Matt Harmon reference when discussing how a wide receiver gets open. According to Matt, two key techniques that increase success rates vs coverage are timing and deception. While height may help in a couple of circumstances, it is not sufficient for getting open. If more weight is anecdotally correlated with less agility, it is also understandable why this variable does not have a clear correlation with the outcomes.
Height and weight also don’t take precedence over inherent and unobserved traits like motivation, tenacity, or situational awareness. These traits may be a large factor in why the top contested catch artist in this year’s prospect class is not a large WR. When utilizing an indicator that controls for a player and their unobserved traits, we simply cannot discern whether there is a clear relationship between size and these outcomes of interest. Thus, do not eliminate smaller WRs from your draft board, especially if we see them excel on the field and they have the requisite success rates to prove it.