Photo Credit: Keith Allison via Flickr Creative Commons

This is Part III of a three-part series (Part I, Part II).


The NFL rookie quarterback class of 2016 has fan bases around the country with a long list of questions:

  • Los Angeles: Is Jared Goff already a bust?
  • Philadelphia: Can Carson Wentz be a franchise quarterback?
  • Dallas: Will Dak Prescott be an all-time great?

While those three franchises head into the 2017 season with their starting QB set in stone, the Week 1 starter for the Cleveland Browns remains an open question. That’s because most of us in Cleveland remain unsure about the following question:

  • Cleveland: Can Cody Kessler be a playoff-caliber starting quarterback in the NFL?

This last question has motivated this three-part series on Kessler. But the historical perspective offered in Part I and the most likely career comparables provided in Part II only indirectly address this question.

So let’s try to directly answer this question. And while we’re at it, let’s attempt to answer the questions from Los Angeles, Philadelphia and Dallas too.

To do that, this study will create a predictive model of a quarterback’s career success in the National Football League based on their physical characteristics, draft status and early-career performance. While NFL analytics staffers may (or may not) have already completed similar work on a proprietary basis, this study represents the first known empirical research on this subject available in the public sphere (including academic journals).

(Note: If you want to skip all the theoretical background and simply find out how the predictive model forecasts the careers of the 2016 rookie QB class, scroll down until you reach the “Results: Player Predictions” section.)


The predictive model employed in this study is as follows:

Career Success = f(early-career performance, physical characteristics, age, experience, draft position)

The predictive capacity of this model requires a large sample of players where one can empirically connect early-career performance with final career outcomes. To that end, this study includes all quarterback seasons from 1970-2016. Establishing a minimum threshold of 150 pass attempts in a given season, the sample features 350 quarterbacks and 1,644 unique player-seasons.

The use of 40+ years of data limits the analysis to variables that are available for all quarterbacks since the NFL-AFL merger; as such, this excludes performance measures such as QBR and Pro Football Focus ratings. As a result, this study will feature the six performance metrics identified in Part II of this series and available at Pro Football Reference across the entire length of the sample:

  • Completion percentage (comp / att)
  • Yards per completion (yards / comp)
  • Touchdown percentage (td / att)
  • Interception percentage (int / att)
  • Sack percentage [sack / (sack+att)]
  • Rush yards per game (yards / game)
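As a rough sketch, the six rate statistics above can be computed from raw season totals as follows. The `SeasonTotals` field names are hypothetical and are not Pro Football Reference's column names.

```python
from dataclasses import dataclass


@dataclass
class SeasonTotals:
    # Hypothetical field names for one quarterback season's raw totals.
    games: int
    attempts: int
    completions: int
    pass_yards: int
    touchdowns: int
    interceptions: int
    sacks: int
    rush_yards: int


def rate_stats(s: SeasonTotals) -> dict:
    """Return the six rate statistics listed above for one season."""
    return {
        "comp_pct": s.completions / s.attempts,
        "yards_per_comp": s.pass_yards / s.completions,
        "td_pct": s.touchdowns / s.attempts,
        "int_pct": s.interceptions / s.attempts,
        # Sacks are measured per dropback (sacks + attempts), not per attempt.
        "sack_pct": s.sacks / (s.sacks + s.attempts),
        "rush_yards_per_game": s.rush_yards / s.games,
    }
```

The 150-attempt minimum used in this study also keeps the denominators here comfortably away from zero.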

In building a predictive model, one must adjust these statistics to account for different eras in NFL offenses due to rule changes and strategic innovations. For instance, the average NFL quarterback in 1970 had a completion percentage of 51.1%; in today’s NFL, this level of performance may end a QB’s career. As a result, this analysis normalizes player seasons as compared to the average value of quarterbacks with 150+ pass attempts in each year. The result is a set of six z-scores for each quarterback season when compared to the league average; this puts every QB since 1970 on a level playing field when it comes to evaluating their early-career performance. Z-scores for interceptions and sacks are reverse scored such that positive numbers are better.
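A minimal sketch of the era adjustment described above, for a single metric within a single season. The article does not say whether a population or sample standard deviation was used; the population version (`pstdev`) is an assumption here, as are the metric names.

```python
from statistics import mean, pstdev

# Metrics where a lower raw value is better, so the z-score sign is flipped.
REVERSED = {"int_pct", "sack_pct"}


def season_z_scores(league: dict, metric: str) -> dict:
    """Map player -> z-score for one metric within one season.

    `league` maps each qualifying quarterback (150+ attempts) in that
    season to a dict of his rate statistics.
    """
    values = [stats[metric] for stats in league.values()]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # Everyone posted the same value: no one is above or below average.
        return {name: 0.0 for name in league}
    sign = -1.0 if metric in REVERSED else 1.0
    return {
        name: sign * (stats[metric] - mu) / sigma
        for name, stats in league.items()
    }
```

Because each season is standardized against its own league average, a completion percentage of 51.1% in 1970 and the same figure today produce very different z-scores.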

In terms of the other independent variables in the model, physical characteristics are accounted for by the z-score of a quarterback’s height when compared to the league average at the position in a given year (as discussed in Part II). Age is simply the player’s age in years as of December 31st of the season in question; this will account for the expected differences in a player’s performance in, say, their age 21 season when compared to their age 23 season. Experience is evaluated using (a) an indicator variable (yes/no) whether the player is a rookie and (b) the player’s accumulated “approximate value” (AV), if any, prior to the season in question.

Finally, draft position is evaluated as the natural log (ln) of the pick at which the player was drafted; this is the same method employed by current Browns analyst Kevin Meers in producing his oft-cited draft pick valuation chart. The draft position variable controls for the presence (or absence) of otherwise unobservable traits that may be more predictive of career outcomes than a player’s early-career performance. For example, while Peyton Manning threw more interceptions than touchdowns as a rookie, the traits that propelled the Indianapolis Colts to select him first overall were presumably the same that led him to a Hall-of-Fame career.
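To illustrate what the log transformation does to draft position, here is a small sketch. The function name is hypothetical; the pick numbers for Kessler and Prescott are the ones cited later in this article.

```python
import math


def draft_position_term(pick: int) -> float:
    """Natural log of the overall pick number (pick 1 maps to 0.0)."""
    if pick < 1:
        raise ValueError("pick must be a positive overall selection number")
    return math.log(pick)


for name, pick in [("First overall", 1), ("Kessler", 93), ("Prescott", 135)]:
    print(f"{name}: ln({pick}) = {draft_position_term(pick):.2f}")
```

The log compresses differences among later picks: the gap between picks 1 and 2 (ln 2, about 0.69) is larger than the gap between picks 93 and 135 (about 0.37), so the variable treats early-round separation as more meaningful than late-round separation.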

The dependent variable, career success, is formed as an ordinal variable with five outcomes based on a player’s career approximate value (CAV). In other words, career success takes on an integer value between one and five based on the following categories:

  • 5 – Elite Quarterback (CAV ≥ 120)
    • Examples: Peyton Manning, Philip Rivers, Ken Anderson
  • 4 – Franchise Quarterback (80 ≤ CAV < 120)
    • Examples: Phil Simms, Jake Plummer, Bernie Kosar
  • 3 – Starting Quarterback (40 ≤ CAV < 80)
    • Examples: Doug Williams, Trent Dilfer, Rodney Peete
  • 2 – Journeyman Quarterback (25 ≤ CAV < 40)
    • Examples: Josh McCown, Tim Couch, Tommy Maddox
  • 1 – Backup/Bust Quarterback (CAV < 25)
    • Examples: Kyle Boller, David Klingler, Todd Philcox
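The tiering above amounts to a simple threshold lookup. The sketch below treats each lower bound as inclusive, which is an assumption chosen to match this article's later references to "80 CAV or more" and "120+ CAV".

```python
def career_tier(cav: float) -> int:
    """Map career approximate value (CAV) to the five ordinal tiers."""
    if cav >= 120:
        return 5  # Elite
    if cav >= 80:
        return 4  # Franchise
    if cav >= 40:
        return 3  # Starting
    if cav >= 25:
        return 2  # Journeyman
    return 1  # Backup/Bust
```

This covers retired players only; the manual promotions of active quarterbacks described below are judgement calls that a lookup like this would not capture.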

This model will be estimated via ordered probit regression. In a nutshell, this is a multivariate empirical approach that recognizes the ordinal, or tiered, nature of the five categories. Since the goal of this study is to link a player’s early-career performance to their career success, the model will be estimated for all unique player-seasons where the quarterback was age 24 or younger. More narrow requirements—such as examining only rookie QBs—resulted in much lower sample sizes and, when combined with collinearity issues, produced considerably less reliable estimates. In terms of other restrictions, the sample was limited to quarterback seasons of 150 or more pass attempts and players who were selected within the first 255 picks of the NFL Draft (to reflect the current draft structure).
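For readers unfamiliar with the technique, the standard textbook form of an ordered probit is sketched below. This is general background rather than the exact specification estimated here; the symbols x_i (the vector of independent variables), β (the coefficients) and c_k (the cutpoints between tiers) are notation introduced for illustration.

```latex
% Latent career quality for player-season i
y_i^{*} = x_i^{\top}\beta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, 1)

% Observed tier k = 1, \dots, 5, with c_0 = -\infty and c_5 = +\infty
y_i = k \quad \text{if} \quad c_{k-1} < y_i^{*} \le c_k

% Probability of landing in tier k, where \Phi is the standard normal CDF
\Pr(y_i = k \mid x_i) = \Phi\!\left(c_k - x_i^{\top}\beta\right) - \Phi\!\left(c_{k-1} - x_i^{\top}\beta\right)
```

The four finite cutpoints are estimated alongside the coefficients, and the five probabilities sum to one by construction.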

Since the initial goal of this study was to develop predictions about the 2016 rookie class, the parameters of the model were estimated for all seasons between the years 1970 and 2015. While the classification of retired quarterbacks was straightforward (based solely on CAV), the treatment of active, non-rookie players required some judgement calls about the trajectories of their careers. I tried to be conservative. First, four players were moved up to “elite” status: Cam Newton (94), Matthew Stafford (86), Russell Wilson (84) and Andrew Luck (64). Second, four quarterbacks were placed in the “franchise” level: Andy Dalton (73), Derek Carr (30), Jameis Winston (25) and Marcus Mariota (22). Finally, Kirk Cousins (31) was promoted to “starting” level.

OK, enough with all of the empirical backstory. What the hell is the point of all of this?

The estimation of the career success model will offer two valuable sets of insights. First, the regression results will allow for the examination of how each of the independent variables affects (or does not affect) the likelihood of a young quarterback having a successful career in the National Football League. In other words, does a player’s height really matter? Is throwing touchdowns more important than avoiding interceptions? How relevant is a QB’s draft position? By examining the magnitude of the coefficients and their statistical significance, this study may provide some answers on what matters and what doesn’t.

Second, the regression results will allow this study to directly address the questions posed at the beginning of this study. Using a player’s characteristics and the coefficients in the model, one can calculate the estimated probability that a young quarterback’s career will end up in each of the five categories above. Since this study is geared towards examining the 2016 rookie QB class, we can determine the likelihood that, say, Dak Prescott and Carson Wentz become all-time greats. And the probability that Jared Goff will be a bust.
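The probability calculation described above can be sketched as follows, assuming the usual ordered-probit form. The linear index and cutpoints in the example are made-up illustrative numbers, not the estimates from this study.

```python
import math


def norm_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tier_probabilities(index: float, cutpoints: list) -> list:
    """Ordered-probit probabilities for tiers 1..len(cutpoints)+1.

    `index` is the player's linear score (characteristics times
    coefficients); `cutpoints` must be sorted in ascending order.
    """
    cdf = [0.0] + [norm_cdf(c - index) for c in cutpoints] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cdf) - 1)]


# Hypothetical values purely for illustration.
probs = tier_probabilities(0.3, [-0.5, 0.0, 0.8, 1.6])
franchise_or_better = probs[3] + probs[4]  # tiers 4 and 5 combined
```

Combined figures such as "franchise quarterback or better" are simply the sum of the top two tier probabilities, as in the last line.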

And, to the point of this entire series, these results will provide the estimated probability that Cody Kessler could develop into a franchise quarterback in the NFL.

Results: Regression Results

The results of the career success model are presented in Table 1 (below). The top half of the table reveals some expected results: early-career performance is positively related to improved career outcomes. All six playing statistics variables demonstrate a positive correlation, with three demonstrating statistical significance with at least 90% confidence: completion percentage, interception rate and sack avoidance.

Table 1 - OProbit QB Regression

It may be initially surprising that neither touchdowns nor yards per completion are statistically significant. However, this outcome is the direct result of collinearity within the model. In addition to a large correlation between the two variables (r=0.476), each variable zooms to 99% statistical significance when the model is re-estimated in the absence of the other; all other passing variables similarly increase in statistical significance. Finally, the decision to cluster the standard errors by player (to account for a player appearing in the sample multiple times) also has an effect; when clustering is removed, all passing variables become statistically significant. In other words, it certainly appears that early-career passing performance—including touchdowns and yards per completion—is incredibly important in predicting career outcomes of NFL quarterbacks.
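For reference, the pairwise correlation cited above is an ordinary Pearson coefficient, which can be computed as in this sketch. The inputs would be two columns of z-scores (for example, touchdown percentage and yards per completion); the function name is illustrative, and the sketch does not reproduce the r=0.476 figure, which depends on the full sample.

```python
import math


def pearson_r(xs: list, ys: list) -> float:
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

A moderate pairwise correlation on its own does not prove collinearity is at work, which is why the re-estimation with one variable dropped is the more telling check.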

This, of course, is not a surprise.

But it is revealing that the coefficient on rushing yards per game is not statistically significant… and isn’t significant in any reasonable variation of the model. While this may not be shocking news, it does emphasize the importance of a quarterback being able to win from the pocket in the NFL; the ability of a quarterback to gain yardage with his legs does not have a statistically significant impact on a player’s ultimate career success.

The other variables have a predictable effect. Being drafted later (i.e., a higher pick number) has a negative and statistically significant effect on career success, all else equal. Players with greater track records of NFL success, as measured by accumulated CAV prior to the season, are linked with better career outcomes. Finally, while the results indicate that age and rookie status do not have a statistically significant effect on career success, collinearity is again the issue. In addition to a high correlation between the two (r=-0.528), the removal of either variable leads to the other becoming statistically significant with 95% confidence.

In sum, the model seems to be effective. The R-squared (0.1419) is low—meaning that there is a lot of variation in career outcomes that is unaccounted for in the model—but not unreasonable, the coefficients are all appropriately signed and the chi-square value indicates that the model has some level of predictive power.

Results: Player Predictions

OK, so let’s get to the good stuff.

What does this model ultimately forecast about the futures of the 2016 NFL rookie quarterback class? Table 2 (below) provides the results.

Table 2 - OProbit 2016 Rookie QB Career Predictions

Coming off a historic rookie campaign, Dallas quarterback Dak Prescott has the long-term prospects that the model clearly favors. His odds of becoming an elite quarterback (35.2%) are greater than those of the rest of the 2016 rookie quarterback class combined. Further, the model indicates that Prescott has a greater than 60% chance of producing 80 CAV or more over the course of his career. While Prescott’s estimated backup/bust percentage (9.5%) is surprisingly high, it is largely a product of the model’s penalty attached to his draft position (#135 overall). When draft position is removed from the model and re-estimated, Prescott’s estimated bust percentage falls to 4.8%; at the same time, the odds that he becomes an elite quarterback (120+ CAV) climb to 54.0%.

The estimated probabilities in Table 2 demonstrate that it is still far too early to pass judgement on the ultimate career trajectory of Carson Wentz. While the model indicates a 22.0% likelihood of the Philadelphia QB developing into an elite QB, this is counterbalanced by an 18.1% probability that he is a bust in the NFL. The uncertain career of Wentz is also reflected by the fact that the model gives at least a 17% chance that he will fall in each of the five categories. While the model may be overestimating the odds that Wentz fails to reach 25 CAV—he already has 10 CAV and the starting job in Philadelphia for the near-future—the results nevertheless indicate that any definitive proclamations about Wentz’s career trajectory at this point are incredibly premature.

The plight of Jared Goff, however, looks a bit clearer. After the fourth-worst rookie season by a quarterback since 1970 (see Part I), the model is pessimistic that the Los Angeles Rams QB will develop into an elite quarterback. While the model does not eliminate the possibility, a 5.4% likelihood is probably not what Rams GM Les Snead was hoping for when he took Goff number one overall in the 2016 draft. More glaringly, the results of Table 2 suggest that Goff has roughly a one-in-two likelihood (46.7%) of being a bust. However, before anyone closes the book on Goff’s career, it is worth remembering that the quarterback with the worst rookie season in modern NFL history, former number one pick Alex Smith, progressed to become a solid, if unspectacular, long-term starting quarterback with the 49ers and Chiefs.

Which brings us to Cody Kessler.

In short, the model seems to confirm the wisdom of the crowd when it comes to Kessler. For those who have already pegged Kessler as a long-term backup, the model does indicate a 62.7% probability that his career trajectory will follow that of a journeyman or backup QB. However, for those who think the Browns should stick with Kessler and bypass a quarterback early in the upcoming draft, the model indicates that there is better than a one-in-five chance (20.6%) that Kessler develops into a franchise quarterback or better (80+ CAV); he also appears to be a slightly better long-term bet than Jared Goff even after controlling for differences in draft position.

Oh, and there’s a 7.0% chance that Kessler really is the next Drew Brees (i.e., 120+ CAV).

Across all four quarterbacks, the draft position variable is of considerable importance in how the model interprets each player’s respective career outlook. Comparable to the aforementioned penalty on Dak Prescott’s estimated odds of career success, Kessler’s draft position in the third round (#93 overall) substantially dims the model’s estimates of his career outlook. When draft position is removed from the equation and the model is re-estimated, the results indicate that Kessler has nearly a one-in-three chance (32.0%) of becoming a franchise quarterback or better (80+ CAV). This result edges out both Carson Wentz (29.5%) and Jared Goff (9.8%), both of whose estimated probabilities in Table 2 are inflated by their original draft position.

While this alternative model may be more optimistic about Kessler’s future, it is worth remembering that the draft position variable implicitly controls for the presence (or absence) of unobservable traits that may have led each player to be drafted in their respective positions. The differences in draft positions between Wentz and Kessler, for instance, may have been driven by differing calibers of arm strength and other characteristics. If pre-draft evaluations are inaccurate (such as the concerns over Prescott’s accuracy), then models that exclude draft position may be more predictive for a specific player. However, in most situations, I would expect that including a player’s draft position will substantially improve the predictive power of the model by controlling for a quarterback’s traits that are not adequately reflected in his early-career playing performance.

As a final empirical note, estimated probabilities that a young quarterback may develop into an elite signal-caller may appear to be a bit inflated on the surface. However, one must account for the fact that many elite QBs were, in fact, quite awful in their first few years in the NFL (e.g., Donovan McNabb, John Elway). As a result, the model will not–and should not–completely dismiss the odds that a player develops into a top-flight quarterback unless their statistics are poor and their draft position is low. As a recent example, consider Jimmy Clausen, a 2010 second-round pick of the Panthers who posted a ghastly stat line in his rookie season (299 attempts, 52.5% completion, 3 TD, 9 INT, 9.9% sack). This combination would have led the model to predict that Clausen had less than a 0.3% chance of developing into an elite QB at that point.


So, just what do the Browns have in Cody Kessler?

The unsatisfying answer, unfortunately, is that it’s still too early to tell. Yes, there is a greater than 60% likelihood—according to the model advanced in this study—that Kessler’s career trajectory will match that of Colt McCoy, Charlie Frye and other Browns journeymen and backup quarterbacks. But when paired with the favorable comps for Kessler from Part II of this series—Teddy Bridgewater, Andy Dalton and Drew Brees—it is apparent that there is some potential that Kessler’s career takes quite a different turn and he develops into something substantially more than that.

In other words, this whole three-part series may just have confirmed what most of us Cleveland fans thought in the first place.

While this outcome may seem like a disappointment on the surface, I actually find the confirmation of accepted wisdom to be incredibly exciting. This is because my underlying objectives for this three-part series were actually much, much larger than simply trying to figure out the possible career trajectories of Cody Kessler.

Let me explain.

In the baseball sabermetrics community, generating statistical comparables for individual players is old news: Bill James introduced similarity scores in the 1980s. More recent work by Chris Mitchell at FanGraphs uses statistical modeling to estimate the probabilities attached to various career trajectories of minor-league baseball players. These efforts provide valuable perspective on what to expect—and how to evaluate—young players in baseball based on early-career performance.

But, until now, similar work in football has been practically nonexistent.

The only known work on either of these two questions has come from Pro Football Reference and its development of similarity scores. However, unlike the system used to evaluate baseball players, the approach used to examine NFL players is way too simplistic to be of much interest. By only comparing veteran players’ season-to-season AV—the one-number, end-of-season score—this method makes no attempt to analyze how two players might have compiled those numbers. As a result, this approach produces lists of player comparisons that are misguided and, at times, laughable (e.g., Michael Vick’s top comparison is Joe Theismann).

And did I mention that Pro Football Reference seemingly requires six professional seasons before they publish similarities?

This is not to criticize the folks at Pro Football Reference. Not only is the site a wonderful resource (it was instrumental in this three-part series), but they at least tried to look at this question and openly acknowledged their methodological shortcomings. But, until now, the problem has remained: there has not been a meaningful, systematic approach available to the public, at least of which I am aware, that offers any historical perspective on young players in the NFL and their potential career trajectories as intimated by their physical characteristics, draft position and components of early-career performance. As a result, NFL fans are relatively in the dark about making player projections when compared to their counterparts in the baseball community.

I am (hopefully) not arrogant enough to presume that this three-part series on Cody Kessler has definitively solved this problem. But the consistent reasonableness of the outcomes in Parts II and III suggests that the approaches advanced in these studies show considerable promise in addressing these questions. In doing so, it is hoped that this work has brought football analysis just a little bit closer to the advancements made in sabermetrics over the last three decades.

So, as a final recap, what did this series do?

Part I: Established an empirical methodology to rank quarterback seasons across eras.

Part II: Developed a methodology to determine the most appropriate comparables for quarterbacks across eras.

Part III: Built a predictive model of career trajectories for young quarterbacks based on physical characteristics, draft position and early-career performance.

While these studies may not have solved the riddle of Cody Kessler’s time in Cleveland, or made the Browns’ decision to draft a quarterback any clearer, I am excited by the potential that these studies offer in the evaluation of quarterbacks in the NFL. I don’t profess to be done tinkering with these methods; I may add more variables to the methods advanced in Parts I & II, and the collinearity issue in Part III remains an open question with me. However, even in their current form, I can’t wait to unleash these tools on other NFL quarterback questions as I continue to develop this site, Cleveland Sports Economics.