Replacing Passer Rating: Introducing RIPPEN

I’ve always hated passer rating.  It’s on a bizarre scale (maxes out at 158.3 in the pros), it weights passing touchdowns too heavily (imho), and it doesn’t mean anything (what is 120 vs 80? No one knows).  So, I’ve been thinking about this for a while, and I’ve finally come up with a replacement statistic for passer rating (this replacement, unlike the refs, is a good thing).  And so, in the spirit of great statistical sports acronyms like PECOTA, it is with great pride that I introduce to the world the Rush Independent Passing Player Efficiency Number, or RIPPEN (pretty good, huh?).

So what is it?

The basic idea

In a simulated world, how well would an offense perform if it started every drive from its own 20 yard line and only ran passing plays with this quarterback’s stats?  In this simulated world, a drive ends with a touchdown (7 points), a field goal (3 points), or a turnover (0 points).  The defense never scores.  RIPPEN is the expected number of points that an offense will score in 10 possessions (approximately one game) running only passing plays based on the specific quarterback’s statistics.

Some Details

First, we need to create a data matrix of the results of every passing play for a quarterback.  This data matrix will have three columns: interceptions, completions, and yards.  The interceptions column contains zeroes and ones for no interception and interception, respectively.  Likewise, the completions column contains zeroes and ones for incomplete and complete passes, respectively.  Finally, the yards column contains the number of yards gained on a completed pass.
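
To make the layout concrete, here is a minimal sketch of that matrix in Python.  The plays are made up for illustration (they are not any real quarterback’s numbers), and the variable names are just placeholders.

```python
# Hypothetical passing plays for one quarterback (invented for illustration).
# Each row is (interception, completion, yards), one row per pass attempt.
plays = [
    (0, 1, 12),  # complete for 12 yards
    (0, 0, 0),   # incomplete
    (1, 0, 0),   # intercepted
    (0, 1, 7),   # complete for 7 yards
    (0, 1, -2),  # complete for a 2 yard loss
    (0, 0, 0),   # incomplete
]

# The usual summary stats fall straight out of the three columns.
attempts = len(plays)
interceptions = sum(row[0] for row in plays)
completions = sum(row[1] for row in plays)
yards = sum(row[2] for row in plays if row[1] == 1)

completion_pct = 100.0 * completions / attempts
yards_per_attempt = yards / attempts
```

One row per attempt keeps the full distribution of gains, not just the averages, and that distribution is what the drive simulation draws from.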

Now this data is used to simulate a drive that starts from a team’s own 20 yard line and results in either 0, 3, or 7 points.  7 points are scored if the offense amasses at least 80 yards before getting to a fourth down situation or throwing an interception.  3 points are scored only if the team gets to a 4th down inside the opponent’s 40 yard line and makes the field goal.  The chance of making the FG drops off in proportion to the distance from the goal posts, ranging from 100% for a 0 yard FG try to 50% for a FG try from the 40.  0 points are scored if an interception occurs, a fourth down occurs outside of the opponent’s 40 yard line, or a field goal is missed.  A game is simulated as 10 possessions, and a game is simulated 10000 times.  The average score of these game simulations is RIPPEN.
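
Here is a rough Python sketch of the drive rules described above.  It is not the actual RIPPEN code, and it fills in several details with assumptions of its own: each play is drawn at random (with replacement) from the quarterback’s play matrix, a first down takes 10 yards, the FG probability is a straight line from 100% at 0 yards to 50% at 40 yards, “inside the 40” includes the 40 itself, and there are no safeties.  All function names are hypothetical.

```python
import random

def fg_probability(yards_to_goal):
    # Assumption: linear from 100% at 0 yards to 50% at 40 yards.
    return 1.0 - 0.5 * (yards_to_goal / 40.0)

def simulate_drive(plays, rng):
    """Return 0, 3, or 7 points for one drive starting at the offense's own 20."""
    yards_to_goal = 80
    down, to_first = 1, 10
    while True:
        # Assumption: resample one observed play at random, with replacement.
        intercepted, completed, yards = rng.choice(plays)
        if intercepted:
            return 0
        gained = yards if completed else 0
        yards_to_goal -= gained  # losses are applied as-is; no safeties modeled
        if yards_to_goal <= 0:
            return 7
        to_first -= gained
        if to_first <= 0:
            down, to_first = 1, min(10, yards_to_goal)
        else:
            down += 1
        if down == 4:
            # Fourth down: try a field goal if close enough, otherwise no points.
            if yards_to_goal <= 40 and rng.random() < fg_probability(yards_to_goal):
                return 3
            return 0

def rippen(plays, n_games=10000, drives_per_game=10, seed=0):
    """Average points per simulated game of `drives_per_game` possessions."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_games):
        total += sum(simulate_drive(plays, rng) for _ in range(drives_per_game))
    return total / n_games

# Made-up play matrix: rows are (interception, completion, yards).
sample_plays = [(0, 1, 12), (0, 0, 0), (1, 0, 0), (0, 1, 7), (0, 1, 25), (0, 0, 0)]
estimate = rippen(sample_plays, n_games=2000)
```

The seed is fixed only so that repeated runs give the same number; with 10000 games the estimate should be stable enough either way.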

Advantages

RIPPEN is easily interpretable

RIPPEN means something.  Ponder’s RIPPEN after three weeks is 29.167.  This means that if you played simulated games over and over again with just him passing every play, that team would score about 29.167 points per game on offense, on average.  That means something.  Meanwhile, his passer rating is 104.9.  Is that good?  What does that mean?  Who knows?

Further, RIPPEN is essentially on the same scale as points scored in a football game.  Theoretically, the upper limit of RIPPEN is 70 (a touchdown on every one of the 10 drives in every simulated game) and the lower limit is 0 (a team never scores on any drive in any simulated game).  This is much better than 0 to 158.3.
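
For comparison, the 158.3 ceiling comes from the standard NFL passer rating formula: four components (completions, yards, touchdowns, interceptions, all per attempt), each clamped between 0 and 2.375, then averaged and scaled.  A short Python sketch, with function and argument names of my own choosing:

```python
def passer_rating(attempts, completions, yards, touchdowns, interceptions):
    """NFL passer rating; each of the four components is clamped to [0, 2.375]."""
    def clamp(x):
        return max(0.0, min(2.375, x))

    a = clamp((completions / attempts - 0.3) * 5)      # completion percentage
    b = clamp((yards / attempts - 3) * 0.25)           # yards per attempt
    c = clamp((touchdowns / attempts) * 20)            # touchdowns per attempt
    d = clamp(2.375 - (interceptions / attempts) * 25) # interceptions per attempt
    return (a + b + c + d) / 6 * 100

# All four components maxed out: 4 * 2.375 / 6 * 100.
best = passer_rating(attempts=10, completions=10, yards=125, touchdowns=2, interceptions=0)
```

So the top of the scale is just 4 × 2.375 / 6 × 100, which is not a number anyone would pick on purpose.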

RIPPEN ignores passing touchdowns

Passing for a touchdown is more a function of what types of plays the offensive coordinator is calling than how good you are.  If you’re driving your team down the field consistently and then punching in TDs from 2 yards out on a rush up the middle, the QB should not be punished for this.  Likewise, the QB shouldn’t be rewarded for multiple short passing touchdowns.

Disadvantages

RIPPEN ignores rushing yards

Quarterbacks like Michael Vick and RGIII are also a threat to gain yards on the ground, and RIPPEN doesn’t take that into account.  This can be viewed as a disadvantage.  However, by removing rushing yards, quarterbacks can be compared on a level playing field (hence the RI in RIPPEN), so it’s not a huge disadvantage.  Still, Vick and RGIII are going to appear lower in these ratings than they would if this were attempting to be a total QB rating and not just a passer rating.  And besides, traditional passer rating ignores rushing yards entirely, too.  The next iteration of RIPPEN will incorporate rushing yards somehow, but I’m not quite sure exactly how yet.

Results

Finally, here are the rankings after three weeks of games.

Player              Team  RIPPEN  QB rating
Ponder, C.          MIN   29.167  104.9
Brady, T.           NE    27.468   97.0
Manning, E.         NYG   27.292   97.1
Ryan, M.            ATL   26.950  114.0
Schaub, M.          HOU   24.764  102.4
Griffin, III, R.    WAS   24.491  103.5
Smith, A.           SF    24.108  102.7
Dalton, A.          CIN   23.890  105.0
Kolb, K.            ARI   23.139  108.6
Roethlisberger, B.  PIT   22.946  109.2
Newton, C.          CAR   22.804   78.3
Flacco, J.          BAL   21.939  101.1
Romo, T.            DAL   21.201   89.3
Manning, P.         DEN   19.626   85.6
Locker, J.          TEN   17.651   91.9
Rodgers, A.         GB    17.505   87.0
Stafford, M.        DET   16.568   83.5
Luck, A.            IND   16.344   75.4
Bradford, S.        STL   15.841   85.4
Sanchez, M.         NYJ   15.786   78.3
Cassel, M.          KC    15.545   73.8
Rivers, P.          SD    15.150   86.5
Vick, M.            PHI   14.258   66.3
Brees, D.           NO    14.201   77.0
Palmer, C.          OAK   12.454   89.3
Wilson, R.          SEA   12.335   86.2
Fitzpatrick, R.     BUF   11.667   95.2
Tannehill, R.       MIA   10.571   58.3
Freeman, J.         TB    10.466   71.4
Cutler, J.          CHI   10.096   58.6
Gabbert, B.         JAX    9.358   85.8
Weeden, B.          CLE    8.044   60.7

Surprisingly (or maybe not), I’ve got Christian Ponder ranked number one.  I thought this had to be a mistake, but it actually makes sense.  Go look at his numbers.  He hasn’t thrown an interception yet.  And his completion percentage is 70.1%.  I think RIPPEN is doing a good job here.

Another interesting case is Cam Newton.  His QB rating is 78.3, which puts him well below Blaine Gabbert, for instance, on that scale.  But a big advantage of Newton is that, while his completion percentage is only 63.6%, his yards per attempt is 9.6, the best of all the regular starters.  This means that Newton is completing a lot of long passes, which are useful for scoring touchdowns in both a simulated game and a real game.  Think about it: if you completed a 50 yard pass every three plays, your completion percentage would be 33.3%, but you’d score on every drive.  QB rating would give you almost no credit on the completion percentage piece.  But big plays are a big help when you’re trying to score points.  So, I’d argue QB rating is underestimating Cam Newton.
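
The 50 yard example is easy to check by hand, or with a few lines of Python.  This sketch assumes the usual 10 yards for a first down and a fixed, repeating pattern of two incompletions followed by a 50 yard completion.  The function name is just for illustration, and the pattern only handles non-negative gains (0 means incomplete).

```python
from itertools import cycle

def run_pattern_drive(pattern, yards_to_goal=80):
    """Play one drive from a repeating list of yards gained per attempt."""
    down, to_first = 1, 10
    attempts = completions = 0
    for gained in cycle(pattern):
        attempts += 1
        if gained > 0:
            completions += 1
        yards_to_goal -= gained
        if yards_to_goal <= 0:
            return "touchdown", attempts, completions
        to_first -= gained
        if to_first <= 0:
            down, to_first = 1, min(10, yards_to_goal)
        else:
            down += 1
        if down == 4:
            return "fourth down", attempts, completions

# Incomplete, incomplete, 50 yards; repeat.
result, attempts, completions = run_pattern_drive([0, 0, 50])
completion_pct = 100.0 * completions / attempts
```

From the 20, the first completion moves the ball to the opponent’s 30 and the second one scores: six attempts, two completions, one touchdown.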

On the other end of the spectrum, it appears that one of the more over-rated QBs this year is Ryan Fitzpatrick.  His completion percentage is under 60% (58.1%, to be exact), which is pretty bad, but he’s being held up in QB rating because he’s thrown 8 TDs this year.  That ties him for first in the league with Matt Ryan and Ben Roethlisberger.  But their completion percentages are both at least 10 percentage points higher than his.  Fitzpatrick is over-rated by QB rating.

A clear triumph of RIPPEN is the case of Blaine Gabbert.  Gabbert really isn’t that good, but he is the only other starting quarterback who has not thrown an interception yet this year, while throwing for four touchdowns.  Along with his other stats, this gives him a QB rating of 85.8.  Here is a list of players who have a lower QB rating than Mr. Gabbert so far this season: Peyton Manning, Sam Bradford, Matt Stafford, Mark Sanchez, Cam Newton, Drew Brees, Andrew Luck, Matt Cassel, Josh Freeman, Michael Vick, Brandon Weeden, Jay Cutler, and Ryan Tannehill.  That’s thirteen QBs and I don’t buy it.  Anyone who’s been following him even remotely knows he’s having a terrible start to the year even without throwing any interceptions.  For instance, his completion percentage is 50.6%, which is worse than everyone in the league except for Mark Sanchez at 50.5% (!).  RIPPEN does a nice job reflecting this by putting Gabbert 31st in the league, above only Brandon Weeden and his league leading 6 interceptions.

While I have omitted him from the RIPPEN table above, I’ve calculated John Skelton’s RIPPEN for his appearance in Arizona’s first game of the season:  8.527.  Only slightly better than Brandon Weeden.

Lastly, here is a plot of the probability that a quarterback scores a simulated touchdown vs the probability that his team scores a simulated field goal.  You can see that Cam Newton is going to score a lot of touchdowns.  He’s up there with Dalton, RGIII, and Schaub.  The difference is that his simulated drives are ending in zero points much more often.  On the other hand, there are players like Wilson and Palmer who are leading their teams to three points about as often as Locker, Flacco and RGIII, but they are scoring many fewer touchdowns.
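
The two axes of that plot tie straight back to RIPPEN: since every drive is worth 7, 3, or 0 points, the expected score over 10 drives is 10 × (7 × P(touchdown) + 3 × P(field goal)).  A small Python sketch, using a made-up list of drive results rather than real simulation output (the function names are hypothetical):

```python
from collections import Counter

def outcome_shares(drive_points):
    """Share of drives ending in a touchdown (7) and in a field goal (3)."""
    counts = Counter(drive_points)
    n = len(drive_points)
    return counts[7] / n, counts[3] / n

def points_per_game(p_td, p_fg, drives_per_game=10):
    # Expected points per drive, times the number of drives in a game.
    return drives_per_game * (7 * p_td + 3 * p_fg)

# Invented results for ten simulated drives.
drives = [7, 0, 3, 0, 7, 0, 0, 3, 0, 0]
p_td, p_fg = outcome_shares(drives)
expected = points_per_game(p_td, p_fg)
```

This is why two quarterbacks with the same field goal rate can sit far apart in the rankings: a touchdown counts more than twice as much as a field goal.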

There are some interesting clusterings, too.  Ryan and Brady are right next to each other in the upper right hand corner, Palmer and Wilson are very close, and there is a cluster in the middle of the plot consisting of Bradford, Sanchez, and Stafford all right on top of each other.  Interesting.  That is all for now.

Cheers.

Posted on September 27, 2012, in Football, NFL, Sports. 19 Comments.

  1. Aloha Friend! It’s an interesting concept, though to me it LACKS the fluid ADVERSITY, which is basically football’s underlying principle. Correct?

  2. I should have explained: by that I meant that there will ALWAYS be ADJUSTMENTS made by teams in given situations, and those adjustments become MORE REFINED as the player spends MORE time in the league.

  3. Just a technical comment, how do you source all the data from each game so efficiently? Do you maintain your own database of QB performance or source it from online? Do you do your simulations in R?

    Cheers,
    Jim.

    • I’m scraping play by play data from football-reference.com using the XML package in R. Once play by play data is pulled in, I pull out all the passing plays. From this I remove plays that were voided because of penalties and some other strange plays. What is left behind is used to pull out the name of the quarterback, interception information, completion information, and yardage information. This information is stored in an R list containing one element for each quarterback. For each quarterback, another list is used to store a matrix for each game containing interception, completion, and yardage information. This can be used to pull out specific games or aggregate over a season for a quarterback. This is then saved as an .RData file, which gets updated every week.

      The simulations are run in R.

      Hope this answers your questions.

      Cheers,
      Greg

      • Hi Greg,

        That’s perfect, thanks for your reply! I have also been very wary of QB ratings. I remember when I learned of football (I am from the UK) I thought it was strange that you could get a perfect QB rating even with incomplete passes. I have run simulations before looking at the dynamics of the QB rating equation across different passing percentages, numbers of touchdowns, etc., and always found its behaviour very strange.

        I do like your RIPPEN at first glance though!

        Cheers,
        Jim.

      • One more thing: I appreciate this is more of an R question, so feel free to ignore, but do you manually go to each game’s box score, or do you have code that can loop through each game? I assume the latter requires R being able to open webpages etc.? Still getting to grips with R, but really want to look into this.

        Cheers,
        Jim.

      • I have written R code that loops through each team and week and pulls out the box score. I update what week of the season we are in and just run the code and it does everything. R is pretty good.

  4. I don’t suppose you are willing to share that code? Or, have you published it somewhere?
