Tuesday, December 22, 2015

What is PER exactly and what does it measure?

According to John Hollinger, the player efficiency rating (PER) is a rating of a player's per-minute productivity. However, many basketball analysts use it as the best single measure of a player's overall quality of play. PER is much more advanced than the simple Efficiency rating and takes more into account than raw box score stats. PER tries to capture the effectiveness of what a player is doing while he's on the floor. PER has many pros and cons as an end-all/be-all stat, but first I want to dive into exactly how PER is calculated. John Hollinger's database only goes back to the 1988-89 season, but PER can be calculated all the way back to when the league started using the method described at http://www.basketball-reference.com/about/per.html. I used statistics as of December 19th, 2015 (I know it's a little later than that now), and for simplicity, I rounded the calculations to 4 decimal places.

So, without further ado, I want to break down PER step by step, so the casual fan can understand what exactly PER means. Here is the formula for unadjusted PER (uPER):
uPER = (1 / MP) *
     [ 3P 
     + (2/3) * AST
     + (2 - factor * (team_AST / team_FG)) * FG
     + (FT *0.5 * (1 + (1 - (team_AST / team_FG)) + (2/3) * (team_AST / team_FG)))
     - VOP * TOV
     - VOP * DRB% * (FGA - FG)
     - VOP * 0.44 * (0.44 + (0.56 * DRB%)) * (FTA - FT)
     + VOP * (1 - DRB%) * (TRB - ORB)
     + VOP * DRB% * ORB
     + VOP * STL
     + VOP * DRB% * BLK
     - PF * ((lg_FT / lg_PF) - 0.44 * (lg_FTA / lg_PF) * VOP) ]
As seen here, the variables VOP, factor, and DRB% are not regular statistics found in traditional box scores.
The formulas for these statistics are also given at http://www.basketball-reference.com/about/per.html.

factor = (2 / 3) - (0.5 * (lg_AST / lg_FG)) / (2 * (lg_FG / lg_FT))
VOP    = lg_PTS / (lg_FGA - lg_ORB + lg_TOV + 0.44 * lg_FTA)
DRB%   = (lg_TRB - lg_ORB) / lg_TRB 
As seen in the formulas for all three factors, the terms preceded by lg_ are league averages of that statistic.
To find these league averages, I either used a published league-average stat or took the league total for that statistic and divided it by the average number of games each team has played.
VOP (Value of Possession) is crucial for figuring out the average number of points scored per possession across the league.
VOP (Value of Possession) =  lg_PTS / (lg_FGA - lg_ORB + lg_TOV + 0.44 * lg_FTA)
The league average for points per game is listed on Basketball-Reference as 100.8 as of 12/19/2015, so
VOP = 100.8 / [(2160/26) - (267/26) + (386/26) + 0.44 * (599/26)]
    = 100.8 / [83.0769 - 10.2692 + 14.8462 + 10.1369]
    = 100.8 / 97.7908
    = 1.030772
The 2160 is the league FGA total, and the denominator of 26 is the average number of games played by all teams as of 12/19/2015.
The 267, 386 and 599 are the league totals for ORB, TOV and FTA, respectively.
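The VOP arithmetic above can be double-checked with a few lines of Python. This is only a quick sketch using the league figures quoted in this post; the function and variable names are my own, not anything from Basketball-Reference.

```python
# League figures quoted in this post (as of 12/19/2015).
# Names are illustrative; totals are divided by games to get per-game averages.
GAMES = 26           # average number of games played per team
LG_PTS = 100.8       # league-average points per game
LG_FGA = 2160 / GAMES
LG_ORB = 267 / GAMES
LG_TOV = 386 / GAMES
LG_FTA = 599 / GAMES


def value_of_possession(pts, fga, orb, tov, fta):
    """Average points scored per possession across the league."""
    possessions = fga - orb + tov + 0.44 * fta
    return pts / possessions


vop = value_of_possession(LG_PTS, LG_FGA, LG_ORB, LG_TOV, LG_FTA)
print(round(vop, 6))
```

The 0.44 multiplier on free throw attempts is the same one that appears in the VOP formula; it is there because not every free throw attempt uses up a possession.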
The values for factor and DRB% are calculated the same way we calculated VOP.
Factor = (2/3) - (0.5 * (lg_AST / lg_FG)) / (2 * (lg_FG / lg_FT))
Factor = (2/3) - [0.5 * ((556/26) / (960/26))] / [2 * ((960/26) / (454/26))]
       = (2/3) - [.5 * (21.3846/36.9231)] / [2 * (36.9231/17.4615)]
       = (2/3) - [.5 * (.5792)] / [2 * (2.1145)]
       = (2/3) - (.2896/4.229)
       = (2/3) - (.0685)
       = .5982
Factor is calculated in order to find the amount of "assists" that should be counted towards PER. Since free throws are not counted as made field goals, this value is subtracted from 2/3.
DRB% = (lg_TRB - lg_ORB) / lg_TRB
DRB% = (1,128 - 267) / 1,128 = .7633
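The same kind of check works for factor and DRB%. A minimal sketch, again with my own function names and the league totals quoted in this post; because both formulas only use ratios, dividing every total by 26 games cancels out and the raw totals can be used directly.

```python
# League totals quoted in this post (as of 12/19/2015).
LG_AST, LG_FG, LG_FT = 556, 960, 454
LG_TRB, LG_ORB = 1128, 267


def assist_factor(lg_ast, lg_fg, lg_ft):
    """The 'factor' term from the uPER formula."""
    return (2 / 3) - (0.5 * (lg_ast / lg_fg)) / (2 * (lg_fg / lg_ft))


def defensive_rebound_pct(lg_trb, lg_orb):
    """Share of all rebounds that are defensive rebounds (DRB%)."""
    return (lg_trb - lg_orb) / lg_trb


factor = assist_factor(LG_AST, LG_FG, LG_FT)
drb_pct = defensive_rebound_pct(LG_TRB, LG_ORB)
print(round(factor, 4), round(drb_pct, 4))  # prints 0.5982 0.7633
```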
So, now that we have values for these three variables, we can calculate the actual unadjusted PER of Stephen Curry of the Golden State Warriors, who was the leader in PER on 12/19/2015.
 
These statistics are all taken from http://www.basketball-reference.com/?lid=homepage_logo.
uPER = (1 / MP)   (to calculate on a per-minute basis)
* [ 3P   (accounts for the extra point on a made three, since the FG term below counts every make as two points)
+ (2/3) * AST   (because Hollinger thought an assist was worth 2/3 of a point)
+ (2 - factor * (team_AST / team_FG)) * FG   (calculates the impact that the player's field goals have on PER)
+ (FT * 0.5 * (1 + (1 - (team_AST / team_FG)) + (2/3) * (team_AST / team_FG)))   (calculates the impact that the player's free throws have, while factoring in the share of the team's field goals that are assisted)
- VOP * TOV   (subtracts the value of a turnover)
- VOP * DRB% * (FGA - FG)   (subtracts the value of a missed shot)
- VOP * 0.44 * (0.44 + (0.56 * DRB%)) * (FTA - FT)   (subtracts the value of missed free throws)
+ VOP * (1 - DRB%) * (TRB - ORB)   (adds the value of a defensive rebound)
+ VOP * DRB% * ORB   (adds the value of an offensive rebound)
+ VOP * STL   (adds the value of a steal)
+ VOP * DRB% * BLK   (adds the value of a block, while factoring in that an offensive rebound may occur)
- PF * ((lg_FT / lg_PF) - 0.44 * (lg_FTA / lg_PF) * VOP) ]   (subtracts the value of a foul, while factoring in the value of the free throws for the opposing team)
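For anyone who would rather not do this by hand, the uPER formula can be written as a single function. This is a rough sketch of the Basketball-Reference formula shown earlier, not Hollinger's own code; the function name, the dictionary layout and the argument names are my own assumptions. The inputs in the usage lines are the Stephen Curry, Golden State and league figures used in this post, along with the rounded VOP, factor and DRB% values from above.

```python
def unadjusted_per(p, team_ast, team_fg, lg_ft, lg_fta, lg_pf, vop, factor, drb_pct):
    """uPER from one player's box-score totals (dict keyed by stat abbreviation)."""
    ast_ratio = team_ast / team_fg  # share of team field goals that were assisted
    total = (
        p["3P"]
        + (2 / 3) * p["AST"]
        + (2 - factor * ast_ratio) * p["FG"]
        + p["FT"] * 0.5 * (1 + (1 - ast_ratio) + (2 / 3) * ast_ratio)
        - vop * p["TOV"]
        - vop * drb_pct * (p["FGA"] - p["FG"])
        - vop * 0.44 * (0.44 + 0.56 * drb_pct) * (p["FTA"] - p["FT"])
        + vop * (1 - drb_pct) * (p["TRB"] - p["ORB"])
        + vop * drb_pct * p["ORB"]
        + vop * p["STL"]
        + vop * drb_pct * p["BLK"]
        - p["PF"] * ((lg_ft / lg_pf) - 0.44 * (lg_fta / lg_pf) * vop)
    )
    return total / p["MP"]


# Stephen Curry's totals as used in this post (as of 12/19/2015).
curry = {
    "MP": 902, "3P": 129, "AST": 158, "FG": 276, "FGA": 530,
    "FT": 151, "FTA": 168, "TOV": 96, "TRB": 135, "ORB": 16,
    "STL": 56, "BLK": 3, "PF": 51,
}

# League totals are fine here because only their ratios are used.
curry_uper = unadjusted_per(
    curry, team_ast=748, team_fg=1104,
    lg_ft=454, lg_fta=599, lg_pf=527,
    vop=1.030772, factor=0.5982, drb_pct=0.7633,
)
print(round(curry_uper, 4))
```

Note that the result is a per-minute number well below 1; it only lands on the familiar PER scale after the pace adjustment and the normalization to a league average of 15.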
 
That's a damn lot to swallow. I thought I should probably calculate it through Excel, but I said to hell with it and decided to calculate it manually to show the calculation line by line.
 
 
(1/MP) = (1/902) = .0011086474
* [ 3P = 129
+ (2/3) * 158 = 105.3333
+ [2 - .5982 * (748/1,104)] * 276 = 1.594698 * 276 = 440.1366
+ 151 * .5 * [1 + (1 - 748/1,104) + (2/3) * (748/1,104)] = 75.5 * 1.774155 = 133.9487
- 1.030772 * 96 = 98.9541
- 1.030772 * .7633 * (530 - 276) = 199.8442
- 1.030772 * .44 * [.44 + (.56 * .7633)] * (168 - 151) = 6.6882
+ 1.030772 * (1 - .7633) * (135 - 16) = 29.0341
+ 1.030772 * .7633 * 16 = 12.5886
+ 1.030772 * 56 = 57.7232
+ 1.030772 * .7633 * 3 = 2.3604
- 51 * [(454/26)/(527/26) - .44 * ((599/26)/(527/26)) * 1.030772] = 51 * [(17.4615/20.2692) - .44 * (23.0385/20.2692) * 1.030772] = 51 * (.861480 - .515503) = 51 * .345977 = 17.6448 ]
So all these values are added and subtracted, and the total is multiplied by 1/Minutes Played to get the unadjusted PER.
 129
+105.3333
+440.1366
+133.9487
-98.9541
-199.8442
-6.6882
+29.0341
+12.5886
+57.7232
+2.3604
-17.6448
= 586.9936
586.9936 * (1/902) = .6508 unadjusted PER for Stephen Curry
However, since Golden State is one of the fastest-paced teams in the NBA, there needs to be an adjustment for the number of possessions they get.
Pace adjustment = lg_pace / team_pace = 96.1/99.7 = .96389
.96389 * .6508 = .6273 is the pace-adjusted PER of Stephen Curry
To land on the familiar PER scale, where the league average is 15, this value still has to be multiplied by 15/(lg_PER). That would mean having to calculate everyone's unadjusted PER, which would mean me wasting more hours of my life.
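The pace adjustment and the normalization step can be sketched as two small functions. The pace numbers are the ones quoted in this post; lg_aper (the league-average pace-adjusted PER) is a hypothetical input here, since this post does not calculate it, and the function names are my own.

```python
def pace_adjusted(uper, lg_pace, team_pace):
    """Scale unadjusted PER by league pace over team pace."""
    return uper * (lg_pace / team_pace)


def normalized_per(aper, lg_aper):
    """Rescale so that a league-average player comes out at exactly 15.

    lg_aper is assumed to be supplied by the caller; it requires computing
    the pace-adjusted PER of every player in the league.
    """
    return aper * (15 / lg_aper)


# Golden State's pace factor, using the league and team pace from this post.
pace_factor = 96.1 / 99.7
print(round(pace_factor, 5))  # prints 0.96389
```

Because the factor is below 1, a fast-paced team like Golden State has its players' per-minute numbers scaled down slightly, while players on slow teams get a boost.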

Now, after learning what goes into calculating PER, it's easier to think about what the strengths and weaknesses of the statistic are. 
 

Strengths 

- Shows percentages other than FG% to show how efficient a player actually is with his shooting
- Gives different weights for certain statistics according to their helpfulness/efficiency
- Takes all statistics with a relative value to a possession
- Factors in offensive rebounds as a possibility from a missed shot or block, which gives the possibility of a second possession
Even though the stat has many positives, I believe there are actually many more negatives.
Weaknesses 
- Other than blocks and steals, doesn't account for defensive accomplishments
- For players who get fewer minutes but have a high usage rate (e.g., Victor Oladipo lately), the numbers are skewed in their favor because of the per-minute bias
- High-volume shooters are rewarded for taking more shots as long as they shoot above a certain percentage (my estimate was 34%)
- Fouls, in my opinion, are included as an unnecessary subtraction, since the formula doesn't account for "giving up an easy field goal." Therefore, I don't think fouls, which give up free throws, are necessary to include
- The defensive statistics that are included actually show that a player gambles more often than not, and they should not be the main statistics used in deciding whether a player is a good defensive player or not.
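The break-even shooting percentage behind the high-volume shooter point can be estimated straight from the uPER formula. In that formula a made two-pointer adds (2 - factor * team_AST / team_FG) and a missed shot costs VOP * DRB%, so a shot helps a player's PER whenever his FG% is above cost / (gain + cost). The sketch below assumes two-point attempts only and ignores threes and free throws; the function name is my own, and the inputs are the values used earlier in this post.

```python
def break_even_fg_pct(vop, drb_pct, factor, team_ast, team_fg):
    """FG% on two-point attempts at which extra shots stop hurting uPER."""
    gain_per_make = 2 - factor * (team_ast / team_fg)
    cost_per_miss = vop * drb_pct
    return cost_per_miss / (gain_per_make + cost_per_miss)


# League VOP, DRB% and factor from this post, with Golden State's assist ratio.
threshold = break_even_fg_pct(1.030772, 0.7633, 0.5982, team_ast=748, team_fg=1104)
print(round(threshold, 3))
```

With these inputs the threshold comes out at roughly 33%, in the same neighborhood as the 34% estimate above. It moves a little from team to team because it depends on how many of the team's field goals are assisted.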

Nevertheless, I hope I explained PER well, and here are the current leaders in PER.