EDIT NUSS: Just a light read for you on a Sunday morning. Enjoy.
After reading a post on FanGraphs about NERD, I figured that I would try to make my own version for college basketball. The point of NERD is to use statistics about individual batters, pitchers, or whole teams in order to determine the excitement that they provide. That being said, let me introduce you to PEZ (Predicted Excitement based on Z-scores).
I started out by posting a survey online asking about various stats that make a game fun to watch (high scoring, fouls, teams involved, etc.). Out of the 21 options that I provided, along with a space for other ideas, I chose to incorporate the stats that make a game more interesting and also the ones that make a game quite boring. I have not found a way to use everything in exactly the way that I want so far, but I feel like this is a pretty good first draft. I will discuss the elements involved first, present some of the results second, and then finish up with some commentary.
EDIT: As pointed out by Jeff, I figured that I should include a math glossary. Hopefully these four help.
1. Mean - everyone probably knows this, but just to be safe, a mean is the "average" of a population. It is found by adding up all of the numbers and dividing by the population size.
2. Standard deviation - This is a measure of how much the population deviates from the mean. If the numbers are very spread out (1, 5, 20, 79, 112, 300, 500,...) then the standard deviation will be high. According to the empirical rule for a normal (bell curve) distribution, 68% of numbers fall within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3.
3. Z-score - the z-score is a measure of how many standard deviations the number is away from the mean. It is calculated by subtracting the mean from the number and then dividing by the standard deviation. Example: Say the mean is 12 and the standard deviation is 3. How many standard deviations away from the mean is 9? (9-12)/3 = -1. So 9 is 1 standard deviation below the mean (negative z-scores mean that the statistic is less than the mean).
4. Cumulative distribution - the area under the normal curve (bell curve) up to the specific z-score (think integrals if you like calculus, think about running away screaming if you don't like math).
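For anyone who wants to check the glossary by hand, here is a minimal Python sketch of the z-score and the cumulative distribution. The function names `z_score` and `normal_cdf` are made up for this illustration, and the sketch assumes a standard normal curve.

```python
import math

def z_score(value, mean, std_dev):
    """How many standard deviations `value` sits from `mean`."""
    return (value - mean) / std_dev

def normal_cdf(z):
    """Area under the standard normal curve up to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# The glossary example: mean 12, standard deviation 3, value 9.
z = z_score(9, 12, 3)
print(z)                        # -1.0
print(round(normal_cdf(z), 4))  # 0.1587
```

So a z-score of -1 sits ahead of roughly 16% of a normally distributed population.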
High scoring games are more fun to watch. It's as simple as that. No one likes to watch a game where the ball is just passed around and then turned over so that the other team can pass the ball around. PTSZ is the z-score of each team's points per game average compared to the league average.
Ending a possession with a layup is like ending a sentence with a period. The three pointer is versatile. It can be used to fuel a comeback, put a team away, or to inflate your stats so that you can win the National Player of the Year Award when you were not deserving of it...cough Jimmer cough cough. Just like with PTSZ, 3PMZ measures the z-score of a team based on the average 3PM per game compared to the league average.
The more blocks that your team gets, the more fun it will be to watch because the other team is getting embarrassed. Same method here: BLKZ is the z-score of a team's blocks per game compared to the league average.
Pace is measured by possessions per 40 minutes. Making a generalization, a faster pace leads to more points, more blocks, more of anything that makes a game interesting. Sticking with the trend, PACEZ is the z-score of a team's pace compared to the league average.
Some people like really physical games. Those people should watch hockey, rugby, or some other sport. Basketball is meant to flow smoothly. When a possession is stopped due to a foul, momentum is stopped so that everyone can stand around and watch one guy take a couple of uncontested shots. When calculating the z-score here (FOULSZ), I multiplied it by -1 because when a team fouls more than average (normally a positive z-score) they should actually be penalized instead of being rewarded.
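The per-stat z-scores, including the sign flip for fouls, could be sketched like this in Python. The team names and foul numbers are invented for illustration, `league_z_scores` is a hypothetical helper, and using the population standard deviation is an assumption on my part.

```python
import statistics

def league_z_scores(per_game, flip_sign=False):
    """Z-score of each team's per-game average against the league.

    flip_sign=True multiplies by -1, so an above-average total
    (such as fouls) counts against a team.
    """
    values = list(per_game.values())
    mean = statistics.fmean(values)
    std_dev = statistics.pstdev(values)  # assumption: population std dev
    sign = -1.0 if flip_sign else 1.0
    return {team: sign * (value - mean) / std_dev
            for team, value in per_game.items()}

# Invented numbers, not real data.
fouls_per_game = {"Team A": 15.0, "Team B": 18.0, "Team C": 21.0}
foulsz = league_z_scores(fouls_per_game, flip_sign=True)
for team, score in foulsz.items():
    print(team, round(score, 2))
```

Team A fouls the least, so it ends up with the positive score, while Team C gets penalized.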
The last part that I was able to include as of right now is a factor to account for being a top 25 team. The teams have worked hard to earn a ranking, so this is a way to account for other factors that make a team stand out to ESPN or other ranking systems (such as playing well as a team, a good recruiting class, or being from some place other than the west coast, which is a scoring point in ESPN's rankings). T25CP is calculated a little differently than the previous components. I found the z-score of each ranking (1-25), then multiplied that by -1 in order to give a higher z-score to lower numbers (higher rankings). I then took the cumulative distribution of that z-score and used the result as the T25CP for that ranking. All unranked teams were given a T25CP of 0.
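Here is a rough Python sketch of the T25CP steps as described above. The function name `t25cp` is hypothetical, and treating rankings 1 through 25 as the whole population (with a population standard deviation) is an assumption.

```python
from statistics import NormalDist, fmean, pstdev

RANKS = list(range(1, 26))  # rankings 1 through 25

def t25cp(rank):
    """Top 25 component: 0 for unranked teams (rank=None), otherwise
    the cumulative distribution of the sign-flipped ranking z-score."""
    if rank is None:
        return 0.0
    # Multiply by -1 so that a lower number (higher ranking) scores higher.
    z = -(rank - fmean(RANKS)) / pstdev(RANKS)
    return NormalDist().cdf(z)

for rank in (1, 13, 25, None):
    print(rank, round(t25cp(rank), 3))
```

Under these assumptions the middle ranking (13) comes out to exactly 0.5, the #1 team lands close to 1, and #25 lands just above the 0 given to unranked teams.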
The PEZ score of each team can be given by the formula:
PTSZ + 3PMZ + BLKZ + PACEZ + FOULSZ + T25CP
The mean of the results is 0. If the PEZ is high, you should watch the game; if the PEZ is low, try to avoid the game and just read the recap if you are interested.
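Putting the pieces together is just a sum. A minimal Python sketch, with invented component values for a made-up team (the `pez` function and the numbers are illustrative only):

```python
COMPONENTS = ("PTSZ", "3PMZ", "BLKZ", "PACEZ", "FOULSZ", "T25CP")

def pez(components):
    """Sum the six components into a PEZ score.

    FOULSZ is expected to already carry its -1 sign flip.
    """
    return sum(components[name] for name in COMPONENTS)

# Invented values for an unranked team, not real data.
example_team = {"PTSZ": 1.2, "3PMZ": 0.8, "BLKZ": -0.3,
                "PACEZ": 1.5, "FOULSZ": 0.4, "T25CP": 0.0}
print(round(pez(example_team), 1))  # 3.6
```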
Top 10 w/PEZ
Long Island 7
Bottom 10 w/PEZ
Wright State -5
Cal Poly -5
Central Mich -5
Southern Ill -5
Eastern Ill -6
Western Ill -6
Pac 12 w/PEZ and ranking
UW 8 3
WSU 3 47
UCLA 3 49
Arizona 3 52
Colorado 2 65
Utah 1 109
Cal 0 159
Oregon 0 167
ASU 0 195
Stanford -1 209
USC -1 225
Oregon State -1 235
Looking at the top 10
First, I was surprised to see VMI at the top of the list and I will definitely try to find some of their games to watch when I have some free time. Second, as much as it pains me to see UW highly ranked on any list, I am actually glad to see them high on the entertainment list because interesting teams bring in national exposure, which could help WSU. Third, I'm glad to see Duke, Kentucky, Kansas, Syracuse, and UNC on the list because it shows that I'm at least on the right path.
Looking at the bottom 10
Pretty much all that I can say is that if you go to one of the directional schools in Illinois, I'm sorry.
Looking at the Pac 12
Wow, I really need to get used to writing Pac 12. Anyways, four of the Pac 10 teams were pretty entertaining to watch while the rest were watchable at best. Colorado was just 0.403 PEZ points lower than Arizona, so next year might be bearable after all... Our other addition, Utah, will bring in some more excitement by finishing just inside the top third of schools.
Looking at the future
I am still trying to find a way to incorporate the chance of an upset, the chance of a blowout, and rivalry games. In addition to finishing the formula, I am going to figure out a way to predict the entertainment of upcoming games so that people can find some way to watch them online or on TV. This will hopefully turn into a weekly post of games to catch.
Please feel free to comment about any stat that you think might add to the entertainment of a game and if you enjoyed this, check out my other blog posts: CBBMetrics