Step 4 – passer ratings

So. Where do we start? We know that passing is important, but does it predict the future? Man…I hope so. That would save me some steps.

0.5. Just for reference, I’ve created a passer rating in the data that’s built as follows:

Top to bottom: perfect, pretty good, medium, poor, no attack/overpass, error

1. Let’s kick it off with what most coaches are drawn to – the classic correlation example

We filtered out teams that played in fewer than 50 sets (given our data) so as to avoid the skew that comes from teams with small sample sizes.

2. This is basically what correlations look like visualized. You can see that as passer rating increases, so too does Set Won%. The strength of this correlation is decent at 0.57, putting the R² at 0.33 (listed in the description of the trend line in the above chart).

In R, this is what the code I’m using looks like. Nothing too crazy.
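As a rough stand-in, here’s a minimal sketch of the filter-then-correlate step on made-up data. The data frame `teams` and its columns (`sets_played`, `avg_pass_rating`, `set_win_pct`) are placeholder names invented for illustration, and the simulated numbers say nothing about real teams.

```r
# Toy data only: every name and number here is hypothetical,
# not taken from the real dataset.
set.seed(1)
n_teams <- 40
teams <- data.frame(
  team = paste0("team_", seq_len(n_teams)),
  sets_played = sample(20:120, n_teams, replace = TRUE),
  avg_pass_rating = runif(n_teams, min = 1.8, max = 2.4)
)
# Simulated set win rate, loosely tied to passer rating plus noise
teams$set_win_pct <- 0.5 + 0.6 * (teams$avg_pass_rating - 2.1) +
  rnorm(n_teams, sd = 0.08)

# Drop teams that played fewer than 50 sets
big_sample <- teams[teams$sets_played >= 50, ]

# Pearson correlation between passer rating and set win rate
r <- cor(big_sample$avg_pass_rating, big_sample$set_win_pct)

# R-squared from a one-predictor linear fit (the trend line)
fit <- lm(set_win_pct ~ avg_pass_rating, data = big_sample)
r_squared <- summary(fit)$r.squared

print(c(r = r, r_squared = r_squared))
```

With a single predictor, the R² of the trend line is just the correlation squared, which is why the two numbers always travel together.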

3. Ok, conceded, passer rating has a relationship with a team’s ability to win sets, but of course we always heed the adage that “correlation does not prove causation” – so let’s see whether the difference in two teams’ average passer rating can explain the past.

4. So we build the model the same way we have in the past. We find the passer rating average of the two competing teams in the set, find out who won the set, then see if the difference in passer rating between teams 1 & 2 plays a substantial role in determining the outcome of the set.

5. From what we see in our results, our pseudo-R² is a measly 0.095, meaning our passer rating difference model only accounts for 9.5% of the variance in the actual results.
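For anyone who wants to poke at this kind of model, here’s a minimal sketch on simulated sets. Two assumptions to flag: the pseudo-R² computed below is McFadden’s version (the write-up above doesn’t say which flavor was used), and the columns `team1_pass` and `team2_pass` are hypothetical names. Only `passrating_diff` and `won_the_set` match names used elsewhere in this post.

```r
# Simulated sets only: the ratings and outcomes are made up.
set.seed(2)
n_sets <- 500
sets <- data.frame(
  team1_pass = rnorm(n_sets, mean = 2.1, sd = 0.2),  # hypothetical column
  team2_pass = rnorm(n_sets, mean = 2.1, sd = 0.2)   # hypothetical column
)

# Difference in average passer rating, team 1 minus team 2
sets$passrating_diff <- sets$team1_pass - sets$team2_pass

# Simulated outcome: 1 if team 1 won the set, 0 otherwise
sets$won_the_set <- rbinom(n_sets, size = 1,
                           prob = plogis(1.5 * sets$passrating_diff))

# Logistic regression of set outcome on the rating difference
model <- glm(won_the_set ~ passrating_diff, data = sets, family = binomial)

# Intercept-only model as the baseline
null_model <- glm(won_the_set ~ 1, data = sets, family = binomial)

# McFadden's pseudo-R-squared: 1 - logLik(model) / logLik(null)
pseudo_r2 <- 1 - as.numeric(logLik(model)) / as.numeric(logLik(null_model))
print(pseudo_r2)
```

One caveat on reading the number: a McFadden pseudo-R² measures improvement in log-likelihood over the baseline, so treating it as “variance explained” is a loose analogy rather than a literal one.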

6. So, not promising – but how well can we predict the future? If we knew the average passer ratings of each team, how psychic could we be?

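library(ggplot2)  # ggplot(), geom_point(), geom_smooth()
library(popbio)   # logi.hist.plot()

# Set outcome (0/1) against passer rating difference, with a fitted logistic curve
ggplot(predicted, aes(x = passrating_diff, y = won_the_set)) +
  geom_point() + geom_smooth(method = "glm",
    method.args = list(family = "binomial"), se = FALSE)
# Logistic curve again, with histograms of the two outcomes
logi.hist.plot(predicted$passrating_diff,
  predicted$won_the_set, boxp = FALSE, type = "hist", col = "gray")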

7. Not very psychic. Predicting the future is still difficult. Accuracy of 60.6%. A pseudo-R² of only 0.072, or 7.2% of variance explained.
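To show what an accuracy figure like that means mechanically, here’s a minimal sketch on simulated data. The 70/30 split, the 0.5 cutoff, and the simulated outcomes are all assumptions made for illustration. The post doesn’t spell out how the “past” and “future” sets were separated.

```r
# Simulated sets only: nothing here comes from the real dataset.
set.seed(3)
n_sets <- 1000
sets <- data.frame(passrating_diff = rnorm(n_sets, sd = 0.3))
sets$won_the_set <- rbinom(n_sets, size = 1,
                           prob = plogis(1.5 * sets$passrating_diff))

# Assumed split: first 70% of sets are the "past", the rest the "future"
train_idx <- seq_len(700)
train <- sets[train_idx, ]
test <- sets[-train_idx, ]

# Fit on the past only
model <- glm(won_the_set ~ passrating_diff, data = train, family = binomial)

# Predicted win probability for each future set
test$prob <- predict(model, newdata = test, type = "response")

# Call it a win when the predicted probability is above 0.5
test$predicted_win <- as.integer(test$prob > 0.5)

# Share of future sets where the call matched the actual result
accuracy <- mean(test$predicted_win == test$won_the_set)
print(accuracy)
```

Keep in mind that a coin flip already gets about 50% on a two-outcome problem, so accuracy is best judged against that baseline.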

8. Key question: are we evaluating passing correctly?

8a. I would argue, no.

9. Attacking uses the language of scoring points – while passing talks about the number of hitters available. Points are points, whereas “number of hitters available” has no distinct value when we use passer rating.

Big Moneyball guy over here, so here’s that quote:

DePodesta / Hill: There is an epidemic failure within the game to understand what is really happening. And this leads people who run major league baseball teams to misjudge their players and mismanage their teams.

Beane / Pitt: Go on.

DePodesta / Hill: Okay. People who run ball clubs, they think in terms of buying players. Your goal shouldn’t be to buy players, your goal should be to buy wins. And in order to buy wins, you need to buy runs. You’re trying to replace Johnny Damon. The Boston Red Sox see Johnny Damon and they see a star who’s worth seven and a half million dollars a year. When I see Johnny Damon, what I see is…is…an imperfect understanding of where runs come from. The guy’s got a great glove. He’s a decent leadoff hitter. He can steal bases. But is he worth the seven and a half million dollars a year that the Boston Red Sox are paying him? No. No. Baseball thinking is medieval. They are asking all the wrong questions. And if I say it to anybody, I’m, I’m ostracized. I’m-I’m-I’m a leper. So that’s why I’m-I’m cagey about this with you. That’s why I…I respect you, Mr. Beane, and if you want full disclosure, I think it’s a good thing that you got Damon off your payroll. I think it opens up all kinds of interesting possibilities.

https://www.youtube.com/watch?v=TpBcwGOvO80

10. So we need to speak about Serve Receive – and frankly, all other skills – in terms of what we actually care about: winning & losing points. What if we framed all contacts in this light?

Step 5.