Question from VolleyTalk

I got a message a couple of months ago and 100% failed to respond, so I’m trying to do it some justice now. The question is about FBSO Efficiency, Passer Ratings, and Winning.

*Because the question was in reference to passing, I eliminated missed serves from the analysis.

[Charts: opponents’ average passer rating (left) and opponent FBSO efficiency (right), 2017 Big Ten and Pac-12]

So here are the 2017 numbers for the Big10 and Pac12 (only looking at matches in which both teams were from one of those two conferences), sorted by their opponents’ average passer rating. You’ll notice Maryland, Northwestern, and Indiana all get their opponents in a little trouble, but because these teams struggle to capitalize on those advantageous situations, their opponents still FBSO at a pretty high efficiency, dropping them deep into the red on the right-side chart.

UCLA is on the other side of the coin – they don’t serve particularly tough, but their block/defense prowess holds opponents to a pretty decent 0.136 FBSO Eff. Nebraska and Stanford both serve tough, but also slow opponents with large blocks and scrappy defense, allowing them to rise into the top 2 of both statistics.

I would partially disagree with our VT friend that there is bias in the evaluation of passer ratings. Typically, yes, if you were to compare multiple coaches charting the same 20 passers, you’d likely end up with some different numbers – but because I’m using the VolleyMetrics codes for all these matches, we can assume a decent level of consistency.

The inherent reason to dislike passer rating is that, as previously stated in many, many posts here, the change in how likely your team is to win the point as you move from a 3 pass to a 2 pass to a 1 pass is not equidistant. Let’s just take FBSO. You might kill the ball at a 50% rate off a 3 pass, but only 38% off a 2 pass, and only 10% off a 1 pass. If we assume the 3-to-2-to-1 relationship is linear, we reward sporadic passers who may average a 2.0, but with passes of 3, 1, 3, 1, 3, 1, 3, 1. We’d much prefer a passer who always just passes a dead 2. Never 3s. Never 1s. Once you look at the value of each pass in terms of winning the point, the picture becomes much clearer.
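
To see the difference, here’s a minimal sketch in R using the hypothetical kill rates above (50% off a 3, 38% off a 2, 10% off a 1) – the passers and numbers are made up, but the point stands: two passers with identical 2.0 averages can be worth very different amounts to the offense.

```r
# Hypothetical FBSO kill rates by pass quality (from the example above)
kill_rate <- c("3" = 0.50, "2" = 0.38, "1" = 0.10)

steady  <- c(2, 2, 2, 2, 2, 2, 2, 2)   # always a dead 2
erratic <- c(3, 1, 3, 1, 3, 1, 3, 1)   # alternating 3s and 1s

# Identical on the linear 3-point scale...
mean(steady)   # 2.0
mean(erratic)  # 2.0

# ...but weighting each pass by its FBSO value tells a different story
mean(kill_rate[as.character(steady)])   # 0.38
mean(kill_rate[as.character(erratic)])  # 0.30
```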

Just a quick rundown on the VolleyMetrics codes: R# is a perfect pass, R+ is very good (within the 10-foot line), R! is a 2 pass, R- is a 1 pass, R/ means you didn’t get a swing (you freeballed it back or overpassed it), and R= means you got aced.

[Charts: opponents’ average passer rating and opponent FBSO efficiency, 2018 men]

Here’s the equivalent chart for the 2018 men (using only matches when two teams in this chart played each other).

Very similar to UCLA on the women’s side, Long Beach isn’t even the toughest serving team in terms of getting opposing passers in trouble, yet they easily limit opponents to the lowest FBSO Efficiency out of the Top 10 teams – likely due to their block & defense.

If you were to run correlations on a per-set and per-match basis, you’d see strikingly similar results for both the 2017 women and the 2018 men. The correlation between FBSO Eff in your team’s serve receive and your team’s likelihood of winning the set is 0.49. This rises to 0.59 when you look at the likelihood of winning the match.
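
For anyone who wants to replicate this, the calculation is just a correlation between a team’s serve-receive FBSO efficiency and a 0/1 win indicator – a sketch, assuming data frames with one row per team-set and per team-match (the names below are placeholders, not the actual data).

```r
# fbso_eff: the team's serve-receive FBSO efficiency; won: 1 if they won, 0 if not
cor(team_sets$fbso_eff, team_sets$won)        # ~0.49 per set in this data
cor(team_matches$fbso_eff, team_matches$won)  # ~0.59 per match
```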

While you may think the goal numbers are inherently different between the genders, for women, the top 5 teams in terms of FBSO Eff were Nebraska (0.274), Stanford (0.266), Penn State (0.265), Minnesota (0.255), and Wisconsin (0.238) – and on the men’s side it was Long Beach (0.278), Ohio State (0.257), UCLA (0.248), BYU (0.242), and Loyola (0.238). Pretty consistent between the two.


2018 Service Error (Men’s)

It’s been about a year since I posted, so I figured my unpaid sabbatical from this blog should come to an end soon. I snatched the 2018 men’s data, parsed it, and decided that the controversial (if you were to survey volleyball parents) topic of service error should be the first thing I dive into a little.

*all analysis will be done only using the top 10 teams from the final AVCA poll and will only include matches when a top10 team played another top10 team.*

Service Error % vs. Ratio of Won Matches Against Top10 Teams

We’ll start with Service Error and its relationship with Winning or Losing the Match. You’ll quickly notice that LBSU doesn’t lose a lot of these big matches – and that they, Hawai’i, and UC Irvine were relatively low in service error this season. But in the bigger picture, what you see is that a team’s service error percentage and its ability to win top10 matches are actually almost uncorrelated. Just looking at the cluster gathered around the 50%-of-matches-won level, you’ll notice huge swings in service error % for those teams – an indication that SE% really is hardly correlated with winning and losing in these matches for these teams.


And if we alter the analysis to look at Sets Won or Lost in a top10 vs. top10 matchup, the trend holds about the same.

Opponent FBSO Eff vs. SE % per Match

The next chart we want to look at is how your team’s Service Error %  (x-axis) affects your opponent’s FBSO (y-axis). This is probably the best way to get the most bang for your buck in analyzing your team’s serving ability. While of course we are failing to standardize the level of play (Long Beach plays more top10 teams than Lewis does) and take into consideration rotational advantages (Long Beach’s Josh Tuaniga serves with a strong trio of blockers in front of him), this method of analysis remains pretty solid overall. In the viz, the label is the team who is serving and the goal is to have your team appear lower on the y-axis, indicating that your opponent is struggling more in the FBSO phase.

*we use FBSO efficiency – (Team A 1st-ball sideouts – Team A 1st-ball errors)/(Team B total serves) – because a missed serve will be an automatic sideout, but a tough serve may force an attack error by the receiving team. It’s the same reason we like attack efficiency rather than Kill Percentage: we want both sides of the coin, positive and negative, since a point is scored each rally.
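
As a reference, here’s that formula as a tiny R function – the counts below are hypothetical; only the definition comes from above.

```r
# FBSO efficiency for the receiving team (Team A) against Team B's serves:
# (first-ball sideouts - first-ball errors) / Team B total serves
fbso_eff <- function(fbso_won, fbso_err, opp_serves) {
  (fbso_won - fbso_err) / opp_serves
}

# Missed serves count in fbso_won (automatic sideouts); aces and forced
# first-ball attack errors count in fbso_err
fbso_eff(fbso_won = 12, fbso_err = 5, opp_serves = 25)  # 0.28
```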

So what we see is that Long Beach (2018 Champs) consistently appears towards the bottom of the graph, indicating a strong ability to slow teams from siding out on the first ball. You’ll also see that their name doesn’t appear on the right half of the graph (higher service error) very often at all. Weird. UCLA, on the other hand, led by National Team Coach John Speraw, unapologetically takes the position that stronger serving is a must and that the errors that come with it are acceptable. Obviously this line of reasoning isn’t total bs, since the Bruins were the ones who almost defeated Long Beach in the national championship match. But we’ll spend more time looking at these two styles later on.

Opponent FBSO Eff vs. SE % per Set

Above is the same graph, except broken out by team into Service Error % for each set. Just as with the per-match data, there is only a weak positive relationship between an opponent’s FBSO eff and your team’s service error %. If you just look at the 20% SE line, there are 19 sets at exactly 20% SE with a wild range of FBSO Eff, from over .700 to below -.100. Because this spread holds at most values of service error, the correlation cannot be described as anything above weak.

Opponent FBSO Eff vs. SE % per Season by Player

Here are the 2018 servers on a top10 team (in matches against other top10 teams) who had at least 50 serves – making them regulars back at the service line.

Opponent FBSO Eff is again the metric we judge these players upon – and yes, as we mentioned previously, a player like Tuaniga might serve with DeFalco, Amado, and Ensing blocking so there may be some built-in bias…

Top servers will be lower on the chart and I’ll let you take a look on your own.

Opponent FBSO Eff vs. SE % per Match by Player

Opponent FBSO Eff vs. SE % per Match by Player (FBSO < .250)

Above are two charts, the top showing how players perform on a per-match basis (with 5 or more serves in the match). The bottom chart is just a zoomed-in view of performances that held an opponent to an FBSO Eff of .250 or lower while the specific player was serving.

[Charts: per-team server breakdowns, opponent FBSO Eff vs. SE %, eight of the top 10 teams]

Finally, here is a per-team breakdown of servers. The trend lines show the correlation between SE% and Opponent FBSO Eff (steeper meaning a more positive relationship). You’ll see the top servers on each team at the bottom again, as they hold opponents to the lowest FBSO Eff – regardless of their service error levels. These charts are for the entire 2018 season (in top10 vs. top10 matches). It becomes pretty clear for some servers why their FBSO eff numbers are so terrible, as they are consistently located in the 30+% service error realm. Even if a 30% SE player was at a 10% ace level (an elite level in these top10 matches), the opponent would still sit at .200 (.30 in automatic sideouts minus .10 in aces) before you even consider how often receiving teams FBSO on the serves that stay in play. Missing your serve at that rate just makes being effective super difficult.

The final two teams are listed below:

[Chart: UCLA vs. LBSU servers, opponent FBSO Eff vs. SE %]

Here’s the strategy we were discussing earlier, visualized. UCLA in blue, LBSU in gold. You’ll notice that, just on average, LBSU is getting teams into a worse position in terms of FBSO Eff. UCLA’s opponents average around the JT Hatch level of .400 FBSO Eff, partly due to the Bruins’ higher error from the service line. The interesting thing, though, is that we kind of get two clumps of servers, visualized below:

[Chart: two clusters of servers, opponent FBSO Eff vs. SE %]

Cluster 1 consists of high-error servers with an average opponent FBSO Eff around .420, while cluster 2 consists of lower-error servers revolving around the .300 level or so. Naturally, the majority of cluster 2 players are from Long Beach while cluster 1 primarily consists of UCLA kids. Yes, you could say that if Arnitz and Hess could just make more serves, the graph wouldn’t be skewed this way, but the reality is that only Micah Ma’a is in LBSU territory in terms of effectiveness.

So I’ll leave you to draw your own conclusions, but just know that Long Beach only hit Jump Floats 17% of the time, while UCLA hit them 23% of the time. So it’s not that UCLA is hitting 100% Jump Spin and therefore missing more often – the answer may lie elsewhere.

Possibly it’s LBSU changing the pace of their serves more than UCLA does? Perhaps their athletes are more competent at the skill, less afraid to fail, or better at managing a poor toss?

Personally, I think it’s because in these top10 matches Long Beach holds opponents to a 0.257 attack efficiency while UCLA holds them to 0.308. Fifty points of efficiency is a big difference. So yes, the two teams have differing serving styles, but I believe it’s the block/defense portion that leads to the even larger split in opponent FBSO Eff.

If anyone who actually reads these posts has any specific type of analysis you’d like to see presented, just shoot me a note!

MPSF Service Error (2017)

[Chart: opponent FBSO efficiency vs. service error % per set, MPSF 2017]

Been waiting to dive into this for a while now, so let’s get right to it.

What you see above is the Service Error % in a given set and the opponent’s FBSO efficiency: (opp. won in FBSO – opp. lost in FBSO)/total serves. The teams you see as labels are the serving teams, so the bottommost point (USC) means that USC missed around 4% of their serves in that set and held their opponent to -0.115 or so in FBSO. Pretty impressive.

As you’ll see, the blob of data certainly trends positively, indicating that higher service error is associated with, but does not necessarily cause, higher opponent FBSO eff. The R-squared of this trend line is only around 0.13, which is pretty mild. This suggests that you can be successful at limiting your opponent’s ability to FBSO even at a higher level of error (say 20-25%), as there are teams like UCLA (lowest UCLA dot) that missed around 24% of their serves in a set but still held their opponent to a negative FBSO eff.
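
If you want to reproduce that trend line, it’s a simple linear fit – a sketch assuming a data frame with one row per team-set; the column names are assumptions.

```r
# se_pct: serving team's service error % in the set
# opp_fbso_eff: the opponent's FBSO efficiency in that set
fit <- lm(opp_fbso_eff ~ se_pct, data = team_sets)
summary(fit)$r.squared  # ~0.13 here - a weak positive relationship
```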

[Chart: service error % distributions in won vs. lost sets, by team]

So the next question for me was: if service error doesn’t have a league-wide trend, does it help or hurt individual teams more than others? That’s what the above graphic helps drill into. As in previous charts, blue/teal indicates the team won the set and red means they lost. The curves are frequency distributions – meaning that for UC Irvine, won sets occurred most frequently in the narrow range of 18-23% service error, with fewer won sets outside that range, whereas the bulk of Stanford’s won sets fell in a wider range, from 5-20%, with only a few won sets outside it.

The hypothesis of the casual men’s volleyball observer might be that higher levels of service error would of course manifest more frequently in lost sets, yet what we see is that for most teams, it doesn’t make a difference in terms of winning/losing the set. The fact that these mounds essentially overlap one another for the majority of teams indicates that they miss approximately the same number of serves in won and lost sets.

[Effect sizes (Cohen’s d) for service error % in won vs. lost sets, by team]

There are, of course, a couple of outliers. Cal Baptist and Hawai’i both show large effect sizes for service error in won/lost sets. A negative Cohen’s d indicates an inverse relationship between service error % and winning the set: as one rises, the other falls. UCSD shows a medium-strength relationship between the variables, but you’ll notice that all the other teams, including top teams such as Long Beach, BYU, UCI, and UCLA, show small to negligible effect sizes for service error %.
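
For reference, here’s roughly how those effect sizes are computed – a sketch assuming you’ve split each team’s per-set service error % into won and lost sets (the data frame and columns are placeholders).

```r
# Cohen's d: (mean SE% in won sets - mean SE% in lost sets) / pooled SD.
# With this ordering, a negative d means the team misses fewer serves in sets it wins.
cohens_d <- function(won, lost) {
  n1 <- length(won); n2 <- length(lost)
  pooled_sd <- sqrt(((n1 - 1) * var(won) + (n2 - 1) * var(lost)) / (n1 + n2 - 2))
  (mean(won) - mean(lost)) / pooled_sd
}

cohens_d(won  = team_sets$se_pct[team_sets$won == 1],
         lost = team_sets$se_pct[team_sets$won == 0])
```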

So moving forward, unless you’re a fan of Cal Baptist (…unlikely) or Hawai’i (much more likely), don’t let those missed serves ruffle you. In the grand scheme of the set, remember that they’re likely negligible in terms of winning and losing.

Long Beach State In-System Atk (Men’s)

[Chart: in-system attack efficiency distributions by LBSU player, won vs. lost sets]

Just got access to the men’s side of the data so I’m still playing around with a few things.

What you see above is a data-driven visualization of what coaches might term “the key guy to stop.” In recent years with a team like BYU, the common phrase was “Taylor Sander is going to get his kills, let’s focus on the other guys.” So what I’ve built is essentially a histogram of each player’s in-system hitting efficiency in any set they appear in – with lost sets color-coded in red and won sets in…turquoise? *only sets in which a player has 3 or more in-system hitting attempts count – and players must appear in a minimum of 30 sets during the MPSF conference season to be counted.
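
For the curious, a chart along these lines can be sketched in ggplot2 – this assumes a data frame with one row per player-set and isn’t necessarily how the original was built.

```r
library(ggplot2)

# One row per player-set: player name, in-system attack efficiency, set result
ggplot(player_sets, aes(x = atk_eff, fill = result)) +
  geom_density(alpha = 0.5) +
  facet_wrap(~ player) +
  scale_fill_manual(values = c(lost = "red", won = "turquoise"))
```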

[Chart: Cohen’s d for LBSU hitters’ in-system efficiency in won vs. lost sets]

What you’ll notice from the visuals is that while TJ DeFalco certainly has an impact between won/lost sets, it’s actually Amir Lugo Rodriguez whose hitting efficiency carries the most weight. The likely reason is that if Amir gets going, LBSU can get pin hitters one-on-ones much more easily as opponents move to front the quicks. Another possibility is that this doesn’t actually prove causation – that Amir simply hits better when LBSU passes better. That’s fair, but again, we only include the data if Amir has at least 3 in-system attempts in the set.

I like this visual because it makes sense to look at – coaches can see the shift between won and lost sets, and including the actual Cohen’s d and magnitude levels supplies additional statistical weight. I’d like to use this approach more frequently moving forward – in both the men’s data from the spring and the women’s data coming in from the Big Ten and Pac 12 this fall once conference play kicks off.

Pac12 Pass Rating and W/L Sets

B1G Pass Rating in W/L sets

 

[Chart: Pac-12 pass rating distributions in won vs. lost sets, 2016]

Figured I would build a similar chart for the Pac12 from 2016. It includes all conference matches, not just the big ones. As with the Big Ten numbers, there are certainly some teams for which pass rating doesn’t necessarily differentiate won/lost sets. By including all sets, however, we may introduce more noise than we’d like – a good team may lower its performance to the level of its opponent and still win the set.

[Chart: correlations between set pass rating and winning the set, Pac-12]

 

Here’s the same descending correlation data from the Pac as well. Washington State clearly takes the cake – while both UCs seem to be unaffected by their ability to pass. Interesting.

[Chart: Big Ten pass rating distributions in won vs. lost sets, all conference matches]

 

I’ve included the full Big Ten data – with all conference matches included as well. Overall there’s nothing super shocking, though I did have a chuckle at the Rutgers spike. You’re welcome to deduce why their curve has such slim deviation.

Here again is the correlation data for the Big Ten when you include everything.

[Chart: correlations between set pass rating and winning the set, Big Ten, all conference matches]

Big Ten Pass Rating in W/L sets

[Chart: pass rating distributions in won vs. lost sets, top 9 Big Ten teams, 2016]

I’m not a huge fan of using pass ratings – I don’t believe they accurately value different reception qualities. That being said, every single person ever uses pass ratings, so I decided to dive into it a little. In the above chart, pass ratings are valued on a 3-point scale (R# = 3, R+ = 2.5, R! = 2, R- = 1, R/ = 0.5, R= = 0). Set data was only collected from the top 9 Big Ten teams in 2016 in matches in which they played one another (“big matches”). The average pass rating for the team in each set they played is compiled in these distributions – with WON sets in teal and LOST sets in red.
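
In R, that scale is just a lookup table keyed by the VolleyMetrics reception codes – the sample receptions below are made up.

```r
pass_value <- c("R#" = 3, "R+" = 2.5, "R!" = 2, "R-" = 1, "R/" = 0.5, "R=" = 0)

# Average pass rating for one hypothetical set's worth of receptions
set_codes <- c("R#", "R!", "R-", "R+", "R#", "R=")
mean(pass_value[set_codes])  # ~1.92
```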

What you’ll notice is that for some teams in these big matches, their pass rating in the set really has no bearing on winning the set. Wisconsin is a pretty good example of this as their distributions basically overlap one another completely. To be fair, this may be an anomaly due to Carlini’s ability to turn 1’s into 2’s and 2’s into 3’s (by the transitive Carlini property, 1 passes equal 3 passes, boom!).

On a similar note, Michigan and Michigan State suffer from this same phenomenon in that if you handed Rosen or George their team’s pass rating for any given set, they would essentially be guessing whether they won or lost the set. On the other hand, if you looked at Minnesota or Nebraska, you’d have a much better chance of guessing correctly, given the pass rating in the set.

[Chart: descending correlations between set pass rating and winning the set]

Above are the descending correlations between set pass rating and set won/lost. Again, these are only from “big matches,” which may skew the results – yet at the end of the day, Hugh/John/Russ etc. are game-planning to beat the best. But what you’ll see is that for some teams, the statistic of pass rating is relevant to winning and losing sets. For others, there’s likely something else driving winning. My goal is to continue to poke around to find the unique fingerprint for each team and what drives or hinders their success.

 

Receive Eff in Big Ten

[Chart: receive efficiency distributions by passer, Big Ten]

Similar idea – just messing around with the ggjoy package in R.

What you see above is the receive efficiency (basically passer ratings valued by Big Ten standards for FBSO eff). I filtered out a bunch of names that failed to consistently post high values – as well as those sets in which the passer received fewer than 4 serves. Players are ordered from top to bottom by increasing average receive eff overall (yes, I did this backwards – Kelsey Wicinski is our top performer here).
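
A ridgeline chart like this can be sketched with ggjoy roughly as follows – the data frame and column names are assumptions, not the actual code behind the plot.

```r
library(ggplot2)
library(ggjoy)

# One row per passer-set: passer name and receive efficiency in that set,
# after dropping sets in which the passer received fewer than 4 serves
ggplot(passer_sets, aes(x = receive_eff, y = reorder(passer, receive_eff))) +
  geom_joy() +
  labs(x = "Receive efficiency in the set", y = NULL)
```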

Similar to PS% performances, you’ll notice that the better passers (towards the bottom) have shorter and shorter tails off to the left indicating fewer sets of poor passing (as you’d expect). Nothing crazy to report here, just cool to visualize in this fashion.