2018 Service Error (Men’s)

It’s been about a year since I posted, so I figured my unpaid sabbatical from this blog should come to an end soon. I snatched the 2018 men’s data, parsed it, and decided that the controversial (if you were to survey volleyball parents) topic of service error should be the first thing I dive into a little.

*all analysis will be done only using the top 10 teams from the final AVCA poll and will only include matches when a top10 team played another top10 team.*

Service Error % vs. Ratio of Won Matches Against Top10 Teams

We’ll start with Service Error and its relationship to winning or losing the match. You’ll quickly notice that LBSU doesn’t lose a lot of these big matches – and that they, Hawai’i, and UC Irvine were relatively low in service error this season. But in the bigger picture, what you see is that a team’s service error percentage and its ability to win top10 matches are almost uncorrelated. Just looking at the chunk gathered around the 50%-of-matches-won level, you’ll notice huge swings in service error % for those teams – an indication that SE% is hardly correlated with winning and losing in these matches for these teams.

[chart: Service Error % vs. ratio of matches won against top10 teams]

And if we alter the analysis to look at Sets Won or Lost in a top10 vs. top10 matchup, the trend holds about the same.

Opponent FBSO Eff vs. SE % per Match

The next chart we want to look at is how your team’s Service Error % (x-axis) affects your opponent’s FBSO (y-axis). This is probably the best way to get the most bang for your buck in analyzing your team’s serving ability. While of course we are failing to standardize the level of play (Long Beach plays more top10 teams than Lewis does) and take into consideration rotational advantages (Long Beach’s Josh Tuaniga serves with a strong trio of blockers in front of him), this method of analysis remains pretty solid overall. In the viz, the label is the team that is serving, and the goal is to have your team appear lower on the y-axis, indicating that your opponent is struggling more in the FBSO phase.

*we use FBSO efficiency – (Team A 1st ball sideouts – Team A 1st ball errors)/(Team B total serves) – because a missed serve is an automatic sideout, but a tough serve may force an attack error by the receiving team. It’s the same reason we like attack efficiency rather than Kill Percentage: we want both sides of the coin, positive and negative, since a point is scored each rally.*
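As a concrete sketch of that formula (Python here purely for illustration; the counts are hypothetical):

```python
def fbso_eff(first_ball_sideouts, first_ball_errors, opp_total_serves):
    """FBSO efficiency: (1st-ball sideouts - 1st-ball errors) / opponent's total serves."""
    return (first_ball_sideouts - first_ball_errors) / opp_total_serves

# Hypothetical set: 10 first-ball sideouts and 4 first-ball attack errors
# on 20 serves received
print(round(fbso_eff(10, 4, 20), 3))  # 0.3
```

A missed serve counts as a first-ball sideout for the receiver, which is exactly why service error shows up in the opponent's FBSO number.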

So what we see is that Long Beach (2018 Champs) consistently appears towards the bottom of the graph, indicating a strong ability to slow teams from siding out on the first ball. You’ll also see that their name doesn’t appear on the right half of the graph (higher service error) very often at all. Weird. UCLA on the other hand, captained by National Team Coach John Speraw, unapologetically takes the position that stronger serving is a must and the errors that come with it are acceptable. Obviously this line of reasoning isn’t total bs since the Bruins were the ones who almost defeated Long Beach in the national championship match. But we’ll spend more time looking at these two styles later on.

Opponent FBSO Eff vs. SE % per Set

Above is the same graph except broken out into Service Error% per team for each set. Just as with the per match data, there is a weak positive relationship between an opponent’s FBSO eff and your team’s service error %. If you just look at the 20% SE line, you can see that there are 19 sets alone that had 20% SE and have a wild range of FBSO Eff from over .700 to below -.100. Because this trend holds at most values of service error, the correlation cannot be described as anything above weak.

Opponent FBSO Eff vs. SE % per Season by Player

Here are the 2018 servers on top10 teams (in matches against other top10 teams) who had at least 50 serves – making them regulars back at the service line.

Opponent FBSO Eff is again the metric we judge these players upon – and yes, as we mentioned previously, a player like Tuaniga might serve with DeFalco, Amado, and Ensing blocking so there may be some built-in bias…

Top servers will be lower on the chart and I’ll let you take a look on your own.

Opponent FBSO Eff vs. SE % per Match by Player

Opponent FBSO Eff vs. SE % per Match by Player (FBSO < .250)

Above are two charts, the top showing how players perform on a per-match basis (with 5 or more serves in the match). The bottom chart is a zoomed-in view of performances that held an opponent to an FBSO Eff of .250 or lower while that player was serving.

[charts: per-team server breakdowns for the top 8 teams]

Finally, here is a per-team breakdown of servers. The trend lines show the correlation between SE% and Opponent FBSO Eff (the more vertical, the more positive the relationship). You’ll see the top servers on each team at the bottom again, as they hold opponents to the lowest FBSO Eff – regardless of their service error levels. These charts are for the entire 2018 season (in top10 vs. top10 matches). It becomes pretty clear for some servers why their FBSO Eff numbers are so terrible, as they consistently live in the 30+% service error realm. Even if a 30% SE player served aces at a 10% rate (an elite level in these top10 matches), the opponent would still be at .200 before you even consider how often receiving teams FBSO anyway. Missing your serve at that rate just makes being effective super difficult.
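That back-of-the-envelope math can be checked directly. A minimal sketch (the function name is mine) of the floor that terminal serves alone put under an opponent's FBSO Eff:

```python
def fbso_floor(se_rate, ace_rate):
    """Opponent FBSO Eff from terminal serves alone: every missed serve is an
    automatic sideout (+1 for the receiver), every ace is -1, before any
    in-play first-ball attacks are counted."""
    return se_rate - ace_rate

# 30% service error against an elite 10% ace rate still leaves
# the opponent at .200 before they attack a single ball
print(round(fbso_floor(0.30, 0.10), 3))  # 0.2
```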

The final two teams are listed below:

[chart: UCLA vs. LBSU servers – Opponent FBSO Eff vs. SE%]

Here’s the strategy we were discussing earlier, visualized. UCLA in blue, LBSU in gold. You’ll notice that just on average, LBSU is getting teams into a worse position in terms of FBSO Eff. UCLA’s opponents average around the JT Hatch level of .400 FBSO Eff, partly due to the higher error from the service line from the Bruins. The interesting thing though is that we kind of get two clumps of servers, visualized below:

[chart: the same UCLA/LBSU servers, grouped into two clusters]

Cluster 1 consists of high-error servers with an average FBSO Eff around .420, while cluster 2 consists of lower-error servers revolving around the .300 level or so. Naturally, the majority of cluster 2 players are from Long Beach while cluster 1 primarily consists of UCLA kids. Yes, you could say that if Arnitz and Hess could just make more serves, the graph wouldn’t be skewed this way, but the reality is that only Micah Ma’a is in LBSU territory in terms of effectiveness.

So I’ll leave you to draw your own conclusions, but just know that Long Beach only hit Jump Floats 17% of the time, while UCLA hit them 23% of the time. So it’s not that UCLA is hitting 100% Jump Spin and therefore missing more often – the answer may lie elsewhere.

Perhaps LBSU changes the pace of its serves more than UCLA does? Perhaps their athletes are more competent in the skill, or less afraid to fail, or can manage a poor toss better?

Personally, I think it’s because in these top10 matches Long Beach holds opponents to a 0.257 attack efficiency while UCLA holds them to 0.308. 50 points is a big difference. So yes, the two teams have differing styles of serving, but I believe it’s the block/defense portion that leads to the even larger split in opponent FBSO Eff.

If anyone who actually reads these posts has any specific type of analysis you’d like to see presented, just shoot me a note!


MPSF Service Error (2017)

[chart: Opponent FBSO Eff vs. Service Error % per set, MPSF 2017]

Been waiting to dive into this for a while now, so let’s get right to it.

What you see above is the Service Error % in a given set and the opponent’s FBSO efficiency: (opp. won in FBSO – opp. lost in FBSO)/total serves. The teams you see as labels are the serving teams, so the bottommost point (USC) means that USC missed around 4% of their serves in that set and held their opponent to around -0.115 in FBSO. Pretty impressive.

As you’ll see, the blob of data certainly trends positively, indicating that higher service error is associated with, but does not necessarily cause higher opponent FBSO eff. The R-squared of this trend line is only around 0.13, which is pretty mild. This would suggest that you can be successful at limiting your opponent’s ability to FBSO, even at a higher level of error (say 20-25%), as there are teams like UCLA (lowest UCLA dot) who missed around 24% of their serves in the set, but still held their opponent to a negative FBSO eff.

[chart: service error distributions in won vs. lost sets, by team]

So the next question for me was: if service error doesn’t have a league-wide trend, does it help/hurt some teams more than others? That’s what the above graphic helps drill into. Similar to previous charts, blue/teal indicates the team won the set, red means they lost. The curves are frequency distributions – meaning that for UC Irvine, the highest frequency of won sets occurred in the narrow range of 18-23% service error, with fewer won sets outside this range – whereas the bulk of Stanford’s won sets occurred across a wider range, from 5-20%, with only a few sets won outside it.

The hypothesis of the casual men’s volleyball observer might be that higher levels of service error would of course manifest more frequently in lost sets, yet what we see is that for most teams, it doesn’t make a difference in terms of winning/losing the set. The fact that these mounds essentially overlap one another for the majority of teams indicates that they miss approximately the same number of serves in won and lost sets.

[table: effect sizes (Cohen’s d) of service error in won vs. lost sets, by team]

There are of course a couple outliers. Cal Baptist and Hawai’i both show large effect sizes of service error in won/lost sets. A negative Cohen’s d indicates an inverse relationship between service error % and winning the set: as one rises, the other falls. UCSD shows a medium-strength relationship between the variables, but you’ll notice that all the other teams, including top teams such as Long Beach, BYU, UCI, and UCLA, show small to negligible effect sizes for service error %.
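For readers who want to reproduce the effect sizes, here's a minimal Cohen's d sketch using a pooled standard deviation (the per-set SE% samples are made up for illustration):

```python
import statistics

def cohens_d(a, b):
    """Cohen's d between two samples, using a pooled standard deviation."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_sd = (((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd

# Hypothetical per-set service error % for one team, split by set result;
# higher error in lost sets yields a negative d (won-set mean < lost-set mean)
se_won  = [0.10, 0.12, 0.15, 0.11, 0.14]
se_lost = [0.20, 0.18, 0.22, 0.19, 0.21]
print(round(cohens_d(se_won, se_lost), 2))
```

Teams whose won-set and lost-set SE% distributions overlap almost entirely land near d = 0, which is what most of the table shows.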

So moving forward, unless you’re a fan of Cal Baptist (…unlikely) or Hawai’i (much more likely), don’t let those missed serves ruffle you. In the grand scheme of the set, remember that they’re likely negligible in terms of winning and losing.

Point Scoring% in the Big Ten

[chart: Point Scoring % distributions per set, by Big Ten team]

Not a sexy topic, but I just figured out how to do these ‘joy division’ charts in R so I’m kinda pumped to share.

What you see is a histogram of each team’s point scoring % in every individual set they played (only against the teams you see listed, so Purdue v. OSU but not Purdue v. Rutgers).

They’re ordered in ascending fashion by their average PS% in these sets. Something which interested me was the shape of top vs. medium teams. Nebraska and Minnesota seem pretty consistent set to set in how they PS – yet as you work down the chart, you’ll notice some teams flatten out or even have multiple peaks. The latter is especially comical because teams in the middle of the Big Ten could often be described as “dangerous” – sometimes they’re red hot and other times they’re pretty self-destructive. Multiple peaks would certainly play into this narrative and I would be interested to see if other metrics manifest in these patterns, specifically amongst the middle teams in the conference.

And to answer the question nobody asked, yes, Nebraska had a single set where they point scored at 0% (OSU set4) and one where they PS’d at 73% (PSU set5) – that’s why those outliers give the Nebraska chart wings.
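For anyone rebuilding these distributions, here's a minimal sketch of the underlying per-set stat. I'm assuming the standard definition of point scoring % (points won while serving divided by total serves), and all the numbers are made up:

```python
# Hypothetical per-set (points won on serve, total serves) for two teams
sets_played = {
    "Nebraska":  [(9, 20), (11, 22), (0, 18)],   # a 0% set like the OSU outlier
    "Minnesota": [(8, 21), (10, 23), (9, 20)],
}

for team, results in sets_played.items():
    per_set = [won / serves for won, serves in results]
    avg = sum(per_set) / len(per_set)
    print(team, [round(p, 2) for p in per_set], "avg:", round(avg, 3))
```

Each team's list of per-set values is what feeds one ridge of the ridgeline ("joy division") chart.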

Quick thoughts; serving

Was just messing around with some numbers this afternoon and wanted to share.

I looked at a few things related to serving, specifically serve error%, point score%, and serve output efficiency. I ran some correlations between these stats and themselves as well as with winning the set overall.

As with my last post, I'm only using data from the top 9 in the Big Ten from 2016 so the calculated efficiencies are based on these matches alone.

Serve error% and winning the set came out to -0.150, pretty weak – and a disappointment to parents and fans everywhere who'd like nothing more than for you to quit missing your damn serves.

Winning the set and serve output eff (like pass rating, but using the actual efficiencies of each possible serve outcome) clocked in at 0.323.

And serve error% and serve output eff correlated at -0.546, the highest result I found. This seems to reiterate that terminal contacts skew performance ratings. So quit missing your damn serve! But at the same time, it's unlikely that missed serves alone will be to blame for losing a set.

Point score% and serve output eff came in at 0.474, which makes a lot of sense – it would be interesting to see if serve output eff is the largest factor in whether you point score or not.

Finally, because everyone likes service errors, I did SE% and point score% which resulted in -0.220. Again, pretty mild – suggesting that while the association is negative, as we'd expect, teams can still point score well even if they're missing some serves.
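If you want to reproduce numbers like these, a plain Pearson correlation (which, against a 0/1 won-the-set variable, is the point-biserial correlation) is all that's needed. A sketch with made-up per-set values:

```python
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-set serve error % and set result (1 = won, 0 = lost)
se_pct  = [0.10, 0.25, 0.15, 0.30, 0.12, 0.20]
set_won = [1, 0, 1, 0, 1, 1]
print(round(pearson_r(se_pct, set_won), 3))
```

The toy data above is deliberately clean, so it comes out far more negative than the real -0.150; with actual match data the noise swamps most of the signal.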

Anyway, just wanted to jot these numbers down before they get lost in a notebook somewhere.

Attackers’ Trends + Visualizing Development

[chart: attacker Output Eff trends over the 2016 season]

Here is how four of the key outsides on the top teams in the Big Ten looked from the start of conference play until their respective seasons ended. Output Efficiencies are calculated using data from both the Big Ten and Pac-12 2016 seasons and look not only at kills/errors/attempts, but also at the value of non-terminal swings. In this case, OutputEff differentiates between a perfect dig by the opponent and a perfect cover by the attacking team – or a poor dig versus a great block touch by the opponent – etc. In this sense it’s better than traditional “true efficiency” in that it’s not just about how well your opponent attacks back after you attack – it also appropriately weights different block touch, dig, and cover qualities by their league-average value.
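As a rough sketch of how such an output efficiency can be computed: the outcome labels and point values below are invented for illustration, standing in for the actual league-derived weights.

```python
# Invented outcome values: terminal contacts are +/-1, and each non-terminal
# outcome gets an assumed league-average expected point value (not real weights)
OUTCOME_VALUE = {
    "kill": 1.0,
    "error": -1.0,
    "blocked": -1.0,
    "perfect_dig_by_opp": -0.35,
    "poor_dig_by_opp": 0.20,
    "good_cover_by_attackers": 0.10,
}

def output_eff(counts):
    """Weighted average value per swing: sum(count * value) / total swings."""
    total_swings = sum(counts.values())
    weighted = sum(OUTCOME_VALUE[o] * n for o, n in counts.items())
    return weighted / total_swings

# One hypothetical match line for an attacker (30 swings)
swings = {"kill": 12, "error": 3, "blocked": 2, "perfect_dig_by_opp": 8,
          "poor_dig_by_opp": 4, "good_cover_by_attackers": 1}
print(round(output_eff(swings), 3))
```

Traditional efficiency would score all 13 non-terminal swings above as zero; the weighted version credits the attacker for forcing poor digs and penalizes swings the opponent handled perfectly.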

What you see above are the trends of these outsides over the course of the season. Foecke continuously improves as the season goes on, as does Haggerty for Wisconsin. Frantti is interesting in that she actually declines up until early November, then turns it on as PSU approaches tournament time. Classic Penn State. If Wilhite hadn’t hit over “.600” early in the season, she wouldn’t look like she’s trending down – but you have to keep in mind that her average (just north of .300) kinda blows people out of the water when you look at her consistency.

Personally, while I think this type of stuff is mildly interesting and you can definitely spin a story out of it, it’s not actionable in the sense that it’s going to help a coach make a better decision. However, this same principle could and probably should be applied on an in-season basis to look deeper at the development of players and specific skills. For example, high ball attacking:

[chart: high ball attacking Output Eff over time]

You could build something like this for every day in practice. If your goal is to pass better, cool – let’s take your data from practice, graph it for the week, and see if whatever changes we’re trying to implement have had their desired effect. Or let’s see if the team is improving as a whole as we make these specific changes:

[chart: Minnesota team passing Output Eff by date]

*the asterisk on 10/29 is because volleymetrics coded both MN matches from that week on the same day, so the date on the file for both says 10/29. That’s why we use Avg. Output Eff.*

Anyway, there are thousands of ways to implement something like this – and then turn it into something digestible and actionable for the coaching staff.

Which type of serve is best?

Bears. Beets. Battlestar Galactica.

[chart: distribution of per-match serving performances, by serve type]

What you see above is the distribution of serving performances per player per match, broken down by type of serve. This chart is built using Big Ten and Pac 12 conference matches and serving performances with fewer than 5 serves in a match were excluded. 1st ball point score efficiency is the serving team’s wins minus losses when defending the reception + attack of their opponent. It’s basically FBSO eff from the standpoint of the serving team, which is why most of the efficiencies are negative, as the serving team is more likely to lose on that first attack after they serve.

You’ll see from the viz that the natural midpoint for all types of serves is around -.250. So the argument then becomes, well if you’re going to average about the same result regardless of what serve you hit, what does it matter? What matters here is the deviation from the mean. If you look at jump floats, it looks like the classic bell-shaped normal distribution graph and if you searched for specific players, you could see how their performances shake out relative to the average of the two leagues. If a player consistently fell below this average, maybe it’s time to develop a new serve or dive deeper into her poor performance.

Jump serving, as you might expect, definitely has a good percentage of players with performances above the mean. However, there’s also a wider distribution in general, and because of this (likely due to increased service error when jump serving) many performances fall far short of league averages. The takeaway here is that while it can be beneficial, the larger standard deviation means you might only want to be jump serving if you need to take a chance against a stronger team.

Standing floats are interesting. Close and far just indicate where the server starts, relative to the endline. Molly Haggerty with Wisconsin hits a “far” standing float while Kathryn Plummer out of Stanford hits her standing float just inches from the endline. Not only is the average for standing floats farther from the endline a little higher (-.243) than standing floats from close to the endline (-.257) but as you can see from the chart, these far away floats are more narrowly distributed, indicating more consistent performance.

While jump floats have the highest average (-.229) and jump serves (-.264) may provide the appropriate risk-reward for some servers, it may actually be these standing float serves from long distance that provide a great alternative if you have a player lacking a nicely developed, above-average serve.

False. Black bear.

Top Servers in the Big Ten

[chart: top Big Ten servers – serve value vs. Service Error %]

Coming back from the SSAC in Boston this past week, I’ve been putting more thought into evaluating players against the market they’re situated in – much like how baseball uses WAR (wins above replacement) to compare a player’s value against that of an average MLB player. That stat has of course evolved over the years, with different measurements for position players and pitchers, but the underlying principle has remained constant.

Looking at volleyball, there are 6 (7 if you count freeball passing) discrete skills so a single skill WAR metric makes a little less sense, but the general philosophy can be applied as a way to compare performance against league expectancies.

So in the above viz, I’ve used the league average PS Eff for each of the receive qualities. Which looks like this table below:

[table: league-average PS Eff by serve outcome]

Service Ace on the top, working down to service error at the bottom. And yes. Service ace should absolutely be at 1.0 and I’m not sure why it isn’t, but .998 and 1.0 are pretty darn close for our purposes at the moment.

Using these numbers, we then look at the frequencies a player served and got each of these specific outcomes. Multiply frequencies by efficiencies, add them up, and divide by the number of serve attempts and voilà!!
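A minimal sketch of that calculation: the outcome labels and efficiency values below are placeholders standing in for the league-average table above, not the real numbers.

```python
# Assumed league-average PS Eff per serve outcome (placeholder values,
# ordered ace at the top down to service error, mirroring the table)
PS_EFF = {
    "ace": 0.998,
    "poor_pass": 0.35,
    "ok_pass": 0.12,
    "good_pass": 0.02,
    "perfect_pass": -0.10,
    "error": -1.0,
}

def server_value(outcome_counts):
    """Multiply each outcome's frequency by its league-average efficiency,
    sum them up, and divide by total serve attempts."""
    attempts = sum(outcome_counts.values())
    total = sum(PS_EFF[o] * n for o, n in outcome_counts.items())
    return total / attempts

# One hypothetical server's season line (100 attempts)
season = {"ace": 10, "poor_pass": 20, "ok_pass": 25, "good_pass": 15,
          "perfect_pass": 20, "error": 10}
print(round(server_value(season), 4))
```

A server who never misses but always yields perfect passes would grade out negative here, while a high-ace, high-error server can still net positive – which is exactly the trade-off the viz is probing.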

I’ve built this viz to again look at the relationship to service error percentage (while highlighting the top servers). You’ll notice there’s a slight negative relationship between effective serving and lower service error, but it’s not definite. Especially when looking at the servers who bring the most value, there’s certainly a range of error in that group – and almost a correlation of 0 if you draw a box between SSS, Davis, Swackenberg, and Kranda.

However in a general sense, you can assess the value each server brings to the table based on what her results are worth against the league average. In this case, Kranda comes out on top as giving you the best shot to point score.

Clever folks might be wondering what her breakdown looks like in terms of percentages of each outcome. So voilà again!

[table: top 4 servers’ breakdown by serve outcome]

^ Here are the top 4 servers’ breakdown by each of the outcomes.

What you’ll notice is that they all have a unique footprint. Kranda makes her money by serving aces (around 14%) whereas SSS lives in the consistently good realm. SSS only misses 2% of her serves. That’s a huge deal. She keeps consistent pressure and even though her sum total of ok+good+perfect passes is higher than the others, she doesn’t give up free points, which results in her being the 3rd best server in the Big Ten in 2016.

I’m just starting to look at the data from a “what’s it worth relative to the league” type of standpoint, so I’ll likely have more posts like this soon. Previously, I’ve focused more on “what’s a player worth to her team” and specifically “what’s a player worth to her team, in this context, when playing opponent X.” I think the way I’ve approached this previously has merit, especially since we don’t have a marketplace for trading players like professional teams do – but you could easily evaluate All-Americans and other interesting things by comparing players to league data.