We aim to release Standings by the end of every Tuesday.
In the 2018 Season, we have had 29 technical upsets in 118 ranked matches. This past Saturday, we had two technical upsets in nine ranked matches, one of which was an Overtime upset.
Since our League was founded in April of 2005, we’ve seen thousands of matches between schools. All 50 schools that have played in the NCDA have been campus recognized club sports and/or student organizations. Over the eight seasons of reliably recorded matches, starting with the 2011 Season on 2010-09-25, we’ve played 1368 ranked matches. 265 of those are considered technical upsets in the Gonzalez System (success rate of 80.63%). A technical upset is when a lower rated team defeats a higher rated team.
Roughly 220 box scores from Seasons 2005 to 2010 are recorded, but we know more matches went unrecorded; a rough estimate is about 300 matches played in the statistical dark ages from Seasons 2005 to 2010.
The Gonzalez System is a computer ranking model similar to Elo and is a rating exchange system based on research performed by World Rugby. It has been adapted by the NCDA to the demands of College Dodgeball, but can be tuned and customized endlessly to incorporate accurate data. It has been used to help determine seeds for the Nationals bracket since Nationals 2014, and was used exclusively for the Nationals 2017 bracket.
Technical Upset Spotlight
A technical upset in the Gonzalez System is when a lower rated team defeats a higher rated team. The overall success rate of the system is currently 80.63% based on 265 technical upsets in 1368 ranked matches.
This is an important measure for the primary application of the system: fairly determining the seeds used for the Nationals Tournament Bracket on Sunday. When the System was used exclusively for last year’s Nationals 2017, it was correct for 21 of 22 matches. The one technical upset happened between two closely rated teams: #17 rating def #16 rating (38.737 def 39.794, rating gap of only -1.057). This upset was well within the normal, statistically healthy technical upset range, which any system needs to allow in order to be properly adaptive.
Based on this measure, it is my opinion, and one I consider statistically reinforced, that the Nationals bracket was seeded fairly from top to bottom.
A more important measure is how often statistical outliers occur, the goal being to keep these events down in order to increase the predictability of the system, which again is important for a fairly seeded Nationals bracket. Had the upset been a Significant Upset, it would have presented a problem in the accuracy of the system when applied to its primary focus: again, a fairly seeded Nationals Bracket. A Significant Upset has a Rating Exchange which is greater than two standard deviations from the mean of the whole population of upset exchanges, a Technical Upset in the 95th percentile. Currently only 13 matches in 1368 have been Significant Upsets, and none have occurred at a Nationals. Interestingly, four of these thirteen have been upsets decided in Overtime.
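The success rates quoted above are simply the share of ranked matches that were not technical upsets. Here is a minimal Python sketch of that arithmetic; the function name `success_rate` is my own label for illustration and does not come from the Gonzalez System spec.

```python
def success_rate(ranked_matches: int, technical_upsets: int) -> float:
    """Percentage of ranked matches won by the higher rated team."""
    if ranked_matches <= 0:
        raise ValueError("need at least one ranked match")
    return 100 * (ranked_matches - technical_upsets) / ranked_matches


# Figures quoted in this write-up.
overall = success_rate(1368, 265)      # all reliably recorded seasons
nationals_2017 = success_rate(22, 1)   # 21 of 22 matches called correctly

print(f"Overall: {overall:.2f}%")
print(f"Nationals 2017: {nationals_2017:.2f}%")
```

The overall figure comes out to the 80.63% quoted above, and the Nationals 2017 bracket works out to roughly 95%.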
#2 – SVSU def GVSU 3-2 OT at 2016-10-29 Battle of the Valleys
#3 – WKU def SVSU 3-2 OT at 2013-02-16 BEAST II
#4 – BGSU def UK 3-2 OT at 2012-01-28 Kentucky Invitational
#12 – Ohio def Kent 3-2 OT at 2017-02-25 ODC
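The Significant Upset test described above can be sketched in Python as follows. This is an illustration under assumptions: I read "greater than two standard deviations from the mean" as mean plus two population standard deviations, the function name `significant_upsets` is hypothetical, and the sample exchanges are invented for the demo, not real match data.

```python
from statistics import mean, pstdev


def significant_upsets(exchanges):
    """Return (flagged, threshold) for a list of upset rating exchanges.

    An exchange is flagged when it exceeds the mean of all upset exchanges
    by more than two population standard deviations (assumed reading).
    """
    mu = mean(exchanges)
    sigma = pstdev(exchanges)  # population SD, since we hold every upset
    threshold = mu + 2 * sigma
    flagged = [x for x in exchanges if x > threshold]
    return flagged, threshold


# Invented sample: nine ordinary upsets and one large outlier.
sample = [1.05, 1.10, 1.14, 1.02, 1.20, 1.08, 1.12, 1.06, 1.15, 1.90]
flagged, threshold = significant_upsets(sample)
print(f"threshold = {threshold:.3f}, flagged = {flagged}")
```

In this made-up sample only the 1.90 exchange clears the threshold. Run against the real population of 265 upset exchanges, the same test would be expected to pick out the 13 Significant Upsets mentioned above.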
On to this past weekend’s [statistically normal] technical upsets:
UNL def Midland 3-2 OT
38.132 def 38.570+1, exchanging 0.572
The School from Lincoln nabs an overtime win, their first and only overtime game across their history. Meanwhile, the School from Fremont is still finding their statistical feet. We’ll consider Midland’s rating provisional until they have about six games under their belt. Even overlooking that measure, this technical upset falls as one of those rating adjustment matches. The message threads indicate we’ll see a rematch this upcoming weekend with the School from Lincoln playing host to the School from Fremont. That’ll be telling to see if newcomer Midland can shake things up in the State of Nebraska.
This OT ranks 31 of 45 technical upset overtimes, out of a total of 111 overtimes. In terms of normalized exchanges for technical upsets, it places 170 of 265 technical upsets, in the 35th percentile. A normalized exchange is a measure we use to compare technical upsets across the different factors that alter the end exchange, namely Overtime and/or the match occurring at a Nationals. As overtime halves the Rating Exchange and this match was during the regular season, this match would have exchanged 1.144 if decided in regulation: 1 - ((38.132 - 39.570) * 0.1) = 1.144, where 39.570 is Midland’s 38.570 rating plus the +1 home court advantage.
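The exchange arithmetic for this match can be sketched in Python. This is a hedged reading of the numbers in this write-up, not the official spec: the function name, the `home` parameter, and the constant names are my own, and the sketch leaves out the Nationals factor and any cap on large rating gaps because neither is spelled out here.

```python
HOME_COURT_BONUS = 1.0   # the "+1" shown next to the home team's rating
GAP_WEIGHT = 0.1         # multiplier applied to the rating gap
OVERTIME_FACTOR = 0.5    # overtime halves the Rating Exchange


def rating_exchange(winner: float, loser: float, *, home: str = "neutral",
                    overtime: bool = False) -> float:
    """Points the winner takes from the loser (illustrative sketch).

    home is "winner", "loser", or "neutral" (hypothetical parameter).
    """
    if home not in ("winner", "loser", "neutral"):
        raise ValueError("home must be 'winner', 'loser', or 'neutral'")
    if home == "winner":
        winner += HOME_COURT_BONUS
    elif home == "loser":
        loser += HOME_COURT_BONUS
    gap = winner - loser  # negative when the lower rated team wins
    exchange = 1.0 - GAP_WEIGHT * gap
    return exchange * OVERTIME_FACTOR if overtime else exchange


# UNL (38.132) def Midland (38.570, at home) in overtime.
actual = rating_exchange(38.132, 38.570, home="loser", overtime=True)
normalized = rating_exchange(38.132, 38.570, home="loser")
print(round(actual, 3), round(normalized, 3))
```

The first value is the halved overtime exchange of 0.572 and the second is the normalized, regulation-equivalent exchange of 1.144.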
SU def UVA 4-2
36.660 def 36.681+1, exchanging 1.102.
Without the +1 home court advantage, the rating gap for this match would have been almost even. Having defeated UVA on UVA’s home court, Stevenson gains a tenth of a point more in this Rating Exchange. Based on the results of the past few matches, Stevenson’s rating looks a tad undervalued: they are stronger than similarly rated UVA, but not strong enough to defeat higher rated Towson or VCU. This match ranks 200 of 265 technical upsets, in the 24th percentile.
Net Rating Changes
Rating Changes | Pre | Post | Change |
---|---|---|---|
UWP | 43.140 | 44.227 | 1.086 |
VCU | 40.649 | 41.632 | 0.983 |
Towson | 46.698 | 47.240 | 0.542 |
SU | 36.660 | 37.028 | 0.368 |
UNL | 38.132 | 38.147 | 0.016 |
Midland | 38.570 | 37.468 | -1.102 |
UVA | 37.463 | 35.569 | -1.893 |
UWP posted the top rating boost this weekend, but with two wins they only netted 1.086 (exchanging 0.556 with UNL and 0.530 with Midland). It could be a slight indicator that they are slowly pulling away from their “easy” to travel to neighbors, in a way similar to how JMU has done this season. Though it could also mean something else entirely. Either way, it’s just great to see the teams of the Midway Conference playing, traveling, and competing more. I think it’s a key to the growth of this Conference and the personal growth of the teams within.
Though Hunter may be covering some Chipotle meals across the League, let’s not overshadow that VCU netted the second largest rating gain of the weekend in their 2-1 posting. A TU / VCU rating gap of 5.267 is not insurmountable, and a VCU upset would have ranked #35 of 265 technical upsets, in the 87th percentile. That is, however, a fairly predictable favoring for Towson. For the moment Virginia Commonwealth can say they had a bigger day than Towson, who only netted 0.542 over three very predictable wins (exchanging 0.473, 0.059, 0.010 against those three lower rated teams). The next match between VCU and Towson will still feature a decent stake for Towson, but their rating is increasing beyond its neighbors.
For a second straight event, SU has posted a net rating increase. That’s what winning games will do for you. It isn’t much, but small increases are worth it as they enter a closely contested section of ratings. Block 37 contains 7 team ratings within a spread of just 0.857, a close game from top to bottom.
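The net changes in the table above are sums of the per-match exchanges, with the winner gaining what the loser gives up. Here is a small Python sketch that tallies the three Midway Conference matches quoted in this write-up; the tuple layout and variable names are my own, and the match order is not meant to reflect the actual schedule.

```python
from collections import defaultdict

# (winner, loser, exchange) as quoted in this write-up.
matches = [
    ("UNL", "Midland", 0.572),
    ("UWP", "UNL", 0.556),
    ("UWP", "Midland", 0.530),
]

net = defaultdict(float)
for winner, loser, exchange in matches:
    net[winner] += exchange   # winner gains the exchange
    net[loser] -= exchange    # loser gives up the same amount

# Print biggest gain first, matching the layout of the table above.
for team, change in sorted(net.items(), key=lambda kv: -kv[1]):
    print(f"{team:8s} {change:+.3f}")
```

The totals line up with the table: UWP at +1.086, UNL at +0.016, and Midland at -1.102. Because each exchange is added to one team and subtracted from the other, the three net changes sum to zero.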
Ratings, sorted.
Mov. | Rank | Rating | Team | W | L |
---|---|---|---|---|---|
— | 1 | 55.899 | CMU | 15 | 0 |
— | 2 | 53.235 | GVSU | 7 | 3 |
— | 3 | 48.483 | JMU | 7 | 2 |
— | 4 | 47.853 | BGSU | 11 | 4 |
— | 5 | 47.682 | Kent | 6 | 3 |
— | 6 | 47.240 | Towson | 14 | 3 |
— | 7 | 46.404 | SVSU | 3 | 6 |
— | 8 | 46.269 | UK | 4 | 1 |
↑ from 10 | 9 | 44.227 | UWP | 6 | 3 |
↓ from 9 | 10 | 43.868 | MSU | 2 | 6 |
↑ from 12 | 11 | 41.632 | VCU | 5 | 6 |
↓ from 11 | 12 | 41.525 | PSU | 3 | 3 |
— | 13 | 40.559 | Ohio | 7 | 5 |
— | 14 | 40.502 | UNT | 0 | 0 |
— | 15 | 40.139 | WKU | 0 | 2 |
— | 16 | 40.020* | ZAG | 1 | 1 |
— | 17 | 39.980* | OS | 1 | 1 |
— | 18 | 39.165 | UNG | 2 | 0 |
— | 19 | 38.586* | UWW | 0 | 0 |
↑ from 21 | 20 | 38.557* | NIU | 0 | 0 |
↑ from 22 | 21 | 38.404 | OSU | 1 | 6 |
↑ from 23 | 22 | 38.269* | SIUE | 0 | 2 |
↑ from 24 | 23 | 38.147 | UNL | 2 | 3 |
↑ from 25 | 24 | 37.885* | MC | 0 | 0 |
↑ from 26 | 25 | 37.803 | UMD | 1 | 8 |
↑ from 27 | 26 | 37.722* | Pitt | 0 | 0 |
↑ from 28 | 27 | 37.599 | DePaul | 2 | 9 |
↓ from 20 | 28 | 37.468* | Midland | 0 | 3 |
↑ from 30 | 29 | 37.295 | Miami | 4 | 3 |
↑ from 31 | 30 | 37.028 | SU | 3 | 8 |
↑ from 32 | 31 | 36.315 | Akron | 4 | 8 |
↑ from 33 | 32 | 36.169 | NSU | 0 | 0 |
↓ from 29 | 33 | 35.569 | UVA | 2 | 7 |
— | 34 | 35.442 | GSU | 0 | 2 |
— | 35 | 35.181 | CSU | 3 | 7 |
— | 36 | 34.449 | BW | 0 | 3 |
Movement as of 2017-11-07
* denotes a provisional rating (< 6 matches)
After the winter break, the * will also be applied to any team that has not played three games this season, the minimum needed to qualify for Nationals.
Also testing out the W/L columns, but this early into the season I don’t think they contribute useful information. Other systems can rely on W/L records better when every team plays the same number of games, but the very reason the Gonzalez System works so well for our particular needs is that it doesn’t rely on the same number of games to effectively evaluate Teams. I also think the columns detract from each team’s present Rating and clutter the user experience of viewing this chart.
12 above, 24 below. With new teams in the League, the League Mean Rating has fallen to 40.905 from 41.151 the previous month. This League Mean is a tangential measure of how we replace defunct teams across history. The more teams that play but fizzle out, the greater this number deviates from the Initial Rating of 40.000 that every team starts with. It will likely always teeter above 40 because the home court advantage modifier alters what is normally a “zero sum” rating exchange system. In any case, it’s always worth keeping an eye on and it of course divides the League in two, above and below the mean. Currently there are 12 teams above the mean and 24 below, with a rating spread of 21.449 between the max rating and the min rating. That is an indicator that the League still has a noticeable level of disparity if measured by normal distribution of Ratings.
And boy, the rating spread from ratings #13 to #36 is just 6.109, but the rating spread from #12 to the top rating is a huge 14.374 points. That is a noticeable level of disparity among the top rated teams, and an even greater argument for using team Ratings (55.899 vs 48.483) to evaluate teams instead of rigid Standing placements (#1 vs #3) which can greatly vary depending on which teams are being evaluated.
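The League Mean and the spreads discussed above can be recomputed from the sorted table. The Python sketch below uses the published three-decimal ratings and variable names of my own choosing. Because the table values are rounded, a recomputed spread can differ in the last digit from the figures quoted above, presumably because those were taken from unrounded ratings.

```python
from statistics import mean

# Ratings copied from the sorted table above, rank 1 through rank 36.
ratings = [
    55.899, 53.235, 48.483, 47.853, 47.682, 47.240, 46.404, 46.269, 44.227,
    43.868, 41.632, 41.525, 40.559, 40.502, 40.139, 40.020, 39.980, 39.165,
    38.586, 38.557, 38.404, 38.269, 38.147, 37.885, 37.803, 37.722, 37.599,
    37.468, 37.295, 37.028, 36.315, 36.169, 35.569, 35.442, 35.181, 34.449,
]

league_mean = mean(ratings)
above = sum(r > league_mean for r in ratings)   # teams above the mean
below = len(ratings) - above                    # everyone else

ordered = sorted(ratings, reverse=True)
full_spread = ordered[0] - ordered[-1]          # max rating minus min rating
top_spread = ordered[0] - ordered[above - 1]    # #1 down to last team above mean
bottom_spread = ordered[above] - ordered[-1]    # first team below mean to last

print(f"League Mean: {league_mean:.3f}  ({above} above, {below} below)")
print(f"Spread: full {full_spread:.3f}, top {top_spread:.3f}, "
      f"bottom {bottom_spread:.3f}")
```

The mean lands on 40.905 with a 12 and 24 split, and the top group's spread of 14.374 is more than twice the spread of the 24 teams below the mean.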
Your Input. Just as the system allows technical upsets to adjust ratings and improve the overall predictability, so too I welcome input in order to improve the system as a whole. What are your thoughts? Each week I try to write a bit about the system using the real examples we had occur that weekend. I also take special care to explain concepts as if new readers were coming into this article, or to point back to explanations in previous work. Feel free to drop me a message, I’m always happy to answer any questions. See you next week.
Please keep in mind that there is an Exec Board vote scheduled on the NCDA Ranking Algorithm, so the Gonzalez System as it appears before you today may or may not be the single system we use for Nationals 2018 at VCU. We’ll release information on the vote as soon as it comes up.
Records, Master Spreadsheet: 2005-Present
Records, Individual Docs: 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018
Systems: Gonzalez Current, Gonzalez Old, Perrone, Champ, Lieblich
Spec Document: Gonzalez System Spec Doc