THN Analytics: An Introduction
Jim Corsi (Bill Wippert/Getty Images)
Need a primer on the growing world of hockey advanced statistics? Look no further than this easy-to-understand breakdown of all the key terms and concepts.
When Tim Barnes made the argument in 2009 that “possession is everything,” he wasn’t announcing a unique conclusion that emerged from decades in a stats lab – he was quoting Mike Babcock, who was inspired by Scotty Bowman, who worked closely with Roger Neilson, who was inspired by Harry Sinden. Like any other hockey fan, Barnes looked to the great leaders and innovators of the past for ways to learn more about the game, and in sharing his discoveries on his blog, Irreverent Oilers Fans, he helped spur the modern hockey stats movement. Barnes’s effort to quantify possession led him to a measure Buffalo Sabres goaltending coach Jim Corsi used to assess his players’ workloads (counting not just shots against but also misses and blocked shots); ironically, Corsi had never intended it for anything but goaltending. Regardless, Barnes (an engineer by trade) used statistical testing to find that, of all the widely available statistics, Corsi’s measure most closely mirrored possession.
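As a rough illustration of the counting behind Corsi’s measure, here is a minimal Python sketch. The `ShotCounts` type, its field names, and the sample numbers are assumptions made up for this example, not part of any official data feed. Expressing the count as a share of all attempts in a game is one common convention and is likewise an assumption here, not something spelled out above.

```python
from dataclasses import dataclass


@dataclass
class ShotCounts:
    """Shot attempts by one team in a game (hypothetical field names)."""
    on_goal: int   # attempts that reached the goaltender (saves plus goals)
    missed: int    # attempts that missed the net
    blocked: int   # attempts blocked before reaching the net


def corsi(counts: ShotCounts) -> int:
    """All shot attempts: shots on goal plus misses plus blocked shots."""
    return counts.on_goal + counts.missed + counts.blocked


def corsi_share(team: ShotCounts, opponent: ShotCounts) -> float:
    """Team's share of all shot attempts in the game, from 0.0 to 1.0."""
    total = corsi(team) + corsi(opponent)
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return corsi(team) / total


# Made-up numbers for illustration only.
home = ShotCounts(on_goal=30, missed=12, blocked=10)
away = ShotCounts(on_goal=25, missed=10, blocked=13)
print(corsi(home), corsi(away))            # 52 48
print(round(corsi_share(home, away), 3))   # 0.52
```

A share above 0.5 means the team directed more attempts at the opposing net than it allowed, which is the sense in which the count stands in for possession.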
Barnes also proposed that we can use Corsi to measure the possession ability of individual players. Sinden and Neilson had done the same by comparing scoring chance counts while a player is on the ice to when they are off, but Barnes preferred Corsi over Sinden and Neilson’s arbitrary scoring chance definitions. To aid this research, Barnes created an NHL game-by-game data collection resource, timeonice.com, while at the same time Gabriel Desjardins (also an engineer) developed behindthenet.ca to aggregate the Corsi and goal data and help us look at player and team season performances.
In the five years since, hockey analytics has experienced a meteoric rise in fan consciousness, while NHL teams have engaged in a virtual arms race to employ the emerging talents of renowned stats writers. This trend hasn’t come quietly; there have been many contentious debates over the merits of analytics, oftentimes involving mischaracterization on both sides of the argument. The truth is, insofar as there are “sides” in the debate, they are far closer than they appear. Supposed anti-analytics people actually quantify and count different things as important (wins, points, player grades or value relative to others), and proponents of analytics watch and enjoy the game with regularity, and often pore over hours of game footage. All want to better understand the game.
Over the years, dozens of powerful minds have researched hockey analytics, and entertained a variety of arguments about what is worth counting and how to improve Corsi measures. A new statistic emerged, coined “Fenwick” by its creator Matt Fenwick, which left blocked shots out of the Corsi equation and seemed to perform better as a team measure than Corsi in larger samples.
In the end, a few major conclusions of this research stood out, and they’re handy for any hockey fan to carry in their back pocket as they watch the game:
1. Regression – Let’s say you and a friend are at center ice, shooting a puck at the net. Your friend shoots twice and scores once. Do you assume that your friend is a 50 percent shooter from center ice? More likely you say, “Bet you can’t do it again.” The regression principle in hockey analytics revolves around shooting and save percentage at even strength (which comprises around 80 percent of all gameplay): players shooting or stopping shots well above or below the league average tend to shoot or stop shots much closer to the league average in the future. There are exceptions to this rule at the “tails” of the sample: some players are good or bad enough to sustain shooting percentages up to ±3 percentage points from average, and save percentages up to ±1 percentage point from average.
While individuals can potentially maintain shooting or shot-stopping talent, it’s very difficult for teams to maintain rates well above the league average. A team’s even-strength shooting percentage plus its even-strength save percentage, named “PDO” after innovator Brian King’s profile name, indicates whether a team is unusually far above or below the league average (1.000, though usually referred to by whole numbers like 980 or 1032) and thus due to be pulled back toward it.
2. Possession – Possessing the puck, as Darryl Sutter once said, has always been the goal. The best current indicator we have for measuring possession in analytics is called “Fenwick Close,” with “Close” referring to tracking Fenwick only in tie games or when the game is within 2 goals in the 1st and 2nd periods. The top 3 “FenClose” teams in 2013-14 were the Los Angeles Kings, Chicago Blackhawks, and San Jose Sharks; the bottom 3 were the Edmonton Oilers, Toronto Maple Leafs, and Buffalo Sabres. Contrary to Babcock and Barnes’s assertion, possession is not “everything” to winning in a season; a team can maintain a slightly elevated shooting percentage or ride a particularly talented goaltender to wins. But in the long run, since team percentages pull toward the league average, what more consistently differentiates a team is its ability to possess or gain possession of the puck.
3. Deployment – Not all players play with the same teammates, or against the same level of competition. Deployment must be considered when looking at a player’s Corsi measure because it does have some impact; in analytics, this is often captured by measuring how frequently a player starts shifts in the offensive zone (Offensive Zone Start Percentage). The Corsi measures of opponents and teammates when the player is on the ice (named Quality of Competition and Quality of Teammates, respectively) are also considered. Coaches’ systems can influence shot generation and suppression, but ultimately their greater impact is in how they use their players (who they play against, who plays with them, how often they are on the ice, etc.).
4. The Problem With Using Goals – Goals are clearly the point in hockey, but a given game might only yield 2 or 3 goals for your team. Was each goal a direct result of skill? At times yes, at times no. Would a fluke goal count in your analysis? A power-play or shorthanded goal? Remember “bet you can’t do it again”? Well, goals rarely offer enough “agains” with the same players in the same situation for us to say, “I think we can count on that happening again.” Corsi measures, instead of giving us 2 or 3 events to assess a team, grant us around 45-50 each game. Put another way: would you be more confident that you know your friend’s center-ice shooting ability after 2 or 3 shots, or after 45-50?
5. League Talent Spread – The NHL has been the same size, or nearly so, for the last 15 years. The differences between teams are becoming smaller, pushing them to find an edge wherever they can. As long as some teams ignore possession, that edge will continue to be the most glaring. League size is also why “shot quality,” as a concept, is largely muted. In a less-talented league, it might be a bit easier to simply take more shots closer to the net, but in today’s NHL, no matter the opponent, the closer you get to the cage the harder it is to get a decent shot off. You trade away shot volume, a function of possession, for percentage and a less reliable outcome.
These five overarching concepts are some of the more important ideas to emerge from the modern hockey stats movement, but hockey stats continue to evolve. Currently, adjustments to Corsi measures are being tested (a scoring chance study revealed that scoring chances and Corsi run extremely close together), the value of tracking the way teams enter the zone is becoming more widely accepted, and a variety of adjustments are used to address rink counting bias. As analytics continue to push towards conclusions like the ones above, and expand on the principles of Sinden, Neilson, and Barnes, we only get closer to a better understanding of the game.
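For readers who want to see the arithmetic behind two of the ideas above, PDO and the “bet you can’t do it again” sample-size problem, here is a minimal Python sketch. The function names and all of the numbers are assumptions invented for illustration. The standard-error formula is the usual binomial approximation, swapped in here as a simple way to show why 45-50 events tell you more than 2 or 3; it is not a method taken from the research described in this article.

```python
import math


def pdo(goals_for: int, shots_for: int,
        goals_against: int, shots_against: int) -> float:
    """Even-strength shooting percentage plus save percentage.

    Returned on the 1.000 scale; multiply by 1000 for the
    whole-number form (e.g. 980 or 1032).
    """
    if shots_for <= 0 or shots_against <= 0:
        raise ValueError("shot counts must be positive")
    shooting_pct = goals_for / shots_for
    save_pct = 1.0 - goals_against / shots_against
    return shooting_pct + save_pct


def rate_std_error(successes: int, attempts: int) -> float:
    """Binomial standard error of an observed success rate."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    p = successes / attempts
    return math.sqrt(p * (1.0 - p) / attempts)


# Made-up even-strength totals for one team.
team_pdo = pdo(goals_for=150, shots_for=1800,
               goals_against=140, shots_against=1750)
print(round(team_pdo * 1000))            # 1003

# Same 50 percent success rate, very different certainty.
print(round(rate_std_error(1, 2), 3))    # 0.354
print(round(rate_std_error(24, 48), 3))  # 0.072
```

The friend who scores once on two shots and the friend who scores 24 times on 48 both show a 50 percent rate, but the uncertainty around the first is roughly five times larger, which is the intuition behind preferring shot-attempt counts to goals.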