Modeling the Penguins’ Shot Selection

Posted on February 7, 2019 by Sasank Vishnubhatla in Hockey, Sports

In 2015, the Pittsburgh Penguins led the NHL in shots on goal. With 2,722 shots on goal over the entire season, or approximately 33.2 shots on goal per game, the Penguins started a new trend in the NHL: high-volume shooting. The next season, the Penguins once again led the league in shots on goal with 2,745 total shots (33.5 shots on goal per game). In fact, the Penguins have led the league in shots on goal for the past 3 years. The average NHL shots on goal per game in 2015 was 29.7 and in 2016 it was 30.2. In those two seasons, that put the Penguins more than 3 shots per game above the league average. Though a few shots per game might not seem all that important, those extra shots can lead to more scoring chances.
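The per-game figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the season totals and league averages are the ones quoted in this post, the 82-game divisor is the length of a full NHL regular season, and the function and variable names are my own.

```python
# Season totals quoted in the post; 82 is the length of a full NHL regular season.
GAMES_PER_SEASON = 82

penguins_shots = {2015: 2722, 2016: 2745}
league_shots_per_game = {2015: 29.7, 2016: 30.2}


def shots_per_game(total_shots, games=GAMES_PER_SEASON):
    """Average shots on goal per game over a season."""
    return total_shots / games


for season, total in penguins_shots.items():
    per_game = shots_per_game(total)
    gap = per_game - league_shots_per_game[season]
    print(f"{season}: {per_game:.1f} shots per game, {gap:+.1f} versus league average")
```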

Shots on goal and goals scored are related. The more shots you put on net, the more likely you are to score. By using statistics and probability to their advantage, the Penguins have been successful in the past few seasons. For example, in 2015, they scored on average 3.0 goals per game. That was third in the league, behind only the Dallas Stars (3.3 goals per game) and the Washington Capitals (3.1 goals per game). The league in 2015 averaged about 2.7 goals per game. The Penguins have used their quick skating ability and their new ideology of shooting pucks on net to evolve the game.
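To make the link between shot volume and scoring concrete, here is a rough sketch that treats every shot as an independent attempt with the same chance of going in. That is a deliberate simplification (real shots differ a lot in quality), and the helper names are hypothetical. The per-game figures are the 2015 numbers quoted above.

```python
def conversion_rate(goals_per_game, shots_per_game):
    """Share of shots on goal that become goals."""
    return goals_per_game / shots_per_game


def expected_goals(shots, rate):
    """Expected goals if every shot scores independently with probability `rate`."""
    return shots * rate


def prob_at_least_one_goal(shots, rate):
    """Chance of scoring at least once in `shots` independent attempts."""
    return 1.0 - (1.0 - rate) ** shots


# 2015 per-game figures quoted in the post.
league_rate = conversion_rate(2.7, 29.7)
penguins_rate = conversion_rate(3.0, 33.2)

print(f"league conversion rate:   {league_rate:.3f}")
print(f"Penguins conversion rate: {penguins_rate:.3f}")
for shots in (30, 33):
    print(shots, "shots:",
          f"expected goals {expected_goals(shots, league_rate):.2f},",
          f"P(at least one goal) {prob_at_least_one_goal(shots, league_rate):.3f}")
```

With these figures, both conversion rates come out near 9 percent, which is consistent with the idea that the Penguins' extra goals came mostly from extra volume.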

Examining the Penguins’ shots may lead to increased insight into how they can be even more effective on offense. With the 7th best power play (24.16%) in the league and elite talent like Sidney Crosby, Evgeni Malkin, Phil Kessel, and Kris Letang, the Penguins already have a potent offense. But their depth scoring has not been as prominent as they want it to be. Before being traded, Derick Brassard had only put up 9 goals and 7 assists for the Penguins. Another underperforming depth piece has been Tanner Pearson. Since the Penguins acquired him for Carl Hagelin earlier this season, Pearson has been streaky. In his first 10 games as a Penguin, Pearson put up 4 points. In his last 9 games, he’s scored only 1 point. The Penguins’ depth players are not shooting like their top six forwards. So, let’s determine how the Penguins should attack goalies by using machine learning.

In an effort to analyze how the league has evolved its shots on goal, I have created two machine learning models: a neural network and a K-nearest neighbors model. A neural network is a machine learning model loosely based on how the human brain works; here it outputs a regression value for a given input. The inputs to this model are the following measurements: shot distance, shot angle, shot X coordinate, and shot Y coordinate. The data used to train both models is all regular season shot data from the 2015-2016 season through the 2017-2018 season. The K-nearest neighbors model takes an input and outputs the average of the k most similar data points. It uses the same four measurements as the neural network. The output of both models is the predicted probability of the shot being a goal. Both models and a writeup of how the models were created can be found here.
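As an illustration of the K-nearest neighbors idea, here is a minimal pure-Python sketch. It is not the model used in this post: the training rows are invented, the function name is hypothetical, and the features are left unscaled for brevity, whereas a real model would likely normalize them so that no single measurement dominates the distance.

```python
import math


def knn_goal_probability(query, shots, k=3):
    """Average the goal labels (1 = goal, 0 = no goal) of the k nearest shots."""
    ranked = sorted(shots, key=lambda shot: math.dist(query, shot[0]))
    nearest = ranked[:k]
    return sum(label for _, label in nearest) / len(nearest)


# Invented toy data: (distance_ft, angle_deg, x, y) paired with a goal label.
training_shots = [
    ((8.1, 82.9, 81.0, 1.0), 1),
    ((11.7, 59.0, 79.0, 6.0), 1),
    ((14.9, 47.7, 78.0, -10.0), 0),
    ((40.5, 69.8, 51.0, 14.0), 0),
    ((55.2, 84.8, 34.0, 5.0), 0),
    ((39.6, 45.0, 61.0, -28.0), 0),
]

# A hypothetical shot from about 10 feet out, slightly off center.
new_shot = (10.0, 70.0, 80.0, 3.0)
print(knn_goal_probability(new_shot, training_shots, k=3))
```

The prediction is simply the fraction of goals among the k closest shots, which is why the output can be read as a probability.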

Let’s see how the neural network predicts the Penguins’ shots.

Neural Network Output versus Shot Distance

This graph details the relationship between the neural network output and the shot distance. At long distances, the Penguins’ shots have a very low chance of scoring. However, as the distance shrinks to about 25 feet, the probability slowly increases. The darkness of each hexagon shows how many shots have the corresponding probability of being a goal. The high probability of shots very close to the net can be explained by the large number of tap-in goals and redirections; therefore I chose to leave them out of the discussion of the optimal shot. The optimal shooting distance from this graphic seems to be near 14 to 15 feet, or the slot. Now let’s look at how the shot angle is related to the probability.

Neural Network Output versus Shot Angle

Here, we see that the Penguins just like to shoot pucks on goalies from any angle. We assume that a shot directly at the goalie is 90 degrees, and a shot from the goal-line is 0 degrees. The highest scoring chances come from shots from around 20 degrees, which is approximately the bottom part of both circles. This location is a deadly shooting spot since it requires an opposing goaltender to move across his crease.
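To make the angle convention above concrete, here is a small sketch that derives shot distance and angle from X and Y coordinates. It assumes the net sits at (89, 0) in feet, with X running along the length of the rink; that placement is my assumption for illustration, not something taken from the models, and the function names are hypothetical.

```python
import math

# Assumed net location in feet (an assumption for this sketch).
NET_X, NET_Y = 89.0, 0.0


def shot_distance(x, y):
    """Straight-line distance from the shot location to the net."""
    return math.hypot(NET_X - x, NET_Y - y)


def shot_angle(x, y):
    """Angle in degrees: 90 = straight at the goalie, 0 = along the goal line."""
    return math.degrees(math.atan2(abs(NET_X - x), abs(NET_Y - y)))


for x, y in [(69.0, 0.0), (69.0, 20.0), (89.0, 20.0)]:
    print((x, y), f"distance {shot_distance(x, y):.1f} ft, angle {shot_angle(x, y):.1f} degrees")
```

Under this convention, a shot from straight out in front scores 90 degrees, and the angle falls toward 0 as the shooter moves toward the goal line.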

The neural network’s predictions seem to match what a person would expect: shots from slot distance near the bottom of the circles. Let’s see if the K-nearest neighbors model says something similar.

K-Nearest Neighbor Output versus Shot Distance

The K-nearest neighbors model strongly believes that tap-ins and deflections are high-probability shots. Though this is arguably correct, let’s set those aside and only look at shots from beyond 5 feet. Looking at all of those shots, we see that there are multiple “high” probability areas: directly in front of the crease, the point, and the high circle. I believe that the model gives these shots an increased chance of scoring because the Penguins’ power play uses those locations to score. Hence, the model will state that shots from similar locations will also score. Let’s look at the angles now.

K-Nearest Neighbor Output versus Shot Angle

Similar to the neural network, the K-nearest neighbors model states that the Penguins should just shoot from anywhere. There are a few hot zones, like near the 20-degree angle, which match the output of the neural network.

Just like the neural network, the K-nearest neighbors model shows that the Penguins shoot from around the circles. However, the K-nearest neighbors model believes that shots farther away from the goal could have a better chance of scoring. An explanation for this is the fact that the Penguins like to put a player, usually Patric Hornqvist or Jake Guentzel, in front of a goalie to create a screen. With a screen, shots from farther away have a good chance of beating the goaltender.

There are two important takeaways from these two models: (1) the Penguins need to shoot from high-probability distances (around 10 feet and 25 feet) and (2) the Penguins shoot from low-chance angles and need to attempt more shots from high-probability angles (25 degrees and 75 degrees).

Sasank Vishnubhatla

Sasank is a student at Carnegie Mellon University majoring in computer science and minoring in business. He is a huge hockey and baseball fan, but also enjoys following the Steelers and Spurs on the side. His passion lies in applying machine learning to sports analytics. As a mentor in the Carnegie Mellon Tartan Sports Analytics Club, Sasank does sports analytics research on hockey in order to bring more clarity to skater and goaltender performance. He can be reached on Twitter at @_svish.