Quote:
Originally Posted by NegativeSpace
I’m similar. The best explanation I have found is here: https://lastwordonsports.com/hockey/...-guide-part-2/.
My understanding is that they have lots of data points that give the expected goals (xG) for any particular shot. That is then used to help evaluate the relative xG% for teams and players while they are on the ice. It all goes back to the model that looks at the xG for any particular shot at a moment in time.
That explanation is unfortunately far from the best
I have seen better explanations on CP
They say the law of large numbers means that they have the ability to firmly predict the probability of a shot going in, because they have so many data points
They fail to acknowledge the variance introduced by factors that they aren’t measuring though
Correct that xGF is basically a team stat. Every shot taken has a specific probability of being a goal. For every shot your team takes while you are on the ice, add up those probabilities.
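As a rough sketch of that sum (the per-shot probabilities are made-up numbers and `on_ice_xgf` is just a name picked for illustration, not something from a real model):

```python
# Hypothetical goal probabilities for the shots your team took
# while you were on the ice (made-up numbers, not from a real model).
shot_probabilities = [0.02, 0.12, 0.20, 0.05, 0.01]

def on_ice_xgf(probabilities):
    """xGF is the sum of each shot's probability of being a goal."""
    return sum(probabilities)

print(round(on_ice_xgf(shot_probabilities), 2))
```

Five shots, but well under half an expected goal, because most of them were low-probability shots.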
Think of it this way
Average save percentage in the NHL is .900 this year
So, on average, every shot has a 10% probability of going in
(This is the basic expectation and why lots of people say a goalie has a good or bad night based on their sv% that night
Goalie faces 30 shots? 0-2 goals allowed, good. 3, ok. 4 or more? Bad)
If a goalie faces 18 shots, on average, 1.8 goals go in
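Here is that flat baseline as a small sketch. The .900 figure is the one quoted above; the function name is only for illustration.

```python
LEAGUE_SAVE_PCT = 0.900  # league-average figure quoted in this post

def baseline_expected_goals(shots, save_pct=LEAGUE_SAVE_PCT):
    """Treat every shot as equally dangerous: shots * (1 - save %)."""
    return shots * (1 - save_pct)

for shots in (18, 30):
    goals = round(baseline_expected_goals(shots), 1)
    print(shots, "shots ->", goals, "expected goals")
```

That is where the "30 shots, about 3 goals" rule of thumb comes from.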
But the reality is that not all shots are equal
He could face lots of odd man rushes / breakaways and few muffins.
xGF refines that by sorting shots into pretty basic buckets, depending on the circumstance of the shot
Consider the impact of where the shot was taken from
Rather than each shot having a 10% chance of going in, maybe a point shot has a 1-2% chance. Perhaps a shot from the blue paint has a 20% chance, a shot from the slot, say, a 12% chance, and a shot from a bad angle just above the goal line a 0.5% chance. (All numbers are just examples; each model will include its own division of the ice and adjusted probabilities)
Some models also adjust for shot type, whether or not it is a rebound, and how much time has passed since a previous event (e.g. since the puck was passed to the shooter)
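A minimal sketch of the bucket idea, using the example numbers above. The bucket names and the 1.5% midpoint for point shots are my own simplification; real models divide the ice differently and use their own probabilities.

```python
# Example probabilities from this post, not from any real xG model.
BUCKET_XG = {
    "point": 0.015,      # midpoint of the 1-2% example
    "blue_paint": 0.20,
    "slot": 0.12,
    "bad_angle": 0.005,
}

def expected_goals(shot_locations):
    """Sum the bucket probability for each shot's location."""
    return sum(BUCKET_XG[loc] for loc in shot_locations)

shots = ["point", "point", "slot", "blue_paint", "bad_angle"]
print("bucketed:", round(expected_goals(shots), 3))
print("flat 10%:", round(len(shots) * 0.10, 3))
```

Same five shots, but the bucketed total comes out lower than the flat 10% total because three of them were low-danger.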
Sounds good, right? Different situations, different locations, it tweaks the probabilities. It considers things that are measured about the play with the puck
What does it not consider? Defensive posture. Time and space. (ex. Was there a defender within a stick length? - that is something that Steve Valiquette is considering in his models)
Goalie position - there is a possibility that a model rates a cross crease tap in, into an empty net, the exact same as a guy in the crease with no room, stuffing the puck into a goalie’s pad. A shot where a guy has time and space can be the same as when a guy’s stick is tied up and the puck barely dribbles off it towards the goalie
In reality, the cross crease tap in may have a probability of going into the net of, say, ~75 percent, whereas the shot in tight with the goalie set may have a real life probability of near zero.
The problem is that the xGF model, as it only measures a few things, could give these two shots the same rating even though their real life probabilities are very different.
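To make that concrete, here is a toy example where the model only sees location. The "real" probabilities (0.75 and 0.01) are hypothetical stand-ins for the tap in and the jam play, and the 0.20 is the blue paint example from earlier.

```python
# Toy model: location is the only input it knows about.
MODEL_XG = {"blue_paint": 0.20}  # example value, not from a real model

# (description, location, hypothetical real-life probability)
shots = [
    ("cross crease tap in, open net", "blue_paint", 0.75),
    ("stuffed into a set goalie's pad", "blue_paint", 0.01),
]

for description, location, real_probability in shots:
    model_probability = MODEL_XG[location]
    print(f"{description}: model {model_probability:.2f}, "
          f"real ~{real_probability:.2f}")

model_total = sum(MODEL_XG[loc] for _, loc, _ in shots)
real_total = sum(p for _, _, p in shots)
print("model total:", round(model_total, 2), "real total:", round(real_total, 2))
```

Both shots get the same number from the model, and nothing in its inputs can tell them apart.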
So it’s a pretty simple improvement intended to better predict how difficult a night the goalie has had. Yep, point shots are less likely to go in than shots from the slot.
But it is missing things you can’t measure from the raw data (time and space, defensive posture)
Consider if a goalie faces 30 shots, most people generally expect him to let in about 3.
If all 30 are from the point, the xGF may be as low as, or even below, 1
If the team gives up a ton of odd man rushes and lets the other team pass freely from behind the net to uncovered men in the slot repeatedly, you could see a higher xGF, maybe 4 or 5
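A quick sketch of those two 30-shot nights, reusing the example bucket numbers from earlier in the post. The shot mix for the "bad" night is invented purely for illustration.

```python
# Example bucket probabilities from this post, not from a real model.
BUCKET_XG = {"point": 0.015, "slot": 0.12, "blue_paint": 0.20}

def night_expected_goals(shot_counts):
    """shot_counts maps a bucket name to the number of shots from it."""
    return sum(BUCKET_XG[bucket] * n for bucket, n in shot_counts.items())

all_from_point = {"point": 30}
# Hypothetical mix for a night full of odd man rushes and slot passes.
defensive_breakdown = {"point": 4, "slot": 14, "blue_paint": 12}

for name, night in [("all point shots", all_from_point),
                    ("defensive breakdown", defensive_breakdown)]:
    total_shots = sum(night.values())
    print(name, "-", total_shots, "shots,",
          round(night_expected_goals(night), 2), "expected goals")
```

Both nights are 30 shots, and the flat 10% rule would say 3 goals for each, but the bucketed totals land well below 1 and a bit above 4.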
A goalie could have absolutely no chance on 8 shots and let in 8 goals on 30. But the xGF model will never actually predict all 8.
The law of large numbers doesn’t help in this case. The fact is that the model won’t measure the things that make the difference, or capture the unique circumstances of each shot, based on its unmeasured attributes
Make sense?
That’s kind of what it does, and what it doesn’t do. Not as brief as I would’ve liked