Calgarypuck Forums - The Unofficial Calgary Flames Fan Community - View Single Post

PepsiFree · 09-02-2025, 06:02 PM

Quote:

Originally Posted by Macindoc

I created a mathematical model (using no information whatsoever about hockey or the NHL, other than teams' points totals at the end of the 2022-23 and 2023-24 seasons) that had an average error of 10.5 for last season. The reason the error was so high for my model was the disastrous season the Predators had last year - they accounted for 10% of my total error after dropping from 99 points (while trending upwards) to just 68 points, and they were probably the reason JFresh's predictions were not as good last season as they were the season before.

My model, if you're interested, averages the points totals from the two previous seasons (to reduce the effects of random events like injuries), then averages that number with the average number of points all teams obtained in the previous season (to account for potential regression to the mean), then adds half the difference between the points the team accumulated in the previous two seasons (to account for trends).

Not perfect, but remarkably close to the performance of the much more sophisticated JFresh model, without taking the effects of any individual players (trades, injuries, retirements, progression/regression/decline) into account.

I think you’re totally missing the point (you’re not alone; Enoch and a couple of others aren’t getting it either). The point isn’t to create the most accurate model by eliminating inputs and reducing the model’s sophistication. The point is to create a sophisticated model, with these necessary inputs, that is as accurate as possible.

For example, I “fixed” your model by removing the redundant step of adding half the difference. That got the error down to 10.2. I “fixed” it further by adding the step back in but scaling the difference by 1/8 instead of 1/2. That got it to 10.0. So the question is: how useful is the model? What do we learn from it if I can change one arbitrary number we just made up and make it more accurate? The answers are probably “not at all” and “nothing.”
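To make the "one arbitrary number" concrete, here's a rough sketch of the quoted model with the trend multiplier pulled out as a parameter. The function name, argument names, and the sample numbers are all made up for illustration, and I'm assuming "the difference" means the most recent season minus the one before it.

```python
def predict_points(prev, prev2, league_avg, trend_weight=0.5):
    """Sketch of the quoted points model (names are hypothetical).

    prev         -- team's points in the most recent season
    prev2        -- team's points in the season before that
    league_avg   -- average points across all teams in the most recent season
    trend_weight -- the arbitrary multiplier on the year-over-year change
    """
    # Step 1: average the two previous seasons to smooth out random events.
    two_year_avg = (prev + prev2) / 2
    # Step 2: average that with the league average (regression to the mean).
    regressed = (two_year_avg + league_avg) / 2
    # Step 3: add a fraction of the year-over-year change (the trend term).
    # Assumption: the change is measured as most recent minus earlier season.
    return regressed + trend_weight * (prev - prev2)


# Illustrative inputs only, not real standings data.
original = predict_points(99, 91, 90)                        # 1/2 of the difference
no_trend = predict_points(99, 91, 90, trend_weight=0)        # trend term removed
eighth = predict_points(99, 91, 90, trend_weight=0.125)      # 1/8 of the difference
print(original, no_trend, eighth)
```

The only thing that changes between the three variants is `trend_weight`, and nothing in the model says which value is right.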

Go back and apply your model to the previous year: 10.7. JFresh? 9.9. You’re solely trying to reverse-engineer a model with the lowest error rate by ignoring as many inputs as possible, while he has a model with a laundry list of inputs and simply hopes it’s among the least inaccurate.
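For anyone who wants to check these numbers themselves: I'm assuming "average error" here means the mean absolute difference between predicted and actual points across all teams (nobody in the thread has actually defined it). A minimal sketch under that assumption, with placeholder teams and numbers rather than real results:

```python
def average_error(predicted, actual):
    """Mean absolute difference between predicted and actual points.

    Both arguments are dicts mapping team name -> points.
    Assumption: this is what "average error" means in this thread.
    """
    total = sum(abs(predicted[team] - actual[team]) for team in actual)
    return total / len(actual)


# Placeholder data for illustration only.
predicted = {"Team A": 95.0, "Team B": 88.0, "Team C": 101.0}
actual = {"Team A": 83, "Team B": 91, "Team C": 95}

print(average_error(predicted, actual))  # (12 + 3 + 6) / 3 = 7.0
```

Run the same function on two different seasons and you get the kind of year-to-year comparison above.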

The point of the whole thing is the inputs. It’s a reflection of how a team should perform based on all of the inputs you ignored or eliminated. It’s more about the “why” and less about the result. The closer the result, the more we learn about the accuracy of the why and how (not the reverse). Without any why or how there’s really nothing to learn and no point to having developed a model in the first place.