Quote:
Originally Posted by Enoch Root
LOL. You're always so quick to suggest others aren't getting it, when it is you that is missing the point.
The 'point' of a model (as you suggested in the 2nd bolded comment) is to represent something, and the point of this model is to predict where teams will finish in the standings. jfresh has built a model with lots of inputs, but it does a poor job of actually predicting what it is attempting to predict.
The NHL standings are a fairly tight distribution, with about two-thirds of the teams finishing within roughly a 30-point band each year. And the overall migration of individual teams from year to year is generally quite small (good teams remain good, bad teams remain bad, etc).
So a model that has an average error of roughly 10 points is actually of very little value. And to demonstrate that to you, several people threw up EXTREMELY SIMPLE models, with only one or two inputs and no effort to add any actual analytical inputs, that were almost as accurate as jfresh's. In doing so, they clearly demonstrated that his significantly more complex model is a waste of time because it isn't getting results that are any better than the simple ones. That's the point, which you obviously failed to grasp.
But please, go on another rant, making an entirely different (and irrelevant) point, in an attempt to demonstrate that I and others have missed some point which you are trying to make - that always makes for fun reading!
|
Glad you're having fun, me too! But mostly it seems like you're doing your angry Enoch bit. This is a weird thing to keep arguing about if you don't value it and don't care. I think it's cool and interesting; you don't. Good for you!
The JFresh model does exactly what it intends to do: predict the NHL standings for fun. Last year it was the least inaccurate (compared to betting odds, FanDuel, The Athletic, Evolving Hockey, MoneyPuck, or average fan submissions). On a rolling 5-year average, most of the actual models are similarly accurate to one another, but much more accurate than HockeyViz or fan submissions.
The rolling 5-year average error of Macindoc's model, for example, is 13.8 points. That's actually worse than the average fan submission, meaning a fan could predict the season outcome more accurately on average than Macindoc's model could. That would be an example of a useless/valueless model from an output standpoint. Your error of 11.7, or whatever you said it was, would be about as accurate as an average fan submission last year, so again, useless. And because neither of your models has inputs of any value, there's nothing to learn, test, or apply, so they're completely pointless. My model, where I just took Macindoc's and made it more accurate (than any other model) for that one year, has a 5-year average of 12.5 points. Again, terrible and useless.
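(For anyone wondering what those error numbers actually measure, here's a minimal sketch, assuming the usual convention that a model's "average error" is the mean absolute difference between each team's predicted and actual point totals. The team names and point totals are made up, not anyone's real projections.)

```python
# Sketch: "average error" as mean absolute error (MAE) in standings points.
# All values below are placeholders for illustration only.

def mean_absolute_error(predicted: dict, actual: dict) -> float:
    """Average number of points a prediction missed by, across all teams."""
    return sum(abs(predicted[t] - actual[t]) for t in actual) / len(actual)

# Hypothetical three-team league:
predicted = {"Team A": 105, "Team B": 92, "Team C": 78}
actual    = {"Team A": 110, "Team B": 82, "Team C": 84}

print(round(mean_absolute_error(predicted, actual), 1))  # -> 7.0
```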
Take your suggestion that good teams remain good and bad teams remain bad. Let's build a model that simply predicts every team will finish 2024-25 with the same point total it had in 2023-24. Average error? 12.3 points. Not so easy, eh?
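(That carry-forward baseline looks something like the sketch below. The point totals are placeholders; run the same calculation on the real 2023-24 and 2024-25 standings and you get the 12.3 figure above.)

```python
# Sketch: the "every team repeats last season" baseline.
# Placeholder point totals, not real standings.

points_2023_24 = {"Team A": 110, "Team B": 98, "Team C": 75}   # last season
points_2024_25 = {"Team A": 101, "Team B": 111, "Team C": 79}  # actual result

# The entire "model": predict each team finishes exactly where it did last year.
naive_prediction = dict(points_2023_24)

error = sum(abs(naive_prediction[t] - points_2024_25[t])
            for t in points_2024_25) / len(points_2024_25)
print(round(error, 1))  # -> 8.7 on these placeholders; ~12.3 on the real standings
```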
Where you're also struggling is that you can't square the difference between a model that is a by-product and a model that is the main product. These analytical models exist as a by-product of the underlying analytics. Nobody is tracking and measuring all of those analytics for the purpose of predicting the season standings; a standings projection is just something that can be produced from all of those inputs. It has value in terms of learning, entertainment, etc. If you could create a model that was near 100% accurate in predicting results, you'd probably make a lot of money. But you can't, so it seems silly to get upset at the most accurate models that exist because they aren't achieving something you can't achieve yourself and no one has been able to achieve. If you think it's a "waste of time" then stop wasting your time on it. Seems simple, no?
Everyone (almost, I guess) understands that hockey is a game with an incredible number of variables, and that advanced analytics are descriptive, not predictive, but can identify trends. This means that these models, while predictions, rely on what has already happened and how players are trending. Trying to account for as many of those variables as possible is exactly what these models are doing.
Betting odds for a game, for example, will favour one team over another for a whole whack of reasons. That doesn't mean it's a guarantee that the favoured team will win; it simply means that the team should win, based on all the inputs. Saying it's useless or has no value ignores the fact that the team that should win does win more often than not, precisely because of those inputs. It's also why we all understand when certain things are "unsustainable" (win streaks, shooting percentage, shutout streaks, etc.). Teams and players can only defy the odds for so long.
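(As a quick illustration of "should win, based on all the inputs": betting odds are just an implied win probability. Here's a small sketch converting American moneyline odds to that probability; the -150/+130 numbers are invented, and the bookmaker's vig is ignored.)

```python
# Sketch: turning moneyline odds into an implied win probability.
# The odds values are invented examples; this ignores the bookmaker's margin.

def implied_probability(moneyline: int) -> float:
    """Implied win probability from American moneyline odds."""
    if moneyline < 0:                       # favourite, e.g. -150
        return -moneyline / (-moneyline + 100)
    return 100 / (moneyline + 100)          # underdog, e.g. +130

print(round(implied_probability(-150), 3))  # 0.6   -> "should" win 60% of the time
print(round(implied_probability(+130), 3))  # 0.435 -> underdog still wins plenty
```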
Quote:
Originally Posted by Enoch Root
Exactly. We all understand what models do, and what this one is trying to do. The point is that it is not moving the needle in the slightest.
|
Some of us do. Some of you don't.
Quote:
Originally Posted by Enoch Root
Would your math teachers have given you a gold star if you created a model that wasn't any more predictive than a random guess?
|
They didn't give out gold stars in my high school. Maybe we went to different types of schools, but I doubt you'd get a gold star at all if you didn't even understand what "random" means.
You're not the first to mention it, either. If someone were interested in creating a truly randomized model and then testing its accuracy, that'd be fun, too. But choosing specific inputs and formulas is not "random." And as I have demonstrated, these models are, in fact, more accurate than your guesses.
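(If anyone did want to test a truly random model for fun, it would look something like this sketch: point totals drawn by pure chance, with no inputs at all. The "actual" standings here are placeholders, but on realistic standings a pure-chance guess lands north of 20 points of average error, well worse than any of the models discussed above.)

```python
# Sketch: a genuinely random prediction model, for contrast with models
# built from chosen inputs. Placeholder standings, not real data.
import random

actual_points = {"Team A": 110, "Team B": 95, "Team C": 72}  # hypothetical final standings

def random_model_error(actual: dict, trials: int = 10_000) -> float:
    """Average absolute error of uniform-random point guesses over many trials."""
    total = 0.0
    for _ in range(trials):
        guesses = {t: random.randint(50, 130) for t in actual}  # pure chance, no inputs
        total += sum(abs(guesses[t] - actual[t]) for t in actual) / len(actual)
    return total / trials

print(round(random_model_error(actual_points), 1))  # typically 20+ points of error
```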