Quote:
Originally Posted by TorqueDog
Looks like it was running on good ol' GPT-4o which is probably not great for this sort of thing. I'll have to give it another try using the other models, I didn't even think to run them through o3 or 4.1.
EDIT: Yup, WAY better on 4.1. It actually did what it was supposed to, with no hallucinations.
That's great to hear! 4o is SO good at creative writing, quick low-stakes questions, or creative brainstorming. Anything mission-critical, though, needs to be kept far away from it.
What's great is that OpenAI just made o3 way more efficient, so even us lowly Plus users get about 100/week. I'm pretty confident that if you started up your own business, o3 could handle many of the tricky, pitfall-rich "getting started" tasks for you (marketing plans, business roadmaps, competitor research, etc.).
Granted, bolting this stuff into existing infrastructure is not always a good time. But starting something from scratch? Lordy, what a different story in many cases.
As for the chess thing, it's mostly all fun and games. I worry a lot, though, that people are being given false comfort when they should be re-evaluating what they do. Not a total freak-out, bash-each-other-in-the-skulls-and-dig-a-bunker kind of re-evaluating, but maybe asking the question "what if my job as a call center support agent didn't exist in 2 years?"
So I worry that stories about hilarious AI failures, while entertaining, might be doing many of us a concerning disservice.