Quote:
Originally Posted by Russic
Out of curiosity, did you try this with o3 (base) or 4.1? I only ask because o3 seems far better at the more complicated high-stakes workflows and analysis, and 4.1 has a 1 million token context which could handle your documents better.
Apparently o3 pro (available at that $200/month tier) is blowing some pants off, but I don't have the money to try it out.
|
Looks like it was running on good ol' GPT-4o, which is probably not great for this sort of thing. I'll have to give it another try using the other models; I didn't even think to run them through o3 or 4.1.
EDIT: Yup, WAY better on 4.1. It actually did what it was supposed to, with no hallucinations.