I suspect some of it is due to one of the fundamental parts of training: the reward mechanism. The models are trying to get things right, and if making data up lets them come out "correct," they register that as success. They are more eager to provide an answer than not, because that is their job.
If that is part of the issue, they would have to find a way to lower the drive to give any answer and raise the penalty for giving incorrect ones, such that the model is more likely to say it doesn't know. It's the balance between thoughtfully assembling an answer from whatever limited information is available and intuiting what it might be, vs. saying "I don't have enough information to give you a reasonable answer to that."
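To make that concrete, here's a toy sketch (purely my own illustration, not how any real lab actually scores its models): suppose a correct answer is worth +1, a wrong one costs some penalty, and saying "I don't know" is worth 0. Then guessing only pays off when the model's confidence clears a threshold that rises with the penalty, so a bigger penalty for being wrong pushes it toward abstaining.

```python
def should_answer(confidence: float, wrong_penalty: float) -> bool:
    """Answer only if the expected reward beats the 0 reward for abstaining."""
    expected_reward = confidence * 1.0 + (1.0 - confidence) * (-wrong_penalty)
    return expected_reward > 0.0

# With no penalty, any guess is worth taking; at a penalty of 3,
# the model needs >75% confidence before answering beats "I don't know".
for penalty in (0.0, 1.0, 3.0):
    threshold = penalty / (1.0 + penalty)
    print(f"penalty={penalty}: answer if confidence > {threshold:.2f};",
          "answers at 60%?", should_answer(0.6, penalty))
```

That's obviously a cartoon of the real training setup, but it shows why "always give an answer" falls out naturally when wrong answers cost little more than silence does.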