Thread: The A.I. Thread
06-10-2025, 02:29 PM   #711
TorqueDog
Franchise Player
Join Date: Jul 2010
Location: Calgary - Centre West

Quote:
Originally Posted by Wormius
I asked Copilot a question with some very specific parameters. It made up parameters that were not only incorrect, but didn't even match the ones in the spec sheet it claims it sourced. I swear at it a bit and it apologizes. I tell it to redo the search but not to lie again, and guess what!? It lies again and returns the same information with the made-up numbers.

I have no faith in this. It fails me every time I use it.
Copilot has been good for the softball tasks I've thrown at it (particularly since it can leverage our corporate documentation).

However, I personally pay for ChatGPT Plus, and it just did the same thing to me that it sounds like it did to you. I fed it three documents to use as the foundational basis for reviewing a fourth document, and it started making up clauses in the fourth document and flagging them as violations. I would insist that these clauses didn't exist; it would apologize, and then it would do it again.

I finally got sick of it making things up, started a new chat (and deleted the old one), and wrote some rules for it to follow whenever performing document analysis, since I've found it does best when given tight guardrails (there's a sketch of wiring these into an actual API call after the list):

1. Strict Clause Verification Rule: Only reference portions of text or clauses after directly locating them in the document through verifiable reading; no assumptions or projections.
2. Annotated Mode by Default: Provide exact paragraph, section, and page (where available) before offering any interpretation.
3. Reset-on-Upload Discipline: When the user instructs to forget a previously uploaded document, perform a full document context hard reset to prevent carryover errors.
4. Source Quotation Integrity Rule: Any interpretation must include the original quoted text and clarify if the interpretation is verbatim or inferred.
5. Chain-of-Reasoning Transparency: All conclusions must include a step-by-step justification.
6. Document Chain Anchoring: All citations and findings must trace back to the specific document and section.
7. Disclose Assumption Thresholds: If an assumption is made, explicitly flag it with a certainty rating and offer alternatives.
8. "No Silent Fixes" Policy: Never correct or smooth over errors silently; highlight issues explicitly and offer options.
9. Double-Pass Reviews: First pass is issue-flagging with exact quotes; second pass is interpretation only.
10. Deliberate Obstruction Checks: Evaluate how clauses might be challenged or weakened under dispute or scrutiny.
11. "What's Missing" Prompt Layer: Identify standard clauses or disclosures that are notably absent.
12. Comparative Clause Mapping: Where applicable, match clauses line-for-line across documents to reveal gaps or discrepancies.
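If you want these rules pinned down harder than a pasted chat message, you can bake them into a system prompt. Here's a rough sketch assuming the OpenAI Python SDK; the model name, prompt wording, and abbreviated rules text are placeholders, not what I actually ran:

Code:
# Sketch: pin the guardrail rules as a system message via the OpenAI
# Python SDK. Model name is a placeholder; use whatever you have access to.
from openai import OpenAI

RULES = """You are performing document analysis. Follow these rules strictly:
1. Only cite clauses you have directly located in the provided text.
2. Give the exact paragraph, section, and page before any interpretation.
(paste the remaining rules from the list above)
"""

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def review(foundation_texts: list[str], review_text: str) -> str:
    """Ask the model to flag clauses in review_text that violate the
    foundational documents, quoting each flagged clause verbatim."""
    messages = [
        {"role": "system", "content": RULES},
        {"role": "user", "content": "Foundational documents:\n\n"
                                    + "\n\n---\n\n".join(foundation_texts)},
        {"role": "user", "content": "Document under review:\n\n" + review_text
                                    + "\n\nFlag every clause that violates the "
                                      "foundational documents. Quote each "
                                      "flagged clause verbatim."},
    ]
    resp = client.chat.completions.create(model="gpt-4o", messages=messages)
    return resp.choices[0].message.content

None of this stops the model from hallucinating, of course; it just keeps the rules from drifting out of scope across turns the way pasted instructions do.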

Then I provided the foundational documents and instructed it to learn them, followed by the fourth document, with instructions to find where its clauses violated provisions set forth in the first three.

ChatGPT proceeded to make up sections in the fourth document for its references once more. So I started from scratch AGAIN and provided the foundational documents, but this time I copied and pasted only specific portions from the fourth document for cross-checking against the first three, in case an OCR issue in the review document was causing problems. Nope; I checked its references against the foundational documents and found it was making things up from those PDFs, too.
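For what it's worth, that manual cross-check is easy to script so you don't have to eyeball every citation. A rough sketch, assuming the pypdf library; the file names and sample clause are placeholders:

Code:
# Check whether clauses the model "quoted" actually exist in the source
# PDFs, normalizing whitespace since extraction rarely preserves it.
import re
from pypdf import PdfReader

def normalize(s: str) -> str:
    return re.sub(r"\s+", " ", s).strip().lower()

def quote_exists(quote: str, pdf_paths: list[str]) -> bool:
    needle = normalize(quote)
    for path in pdf_paths:
        text = " ".join(page.extract_text() or ""
                        for page in PdfReader(path).pages)
        if needle in normalize(text):
            return True
    return False

sources = ["foundation1.pdf", "foundation2.pdf", "foundation3.pdf"]
claimed = [
    # placeholder: clauses the model claimed to quote from the sources
    "The Contractor shall indemnify the Client against third-party claims.",
]
for quote in claimed:
    if not quote_exists(quote, sources):
        print("FABRICATED:", quote)

Anything that prints is a clause the model invented rather than quoted.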
__________________
-James
GO
FLAMES GO.

Quote:
Originally Posted by Azure
Typical dumb take.