Where the LLM’s authority should end
A user on my WhatsApp bot paid two credits for a cover letter. The tool ran. A cover letter came back: 280 words, tailored to the role, the kind of thing you’d actually send.
Then the LLM saw the cover letter, decided it was too long, and sent the user a two-sentence paraphrase.
The user paid for 280 words. They got 40. The rest was thrown away.
This isn’t a prompt bug
You can’t fix it by asking the model more firmly. I tried. “Do not paraphrase tool outputs.” “Send the cover letter exactly as returned.” “The cover letter is the product, do not summarise it.” The failure rate drops. It doesn’t go to zero.
As long as the cover letter passes through the LLM on its way to the user, some percentage of the time the LLM will paraphrase it. That’s what LLMs do. They read text and produce new text. Asking an LLM to deliver content unchanged is asking it not to be an LLM.
The paraphrase rate is lowest on the flagship models, highest on the cheap ones, and unpredictable on any of them. It doesn’t matter. Any non-zero rate is too high for something the user paid for.
The fix is architecture, not language
Two moves, both in code. Neither in the prompt.
- Send the content directly. When the cover-letter tool returns, the server sends the full letter to the user over Twilio. The LLM is not involved in delivery.
- Poison the history. The tool result the LLM sees on the next turn isn’t the cover letter. It’s a string that says [ALREADY SENT TO USER, do NOT repeat it]. The model has no way to summarise something it can’t see.
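Both moves fit in one small function. This is a minimal sketch, not the bot’s actual code: `generate_cover_letter` is a stand-in for the real tool, and `send_fn` is a pluggable transport (in my setup it wraps Twilio; here it’s whatever you pass in).

```python
# Sketch of the two moves. generate_cover_letter and send_fn are
# hypothetical stand-ins, not the real implementation.
PLACEHOLDER = "[ALREADY SENT TO USER, do NOT repeat it]"

def generate_cover_letter(job_description: str) -> str:
    # Stand-in for the real tool call that produces the letter.
    return f"Dear Hiring Manager, ... (tailored to: {job_description})"

def run_cover_letter_tool(user_id: str, job_description: str, send_fn) -> str:
    """Run the tool, deliver its output directly, and poison the history."""
    letter = generate_cover_letter(job_description)
    send_fn(user_id, letter)  # move 1: the server delivers the full letter
    return PLACEHOLDER        # move 2: this string is all the LLM ever sees
```

The return value is what goes into the conversation history as the tool result; the letter itself never passes through the model.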
The tool completed. The user got what they paid for. The model’s authority over the conversation continues. The model’s authority over that specific piece of content ended the moment it was generated.
Where this applies
Cover letters are a small case. The same principle covers anything with real stakes.
- Receipts and confirmations. The transaction ID and the exact line items. A one-word rewording loses the audit trail.
- Generated documents. Contracts, invoices, reports, letters. Anything the user will download or forward.
- Billing amounts. £14.99 is £14.99. An LLM that writes “about fifteen quid” in the confirmation just cost you a dispute.
- Refunds and reversals. The tool processed the refund; the amount is final. The LLM shouldn’t be the one reading the number back.
- Anything the user can’t undo. Booking confirmations, submitted applications, sent emails. Once it’s shipped it’s shipped, and the wording is the record.
The rule: if the user would object to a paraphrase, the LLM shouldn’t be the one sending it.
Two roles, one system
In every AI product I’ve shipped, the LLM ends up doing two jobs that look like one from the outside.
The first job is deciding. Which tool to call. Which follow-up to ask. Whether to clarify or just act. That’s the part the model is good at, and it’s the part that makes the product feel alive.
The second job is delivering. Turning tool results into a reply. Writing the confirmation. Explaining what happened. That’s where it starts rewriting things it shouldn’t.
The fix is to separate the roles. Let the LLM decide freely. Constrain what it delivers. Anything with real stakes goes out through a code path the model can’t touch.
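In code, the separation can be as small as one branch at the point where tool results re-enter the conversation. A sketch under assumed names (the tool names and `send_fn` transport are illustrative, not from the real bot):

```python
# Illustrative sketch: high-stakes tool outputs go out through code;
# the model only ever sees a placeholder in its history.
HIGH_STAKES = {"cover_letter", "refund", "receipt", "booking_confirmation"}

def handle_tool_result(tool_name: str, result: str, user_id: str, send_fn) -> str:
    """Return the string the LLM should see as the tool result."""
    if tool_name in HIGH_STAKES:
        send_fn(user_id, result)  # delivered by a code path the model can't touch
        return f"[{tool_name} result ALREADY SENT TO USER, do NOT repeat it]"
    return result  # low stakes: the model is free to phrase the reply itself
```

The model keeps full freedom on the deciding side; the constraint lives entirely on the delivering side, in one place.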
Router, not courier
A courier delivers what you hand them, unchanged. An LLM can’t be a courier. Its whole job is to transform text. The moment you put a document in the LLM’s hands and tell it to deliver, you’ve asked it to do the one thing it was built not to do.
Treat the LLM as a router. Let it decide where things go, when to act, what to say about outcomes. Don’t let it carry the outcomes themselves.
When the stakes are low (a follow-up question, a check-in, a clarification) let the model speak. When the stakes are high (anything the user paid for, or can’t undo) cut it out of the delivery path.
The bug reports stopped the day I drew that line. They haven’t come back.