Prompts Are Guidelines. Code Is the Config.
First in a TIL — Today I Learned — series: short field notes on the thing that bit me, or finally clicked, while building. This one's from wiring up AI agents.
A good agent is like a sharp intern who's had three coffees and forty tabs open. Fast, genuinely capable, chews through a pile of context in seconds. And then on the 200th task of the afternoon it quietly skips the identity check, or "forgets" the formatting rule you gave it three times. It didn't misread the instruction. It rolled the dice and came up short, because that's what a probabilistic system does.
Here's the part that trips people up: the agent follows the instruction most of the time. It demos beautifully. Then it whiffs on the one request where it actually mattered. The fix isn't a longer prompt or a bigger model. It's deciding, step by step, whether the model or your code owns the outcome.
I think about it the way I think about a network. Some things you ask people to do. Some things the config enforces whether anyone remembers or not.
1. Judgment decides, code guarantees
Every agent has two layers. Claude gives you judgment — good for ambiguity, language, picking the right tool for a vague request. Usually right. Never guaranteed. Your code gives you determinism — it does exactly what it says, every single time. The whole job is deciding which layer owns each step.
The network version: a prompt is the email you send your operators — "remember to apply the inbound ACL." Most of them, most of the time, will. The config is the route-map and the prefix-list. The packet doesn't move unless it matches. You don't ask the network to be secure. You configure it to be secure.
Where people go wrong is writing a hard requirement — a spending cap, an identity check — into the prompt and calling "usually" good enough. If a step has to happen every time, it belongs in code: a hook, a gate. Not a sentence in the system prompt.
2. Stop reading the model's mind — branch on stop_reason
A common failure is grepping the model's text to decide whether it's done. Don't guess your agent's state from the words it printed. The API hands you stop_reason. Use that.
One catch worth knowing: a single turn can hold both chat text and a tool call — "Sure, let me pull that up…" followed by a JSON tool block. If your loop only watches for text, you cut the agent off before it runs the tool. Branch on stop_reason instead:
| stop_reason | What it means | What you do |
|---|---|---|
tool_use | Model's mid-task; it asked for one or more tools. | Run the tools, append the results, loop. |
end_turn | Model finished on its own. | Stop the loop, return the response. |
max_tokens | Response hit the output limit and got cut off. | Treat as an error. Don't ship a truncated answer as if it's complete. |
3. The "just write a better prompt" trap
When an agent fails a compliance check, the reflex is to strengthen the prompt. A better prompt is real — it might take a 5% failure rate down to 0.5%. It will not take it to zero. And for a hard security, policy, or money rule, 0.5% is just a breach waiting for a busy Tuesday.
Watch the confidence trap too: asking the model to rate its own confidence. Self-reported confidence is one more probabilistic guess, and it skews wrong-high on exactly the cases where the model is already failing.
You don't fix this with a better adjective. You fix it with a lock:
tool_choice— force a specific tool so a step can't be skipped. Setting it to{"type": "tool", "name": "verify_identity"}runs the identity check before anything else can happen.- PreToolUse hooks — code that runs before a tool fires. It can block the call (exit code 2) and hand back a message the model reads to understand why it got rejected.
- PostToolUse hooks — code that runs after a tool, to normalize what comes back (force every timestamp to Unix epoch, say) so the model always sees a consistent shape.
Rule of thumb: if it's money, security, or hard policy, reach for a hook or a tool_choice constraint — not a stronger sentence.
4. The agent doesn't actually remember anything
The API is stateless. It remembers nothing from the last turn. The "memory" your agent seems to have is bookkeeping you do by hand — you resend the whole history every loop.
Each iteration, you append two things:
- The assistant turn — its text and its
tool_useblocks. - The matching
tool_resultblocks.
If the model fires several tools in one turn, every result has to carry the right tool_use_id. That ID is the only thread tying a result back to the request that asked for it. Drop the IDs — or forget to include the model's own tool calls in the history — and the agent loses its place and starts making things up.
5. A valid shape is not a true answer
JSON schemas are great for machine-readable output. They are not a guarantee of correctness. A schema enforces syntax — the data is the right shape. It says nothing about semantics — whether the content is true. The model will happily hand you a perfectly valid invoice where the line items don't add up to the total.
Two things help:
- Nullable fields. If a field is required but the source doesn't have the value, the model will often invent one just to satisfy the schema. Let it return
nulland it'll stop fabricating under pressure. - Structured errors. When a tool fails, don't return
"Error". Return something your code can read —isError: true,isRetryable: true|false— so your orchestration can decide whether to retry a flaky timeout or stop cold on a permission denial, instead of making the model guess.
The whole thing in one question
Reliable agents come down to one question you ask at every step: does this have to be right every single time? If the honest answer is "usually," let the model judge it. If it's "must," pull it out of the prompt and put it in code — a hook, a forced tool, a gate it can't route around.
A prompt is a memo. Code is the config. Decide which one you're betting on before it's the 200th task of the afternoon.