I kept writing agent instructions that started with NEVER.
NEVER skip table-driven tests. Do NOT ignore benchmark coverage. FORBIDDEN to use global state. My skill documents in the claude-code-toolkit were full of these. Every time I hit a failure mode, I’d add another prohibition. The documents grew. The output quality didn’t.
So I ran blind A/B tests on Go code generation. One variant led with a dedicated constraints section; the other wove each constraint into the workflow step it governed. Multiple complexity levels. Reviewers scored outputs without knowing which variant produced what.
Workflow-first won consistently. Across simple, medium, and complex prompts, reviewers scored the inline version higher on testing depth, Go idioms, and benchmark coverage.
What I Changed
The constraints didn’t disappear. They moved.
Instead of a standalone section at the top listing everything the model shouldn’t do, each constraint sits inside the workflow step where it matters. And each one has a reason attached.
Not “NEVER skip table-driven tests” but “use table-driven tests here because they make adding cases trivial.” The difference is the model gets the reasoning. It can apply that reasoning to situations I didn’t think of when I wrote the skill.
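If you haven’t written one, a table-driven test keeps its cases in a slice, so adding coverage is a one-row change. A minimal sketch of the pattern — the Abs function here is a made-up stand-in, not something from the toolkit:

```go
package mathx

import "testing"

// Abs is a stand-in for whatever function the skill is actually testing.
func Abs(x int) int {
	if x < 0 {
		return -x
	}
	return x
}

func TestAbs(t *testing.T) {
	// Each case is one row; adding coverage means adding a row.
	tests := []struct {
		name string
		in   int
		want int
	}{
		{"positive", 3, 3},
		{"negative", -3, 3},
		{"zero", 0, 0},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := Abs(tt.in); got != tt.want {
				t.Errorf("Abs(%d) = %d, want %d", tt.in, got, tt.want)
			}
		})
	}
}
```

That “one row per case” property is the reasoning the inline constraint hands the model.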
I pulled out a lot of structural overhead in the process. Operator Context sections. Standalone Anti-Patterns sections. Anti-Rationalization tables. Capabilities & Limitations boilerplate. All of it separated constraints from the workflow steps they were supposed to govern. All of it gone.
The Linter
I built joy-check to catch this stuff automatically.
In instruction mode, it pre-filters with regex first. Catches obvious patterns like “NEVER” and “do NOT” without burning LLM tokens. Then it runs semantic analysis on what’s left, scoring each section 0-100 and suggesting action-based rewrites. The --fix flag applies them automatically. There’s a --strict mode that fails on any section scoring below 60. I use that during skill creation.
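The pre-filter stage doesn’t need a model at all. Here’s a rough sketch of what that split might look like in Go — the patterns and function names are illustrative, not joy-check’s actual code:

```go
package main

import (
	"fmt"
	"regexp"
)

// prohibitionPatterns flags the obvious negative framings before any
// LLM call; (?i) makes them case-insensitive so "Never" and "NEVER" both match.
var prohibitionPatterns = []*regexp.Regexp{
	regexp.MustCompile(`(?i)\bnever\b`),
	regexp.MustCompile(`(?i)\bdo not\b|\bdon't\b`),
	regexp.MustCompile(`(?i)\bforbidden\b`),
}

// preFilter splits a document's lines into those the regexes already
// flag and those that still need semantic (LLM) analysis.
func preFilter(lines []string) (flagged, remaining []string) {
	for _, line := range lines {
		matched := false
		for _, p := range prohibitionPatterns {
			if p.MatchString(line) {
				matched = true
				break
			}
		}
		if matched {
			flagged = append(flagged, line)
		} else {
			remaining = append(remaining, line)
		}
	}
	return flagged, remaining
}

func main() {
	doc := []string{
		"NEVER skip table-driven tests.",
		"Use table-driven tests here because they make adding cases trivial.",
	}
	flagged, remaining := preFilter(doc)
	fmt.Println(flagged)   // only the prohibition is caught cheaply
	fmt.Println(remaining) // everything else goes to semantic analysis
}
```

The cheap pass handles the obvious offenders; only the ambiguous sections spend tokens.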
It runs automatically during skill authoring and agent upgrades. Before the document is ever used to drive a model, it’s been checked.
Joy-check started as a writing tool, actually. The writing mode scans for a different set of patterns: defensive disclaimers, paragraphs that are individually mild but accumulate into a prosecution brief, passive-aggressive factuality where the author presents facts in evidence order and says “I’ll let you draw your own conclusions.” I applied the same thinking to instruction documents and it stuck.
How Inline Constraints Generalize
When an instruction says “NEVER skip table-driven tests,” the model builds its task model around table-driven-test-skipping. The prohibition is the anchor. The failure mode becomes the central concept.
When the instruction says “use table-driven tests here because they make adding cases trivial,” the model anchors on the success path. And the reasoning, “because they make adding cases trivial,” gives it something to generalize from. If the model encounters a situation where another testing pattern would also make adding cases trivial, it can apply the same logic. Attaching reasoning lets the model generalize constraints to situations the skill author didn’t anticipate.
A prohibition just says stop. It doesn’t transfer anywhere.
Where This Probably Falls Apart
I started testing this on Go code generation, then expanded to other skill domains. The pattern held.
I can construct arguments for when explicit prohibition language might be necessary. Safety-critical constraints where the failure mode is genuinely what matters. Maybe “NEVER commit credentials” is the right framing because the success path isn’t the point; the failure is catastrophic and specific.
All of my testing has been on Anthropic models. I don’t know if other models react the same way to framing. They might. They might not.
I also don’t have a clean mechanistic explanation for why this works. LLMs trained on human text might just respond to the same framing cues humans do. Positive framing makes human writing more actionable. Maybe it makes AI instructions more actionable for similar reasons.
What I do know is that every agent instruction I write now leads with the action. The tests said something and I’m going with the tests.