From vibes to data: measuring how LLMs attend to your prompt, layer by layer

github.com

2 points by taylorsatula 12 days ago · 2 comments

taylorsatulaOP 12 days ago

Most prompt engineering is done by changing words, rerunning the model, and squinting at the output. Over the past few weeks I've built this toolkit, which lets you measure what's actually happening inside the model instead.

You define regions of your prompt (instructions, examples, constraints, whatever), run the pipeline on any HuggingFace model, and get back per-layer attention heatmaps, "cooking curves" showing how attention to each region evolves through the network, and logit lens snapshots. Llama, Qwen, Mistral, and Gemma are supported out of the box. The engine is a self-contained script you can scp to a GPU box and run with no dependencies beyond transformers. The repo is designed so that Claude can handle the whole pipeline end to end, including interpreting results in a grounded, domain-specific way.
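To make the idea concrete, here's a minimal sketch of the aggregation that a per-region "cooking curve" implies: given one attention tensor per layer, average the attention the final token pays to each labeled token span. The function name, region labels, and synthetic tensors are all illustrative assumptions, not the repo's actual API.

```python
import torch

def region_curves(attentions, regions):
    """attentions: list of [heads, seq, seq] tensors, one per layer.
    regions: {name: (start, end)} token spans.
    Returns {name: [per-layer mean attention from the final token]}."""
    curves = {name: [] for name in regions}
    for layer_attn in attentions:
        # Attention row for the final query position, averaged over heads.
        last_row = layer_attn[:, -1, :].mean(dim=0)
        for name, (s, e) in regions.items():
            curves[name].append(last_row[s:e].mean().item())
    return curves

# Toy demo with synthetic, softmax-normalized attention (no model download).
torch.manual_seed(0)
heads, seq, layers = 4, 12, 3
attns = [torch.softmax(torch.randn(heads, seq, seq), dim=-1) for _ in range(layers)]
regions = {"instructions": (0, 5), "question": (5, 12)}
curves = region_curves(attns, regions)
print({k: [round(v, 3) for v in vs] for k, vs in curves.items()})
```

With a real model you'd get the `attentions` list from a forward pass with `output_attentions=True` (and `attn_implementation="eager"` on recent transformers versions, since fused attention kernels don't return weights).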

I built it to tune system prompts for another project and realized the general approach was useful enough to extract. The "before and after" comparison tooling ended up being the part I use most.
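The before/after comparison reduces to something like the following: given two cooking curves for the same region, one from each prompt variant, report the per-layer shift. This is a hypothetical sketch with made-up numbers, not the repo's actual comparison tooling.

```python
def curve_delta(before, after):
    """Per-layer change in a region's attention share after a prompt edit.
    Positive values mean the region gained attention at that layer."""
    assert len(before) == len(after), "curves must cover the same layers"
    return [round(a - b, 4) for b, a in zip(before, after)]

# Illustrative curves for a 4-layer toy model.
before = [0.10, 0.18, 0.25, 0.22]
after = [0.14, 0.24, 0.31, 0.30]
print(curve_delta(before, after))
```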
