Open-sourcing circuit tracing tools

anthropic.com

161 points by jlaneve 10 months ago · 20 comments

rob-olmos 10 months ago

Anthropic employees Sholto Douglas & Trenton Bricken recently did an interview with Dwarkesh Patel; pieces here and there were about the circuit tracing insights.

https://www.dwarkesh.com/p/sholto-trenton-2 -- search the transcript for "circuit" for the quick bits.

Eg, "If you look at the circuit, you can see that it's not actually doing any of the math, it's paying attention to that you think the answer's four and then it's reasoning backwards about how it can manipulate the intermediate computation to give you an answer of four."

https://transformer-circuits.pub/

Tostino 10 months ago

This type of stuff is really important in my opinion. Getting this type of stuff open sourced allows academics and other researchers to try and do this type of interpretability research on a more level playing field.

I think the more people looking at this the better. I have a feeling there will be some breakthroughs in identifying important circuits and being able to make more efficient model architectures that are bootstrapped from some identified primitives.

sanex 10 months ago

The conversation about this on Dwarkesh was interesting and I'm glad we're getting access to the tool.

https://open.spotify.com/episode/3H46XEWBlUeTY1c1mHolqh?si=L...

jexp 10 months ago

Imported the graph json into Neo4j

Have fun

https://gist.github.com/jexp/8d991d1e543c5a576a3f1ee70132ce7...

ofou 10 months ago

Is this Garcon [1], or a new tool?

[1]: https://transformer-circuits.pub/2021/garcon/index.html

Eduard 10 months ago

thought this was about PCB tracing and was disappointed.

qtwhat 10 months ago

Curious whether saying "thank you" makes the model more activated and results in a better answer. ^^
