Test your interpretability techniques by de-censoring Chinese models — LessWrong


