Gradient Descent on Token Input Embeddings
lesswrong.com
Does performing gradient descent on token input embeddings lead to interpretable results? And if not, why?