Why does no one measure prompt velocity?
Been messing with LangSmith playground recently. Pains me to say but it’s actually been a pretty good experience so far.
That's awesome. I've been seeing quite a bit of chat about it on X too. Seems like they've hit the mark with playground. What are you using it for specifically?
Super interesting how it makes "decisions", but it's nice that they let you tie user feedback directly into LLM refinement; otherwise it would be hard to make that info useful.
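For anyone curious, here's roughly how the feedback wiring looks with the Python client, at least as far as I understand the current beta (run_id and the "user_rating" key are placeholders I made up):

    from langsmith import Client

    client = Client()  # assumes LANGCHAIN_API_KEY is set in the environment

    run_id = "..."  # id of the traced run the user reacted to (placeholder)

    # Attach the user's rating to that run so it shows up next to the trace
    # and can be pulled back out later when refining prompts.
    client.create_feedback(
        run_id,
        key="user_rating",  # arbitrary feedback key, my name not theirs
        score=1,            # e.g. 1 = thumbs-up, 0 = thumbs-down
        comment="helpful answer",
    )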
What do you mean by decisions?
I'm curious about LangSmith's 'dynamic datasets'. How does it ensure data integrity, especially when rapidly iterating on AI models?
From the docs it looks like they're fairly explicit about respecting env state for each dataset. I'm not sure how or where contamination would even occur, to be honest, regardless of the model used.
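For what it's worth, the dataset API itself is pretty explicit too: you create a named dataset and append examples to it, so nothing ends up in it unless you write it there. A rough sketch of how I read the docs (the dataset name and examples are made up):

    from langsmith import Client

    client = Client()

    # Each dataset is its own named collection; examples only land in it
    # when you explicitly add them, which is why accidental cross-contamination
    # between iterations seems hard to pull off.
    dataset = client.create_dataset("support-bot-eval-v2")  # made-up name

    client.create_example(
        inputs={"question": "How do I reset my password?"},
        outputs={"answer": "Use the 'Forgot password' link on the login page."},
        dataset_id=dataset.id,
    )

    # Iterating on the model only reads from the dataset; it doesn't mutate it.
    for example in client.list_examples(dataset_id=dataset.id):
        print(example.inputs, example.outputs)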
Love the concept of prompt velocity, although it doesn't capture the initial quality of the prompt or whether the changes are effective.
Any thoughts on how they could improve here? Seems like that would be challenging.
What is prompt velocity? It's not mentioned in the blog post.
What’s LangSmith's revenue model?
I have no idea. They're still in beta, so I'd guess they're figuring it out as they go. Charging per token or per trace seems most likely, though.