Show HN: Empirical – test framework for JavaScript developers building with LLMs
Hi HN!
This is Arjun and Saikat, and like other product engineers, we've been excited to build with LLMs. Having powerful models available as off-the-shelf HTTP endpoints is a huge leap forward for integrating and shipping ML to end users.
While building on top of LLMs, we've also experienced the pain of non-deterministic behavior – especially for applications that require smaller models. Iterating on model configuration while making sure nothing regresses across hundreds of scenarios is a tricky balance.
To make this easier, we built Empirical. Here’s a demo video: https://www.youtube.com/watch?v=p8gSGphcOSU
We've focused on:
- Fast iteration cycles and interactivity when you need to change the prompt or add a new sample. We wanted to build something that feels like “hot reload” for LLM development
- A capable UI that combines objective and subjective evaluation, since eyeballing outputs makes it easier to build intuition around model behavior
- Ability to customize which model to test, or how to score it, with JavaScript (or Python, if you really must)
- Embedded analytics for evaluation results, powered by DuckDB under the hood (more coming up on this!)
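To make the "how to score it" idea concrete, here is a minimal sketch in plain JavaScript. It is not Empirical's actual API: the function names (`exactMatchScorer`, `evaluate`) and the sample shape (`id`, `output`, `expected`) are hypothetical and only illustrate scoring model outputs across a set of scenarios to spot regressions. See the docs for the real interface.

```javascript
// Hypothetical sketch: names and data shapes are illustrative, not Empirical's API.

// A scorer compares a model output to the expected value and returns 0 or 1.
function exactMatchScorer({ output, expected }) {
  const normalize = (s) => s.trim().toLowerCase();
  return normalize(output) === normalize(expected) ? 1 : 0;
}

// Apply a scorer to every sample; report the mean score and the failing sample ids.
function evaluate(samples, scorer) {
  const scores = samples.map((sample) => ({ id: sample.id, score: scorer(sample) }));
  const mean = scores.reduce((sum, s) => sum + s.score, 0) / scores.length;
  const failures = scores.filter((s) => s.score < 1).map((s) => s.id);
  return { mean, failures };
}

// Made-up samples standing in for model outputs collected during a run.
const samples = [
  { id: "s1", output: " Paris ", expected: "paris" },
  { id: "s2", output: "Lyon", expected: "Paris" },
];

const result = evaluate(samples, exactMatchScorer);
console.log(result);
```

Comparing `mean` and `failures` between two runs (say, before and after a prompt change) is one simple way to notice a regression; a real setup would likely use richer scorers than exact match.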
You can try Empirical today – with a one-line CLI command – locally or on CI/CD. And oh, Empirical is 100% open source – so file an issue and we’d be happy to make it work for your use case.
$ npx empiricalrun
GitHub: https://github.com/empirical-run/empirical
Docs: https://docs.empirical.run/