Show HN: Empirical – test framework for JavaScript developers building with LLMs

4 points by arjun27 2 years ago · 0 comments · 2 min read

Hi HN!

This is Arjun and Saikat, and like other product engineers, we've been excited to build with LLMs. Having powerful models available as off-the-shelf HTTP endpoints is a huge leap forward for integrating and shipping ML to end users.

While building on top of LLMs, we've also experienced the pain of non-deterministic behavior – especially for applications that require smaller models. Iterating on model configuration while making sure nothing regresses across hundreds of scenarios is a tricky balance.

To make this easier, we built Empirical. Here’s a demo video: https://www.youtube.com/watch?v=p8gSGphcOSU

We've focused on:

- Fast iteration cycles and interactivity when you need to change the prompt or add a new sample. We wanted to build something that feels like “hot reload” for LLM development.

- A capable UI that combines objective and subjective evaluation, since eyeballing outputs makes it easier to build intuition around model behavior.

- Ability to customize which model to test, or how to score it – with JavaScript (or Python, if you really must).

- Embedded analytics for evaluation results, powered by DuckDB under the hood (more coming up on this!)
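As a rough illustration of what scoring in code and checking for regressions can look like, here is a minimal TypeScript sketch. The types and function names (`Sample`, `Scorer`, `containsExpected`, `averageScore`, `hasRegressed`) are hypothetical and are not Empirical's actual API; see the docs for the real interface.

```typescript
// Hypothetical types for illustration only; not Empirical's actual API.
type Sample = { input: string; expected: string };
type Score = { name: string; value: number; message: string };
type Scorer = (sample: Sample, output: string) => Score;

// Objective check: 1 if the model output contains the expected text
// (case-insensitive), otherwise 0.
const containsExpected: Scorer = (sample, output) => {
  const pass = output.toLowerCase().includes(sample.expected.toLowerCase());
  return {
    name: "contains-expected",
    value: pass ? 1 : 0,
    message: pass ? "ok" : `missing "${sample.expected}"`,
  };
};

// Average a scorer over a set of samples and their model outputs.
function averageScore(
  samples: Sample[],
  outputs: string[],
  scorer: Scorer
): number {
  if (samples.length !== outputs.length) {
    throw new Error("samples and outputs must have the same length");
  }
  if (samples.length === 0) return 0;
  let total = 0;
  samples.forEach((sample, i) => {
    total += scorer(sample, outputs[i] as string).value;
  });
  return total / samples.length;
}

// A run regresses if its average score drops below the baseline
// by more than the allowed tolerance.
function hasRegressed(
  baseline: number,
  candidate: number,
  tolerance = 0
): boolean {
  return candidate < baseline - tolerance;
}
```

Comparing `averageScore` for the current model configuration against a stored baseline with `hasRegressed` is one simple way to catch a drop in quality after a prompt or model change.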

You can try Empirical today – with a one-line CLI command – locally or on CI/CD. And oh, Empirical is 100% open source – so file an issue and we’d be happy to make it work for your use case.

$ npx empiricalrun

GitHub: https://github.com/empirical-run/empirical

Docs: https://docs.empirical.run/
