BSD_Evals: Open-source LLM evaluation tool

github.com

1 point by bsdpython 2 years ago · 1 comment

bsdpython (OP) 2 years ago

How do you know which LLM is the best option for your particular use case? I published an open-source repo that evaluates models against your own set of prompts across Anthropic, Google, and OpenAI. Besides model evaluation, it can also be useful for prompt engineering, API response-time benchmarking, and production application monitoring.
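
To make the idea concrete, here is a rough sketch of what "evaluate models on your own prompts and time the responses" can look like. This is not the BSD_Evals API. Every name in it (`fake_model_a`, `fake_model_b`, `evaluate`, `pass_rate`) is hypothetical, and the two "models" are local stand-in functions, so swap in real provider client calls yourself.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for real provider calls (not any vendor's API).
def fake_model_a(prompt: str) -> str:
    return prompt.upper()

def fake_model_b(prompt: str) -> str:
    return prompt[::-1]

def evaluate(
    models: Dict[str, Callable[[str], str]],
    cases: List[Tuple[str, str]],
) -> List[dict]:
    """Run every (prompt, expected) case against every model and time each call."""
    results = []
    for name, call in models.items():
        for prompt, expected in cases:
            start = time.perf_counter()
            output = call(prompt)
            elapsed = time.perf_counter() - start
            results.append({
                "model": name,
                "prompt": prompt,
                "passed": output == expected,  # exact-match scoring, the simplest choice
                "seconds": elapsed,
            })
    return results

def pass_rate(results: List[dict], model: str) -> float:
    """Fraction of cases the given model passed."""
    rows = [r for r in results if r["model"] == model]
    return sum(r["passed"] for r in rows) / len(rows)

models = {"a": fake_model_a, "b": fake_model_b}
cases = [("abc", "ABC"), ("level", "level")]
results = evaluate(models, cases)
```

Exact-match scoring is only an assumption here to keep the sketch short. The per-call `seconds` field is the kind of number you would track for response-time benchmarking or monitoring.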
