Tokenizer UI for Mistral and Claude

superflows.ai

2 points by henry_pulver 2 years ago · 3 comments

henry_pulverOP 2 years ago

I use the OpenAI tokenizer UI a lot when prompt engineering.

Token count for inputs allows comparison of different data formats (YAML, JSON, TS) and is a crude measure of prompt importance weighting. For outputs it is a relative measure of output speed between prompts (tok/s varies by time of day) and a crude measure of compute used in outputs (why “Think step-by-step” works). Token count also determines the cost of a prompt.
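As a rough illustration of the format comparison and cost points above, here is a minimal Python sketch. Everything in it is an assumption for illustration: `toy_encode` is a crude regex stand-in, not any provider's real tokenizer, and the helper names and the price passed to `prompt_cost` are hypothetical. For real numbers, pass in the `encode` function of the tokenizer for the model you are targeting.

```python
import json
import re
from typing import Callable, Dict, List


def toy_encode(text: str) -> List[str]:
    """Stand-in tokenizer (assumption): words, single punctuation marks,
    and whitespace runs each count as one token. Real BPE tokenizers
    split text differently, so treat these counts as illustrative only."""
    return re.findall(r"\w+|[^\w\s]|\s+", text)


def compare_formats(
    samples: Dict[str, str], encode: Callable[[str], List]
) -> Dict[str, int]:
    """Return the token count of each named sample under `encode`."""
    return {name: len(encode(text)) for name, text in samples.items()}


def prompt_cost(n_tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of a prompt given a per-million-token price (hypothetical)."""
    return n_tokens * usd_per_million_tokens / 1_000_000


# The same record serialized two ways. The YAML is written by hand so the
# example needs nothing beyond the standard library.
record = {"name": "Ada", "langs": ["python", "rust"]}
samples = {
    "json": json.dumps(record, indent=2),
    "yaml": "name: Ada\nlangs:\n  - python\n  - rust\n",
}

counts = compare_formats(samples, toy_encode)
for name, n in counts.items():
    # 2.0 USD per million tokens is a made-up price for the example.
    print(f"{name}: {n} tokens, ~${prompt_cost(n, 2.0):.8f}")
```

Swapping `toy_encode` for a real tokenizer's encode function keeps the rest of the sketch unchanged, which makes it easy to run the same comparison across providers.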

Since there’s no equivalent for other providers, I built one for Mistral & Anthropic. If it’s useful, I can add other providers too - let me know which you’d like.

  • Zambyte 2 years ago

    Thanks for building this. Are the tokens different for the different models? For example, will the Mistral tokenization apply to both the 7B open model and their proprietary API-only models?

    • henry_pulverOP 2 years ago

      Which tokenizers Mistral uses for its proprietary models isn't common knowledge.

      This tokenizer is correct for the 7B open model and the 8x7B MoE model. It'll probably be the closest to the ones their proprietary API-only models use.
