GitHub - samwho/llmwalk: Explore the answer-space of open LLMs

llmwalk

Explore the answer-space for any prompt and any MLX-supported model. See https://huggingface.co/mlx-community/models for supported models.

Instead of sampling from the possible tokens each step, llmwalk branches out and completes all of the branches the sampler would consider based on --top-k, --top-p and --temperature, ranking the results by probability as it goes.

The tree is walked prioritising the most likely branches, until it finds -n branches and then it stops. It doesn't enumerate all possibilities, just enough to know for sure it has found the -n most likely branches.

Usage

uvx llmwalk -p "In what year was Barack Obama born?"
uvx llmwalk -p "Write a haiku about compilers" -n 5
uvx llmwalk -p "Give me one word: " --top-k 200 --temperature 0.7

Options

-p, --prompt TEXT: Prompt to score (wrapped with the model’s chat template).
-m, --model MODEL: MLX-LM model identifier or path (default: mlx-community/Llama-3.2-1B-Instruct-4bit), supported models can be found at https://huggingface.co/mlx-community/models
-n N: Number of answers to show. The search stops once it has N finished answers and no unfinished branch can beat the worst of those N.
--min-probability FLOAT: Any branch whose cumulative probability falls below this is marked finished (low_probability) and not expanded further.
--top-k INT: At each step, expand at most k next tokens (highest probability).
--top-p FLOAT: Nucleus cutoff applied within the top-k tokens at each step (keep adding tokens until cumulative probability ≥ p).
--temperature FLOAT: Softmax temperature applied when computing per-step probabilities (1.0 is the model distribution; must be > 0).
--stats-interval SECONDS: How often to refresh the live view (<= 0 disables periodic refresh; still renders at start/end).
--format {csv,json}: Output format for machine-readable output. When specified, disables the interactive display and prints results to stdout when the job completes.

Machine-readable output

Use --format to get structured output for scripting or further processing:

# JSON output
uvx llmwalk -p "What is 2+2?" --format json

# CSV output
uvx llmwalk -p "What is 2+2?" --format csv

JSON output includes detailed token-level information:

[
  {
    "answer": "4",
    "probability": 0.95,
    "finish_reason": "eos_token",
    "tokens": [
      {"token": "4", "probability": 0.95}
    ]
  }
]

CSV output provides a simpler tabular format with columns: answer, probability, finish_reason.