llama + spec: MTP Support by am17an · Pull Request #22673 · ggml-org/llama.cpp


@github-actions Bot added the model (Model specific), testing (Everything test related), Nvidia GPU (Issues specific to Nvidia GPUs), Vulkan (Issues specific to the Vulkan backend), examples, python (python script changes), server, and ggml (changes relating to the ggml tensor library for machine learning) labels on May 4, 2026

@ngxson

Currently, speculative decoding has to restart from a checkpoint whenever some draft tokens are not accepted, which wastes work by running the target again. This PR adds the ability to roll back up to `draft_max` tokens by storing the GDN intermediates.
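
To make the mechanism concrete, here is a minimal C++ sketch of checkpoint-based rollback during speculative drafting. All names here (`gdn_state`, `spec_ctx`, `store_checkpoint`, `rollback`) are illustrative assumptions, not the PR's actual llama.cpp API.

```cpp
// Minimal sketch: snapshot recurrent-state intermediates per drafted token so a
// partial acceptance can restore state instead of re-running the target model.
#include <cstddef>
#include <vector>

struct gdn_state {
    std::vector<float> data; // GDN intermediates after processing one token
};

struct spec_ctx {
    size_t draft_max = 8;                // rollback window (number of draft tokens)
    std::vector<gdn_state> checkpoints;  // checkpoints[i] = state after draft token i
};

// Called once per drafted token: snapshot the current GDN intermediates,
// keeping at most draft_max snapshots.
void store_checkpoint(spec_ctx & ctx, const gdn_state & st) {
    if (ctx.checkpoints.size() == ctx.draft_max) {
        ctx.checkpoints.erase(ctx.checkpoints.begin());
    }
    ctx.checkpoints.push_back(st);
}

// Called after verification: the target accepted n_accepted draft tokens
// (1 <= n_accepted <= checkpoints.size()). Restore the state recorded after
// the last accepted token, then drop the now-invalid snapshots.
gdn_state rollback(spec_ctx & ctx, size_t n_accepted) {
    gdn_state st = ctx.checkpoints.at(n_accepted - 1);
    ctx.checkpoints.resize(n_accepted);
    return st;
}
```

The point of the sketch is the trade: a partial acceptance restores a stored snapshot directly rather than recomputing it by re-running the target from the previous checkpoint, at the cost of keeping up to `draft_max` intermediate states in memory.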


basnijholt added a commit to basnijholt/dotfiles that referenced this pull request on May 7, 2026
Update comin, home-manager, and nixpkgs through nix flake update.

The newer nixpkgs ollama package already patches the OpenClaw launch tests to use store paths for coreutils. Our pc-specific 0.23.1 override was still applying the old duplicate OpenClaw substitutions, so `patchPhase` failed once the first replacement had already removed `/usr/bin/env` from `openclaw_test.go`.

Drop the duplicate OpenClaw substitutions and keep the Pi launch-test substitutions that are still needed by this override. Also leave a TODO near the llama.cpp pin to revisit Gemma 4 MTP support once ggml-org/llama.cpp#22673 lands upstream.

Verified with `git diff --check -- flake.lock hosts/pc/package-overrides.nix` and `nix build .#nixosConfigurations.pc.config.system.build.toplevel --no-link`.