Issues with the code - NFHN Reader

Here's a detailed analysis by GPT 5.3 Codex of the problems with the benchmark. It read the original paper and ran real tests, so this isn't just a hallucination.

1. Benchmark invalid under claimed flags

The README states results used -ffast-math (README.md:20). The benchmark warms up and then times repeated runs on the same graph/source (src/benchmark.c:166, src/benchmark.c:184).

ssp_duan is stateful across calls and only partially resets via dirty_d (src/dmmsy_opt.c:201, src/dmmsy_opt.c:209), relying on WEIGHT_MAX = inf checks (src/dmmsy_opt.c:108, include/common.h:10). Under -ffast-math, I consistently observed stale-state reuse after warmup, causing near no-op timed trials.

I observed and reproduced both regimes locally:

-O3 -flto (no fast-math): ~parity at 1M/5M (Dijkstra 180.5 ms, DMMSY Opt 178.1 ms).
-O3 -flto -ffast-math: absurd speedups (Dijkstra 180.8 ms, DMMSY Opt 0.0002 ms, >900,000×).

To test correctness, I wrote a custom cross-source check (separate from the repo's built-in test harness): warm up ssp_duan on source 0 several times, then switch to source 1 and compare its distance array against a reference Dijkstra run on source 1. Under -ffast-math, this produced 3,068 differing distances; without -ffast-math, there were no mismatches. The repo's own correctness checks (src/benchmark.c:76-77, src/benchmark.c:103-104) only compare at source 0 and do not exercise the warmup-then-different-source scenario, and the timing loop passes NULL outputs without re-verifying correctness during benchmarking.

2. Implementation is not faithful to the paper

This code does not implement Algorithm 3 (BMSSP) as described in the paper's top-level recursion. With S = {s} and len_src = 1 (src/dmmsy_opt.c:224), it enters its base branch immediately via the condition len_src <= p.k (src/dmmsy_opt.c:70) and behaves like a full Dijkstra-style relaxation. The paper's base case is a mini-Dijkstra that stops after finding k + 1 vertices; this code's base case runs full SSSP.

This is not the BMSSP algorithm from the paper — it is a different, stateful heuristic implementation.

3. Benchmark fairness issues even without fast-math

Baseline Dijkstra allocates/frees its heap every call (src/dijkstra.c:18, src/dijkstra.c:48), while ssp_duan reuses thread-local workspace (src/dmmsy_opt.c:18), giving a structural benchmarking bias.

Verdict

Benchmark validity: No (for the headline claim).
"20,000× seems impossible": Correct; it is an artifact.
Good implementation of the paper: No, not a faithful implementation of the published BMSSP framework.

1. Benchmark invalid under claimed flags

2. Implementation is not faithful to the paper

3. Benchmark fairness issues even without fast-math

Verdict

Sources