zone411

Karma: 4,238
Created: 15 years ago

About

https://twitter.com/LechMazur

10 LLM benchmarks: https://github.com/lechmazur/

https://www.linkedin.com/in/lech-mazur-69b70493/

Advameg (City-data.com) founder and CEO. AI startup founder.

Author: AI melody songwriting assistant https://melodies.ai

Author: Accurate COVID-19 county-by-county neural net case prediction model based on most data.

Recent Submissions

  1. LLM Position Bias Benchmark: Swapped-Order Pairwise Judging (github.com)
  2. Show HN: Buyout Game Benchmark: Multi-Agent Bargaining, Transfers, and Takeovers (github.com)
  3. LLM Persuasion Benchmark: Multi-Turn Persuasion Between Models (github.com)
  4. Show HN: LLM Debate Benchmark (github.com)
  5. Show HN: LLM Sycophancy Benchmark: Opposite-Narrator Contradictions (github.com)
  6. Show HN: LLM Round‑Trip Translation Benchmark (github.com)
  7. Show HN: LLM Creative Story‑Writing Benchmark V3 (github.com)
  8. Show HN: Mapping LLM Style and Range in Flash Fiction (github.com)
  9. Pact: Head-to-head negotiation benchmark for LLMs (github.com)
