ag8

About

runrl.com

1. Gourmand Syndrome (en.wikipedia.org) 27 points · 5 months ago · 9 comments
2. guys why does armenian completely break Claude (twitter.com) 99 points · 6 months ago · 65 comments
3. Sampling at negative temperature (cavendishlabs.org) 203 points · 6 months ago · 60 comments
4. Perfectly Replicating Coca Cola [video] (youtube.com) 1 point · 6 months ago · 1 comment
5. Po.ta.to (po.ta.to) 4 points · 8 months ago · 2 comments
6. Scaling pretraining affects RL sample efficiency (runrl.com) 1 point · 8 months ago · 0 comments
7. Systematically generating tests that would have caught Anthropic's top‑K bug (theorem.dev) 2 points · 9 months ago · 0 comments
8. Tinker (2b4fdb18.connectionism.pages.dev) 4 points · 9 months ago · 2 comments
9. Training Qwen to answer briefly yet intelligently using feedback control (runrl.com) 4 points · 9 months ago · 0 comments
10. Launch HN: RunRL (YC X25) – Reinforcement learning as a service (runrl.com) 71 points · 9 months ago · 22 comments