Understanding reinforcement learning for model training from scratch

medium.com

2 points by rajman187 4 months ago · 1 comment

rajman187 (OP) 4 months ago

An intuitive treatment of RLHF, TRPO, PPO, GRPO, DPO and RLAIF
