Raftpp_release

Created on April 15, 2025

We wrote a report analyzing what makes GRPO “stand out” for math reasoning. It includes our analysis and ablation studies comparing different algorithms for training LLMs to reason.