Jiarui's Homepage

Email: jiarui14 [AT] illinois [DOT] edu

I am a first-year CS PhD student in the Siebel School of Computing and Data Science at the University of Illinois Urbana-Champaign (UIUC), where I am very fortunate to be supervised by Prof. Tong Zhang.

I obtained a Bachelor of Engineering (B.Eng.) degree from Yao Class, Tsinghua University.

My current research interests focus on reinforcement learning and large language models, especially autonomous agent learning, as well as related interdisciplinary fields.

News

May 08, 2025 We released GVM - Gradient Variance Minimization, a framework that improves data sampling efficiency in LLM math reasoning. Starting from rejection sampling, we generalize our pipeline to RL algorithms such as GRPO and present a corresponding theoretical analysis of our algorithm.
Apr 15, 2025 We wrote a report analyzing what makes GRPO “stand out” for math reasoning, with insights and ablation studies comparing different algorithms for LLM reasoning training.
Mar 05, 2025 We released FANS - Formal Answer Selection for Natural Language Math Reasoning Using Lean4, which enhances test-time math answer selection using formal language.
Aug 21, 2024 I started my PhD journey in the CS School of UIUC. ✨ 😄

Selected Publications

  1. preprint
    Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
    Jiarui Yao, Yifan Hao, Hanning Zhang, Hanze Dong, Wei Xiong, Nan Jiang, and Tong Zhang
    arXiv preprint arXiv:2505.02391, 2025
  2. preprint
    A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce
    Wei Xiong, Jiarui Yao, Yuhui Xu, Bo Pang, Lei Wang, Doyen Sahoo, Junnan Li, Nan Jiang, Tong Zhang, Caiming Xiong, and others
    arXiv preprint arXiv:2504.11343, 2025
  3. preprint
    FANS – Formal Answer Selection for Natural Language Math Reasoning Using Lean4
    Jiarui Yao, Ruida Wang, and Tong Zhang
    arXiv preprint arXiv:2503.03238, 2025

Latest Posts