Yang Li (郦洋)
Ph.D. Student @ ReThinkLab, School of AI, SJTU.
About
I am a Ph.D. student at the School of Artificial Intelligence, Shanghai Jiao Tong University, majoring in Computer Science and Technology. I am fortunate to be advised by Academician Weinan E and Prof. Junchi Yan. Before the Ph.D. program, I received my bachelor’s degree from SJTU and then transferred into the Ph.D. program from the master’s program of the Department of Computer Science and Engineering.
My research focuses on broad machine learning methodologies, especially machine learning for combinatorial optimization and decision making, generative models, and reasoning-oriented large language models. I have published 12 first-author/co-first-author papers at CCF-A top-tier conferences, including 10 papers at NeurIPS, ICML, and ICLR, with Spotlight recognitions at NeurIPS and ICML.
I have led multiple open-source projects on machine learning for discrete optimization, contributed to Huawei’s OptVerse-related technology initiatives, and participated in Alibaba’s RL4LLM framework development. My open-source projects have received over 5,000 stars in total, and the toolkits I led have accumulated over 70k downloads. I also serve as a reviewer for top-tier ML conferences (NeurIPS, ICML, ICLR, etc.) and journals (TPAMI, etc.).
Selected Experiences
- T-Star Talent Program Intern, Alibaba ATH Business Group (June 2025 - Feb. 2026)
  Worked on reinforcement-learning-based post-training for large reasoning models and RL4LLM infrastructure. First/co-first-authored works include Attention Illuminates LLM Reasoning (ICML 2026), FlowTracer (ICML 2026), and Reasoning Palette (CVPR 2026). Contributed to Alibaba’s ALE agent learning ecosystem, ROME agent model, and ROLL Flash asynchronous RL training system.
- Researcher, ReThinkLab, Shanghai Jiao Tong University (July 2021 - Present)
  Developed generative machine learning paradigms and generative combinatorial optimization solvers, including PCL (ICML 2025), T2T (NeurIPS 2023), FastT2T (NeurIPS 2024), GenSCO (NeurIPS 2025), and Unify ML4TSP (ICLR 2025). Led open-source resources including awesome-ml4co, ML4CO-Kit, and ML4TSPBench.
Academic Performance
Undergraduate period:
- GPA: 91.03/100 (or 3.93/4.3), Rank: 3/129 (top 2.3%)
- Foundation Courses: 73.33% at A or above, 40.00% at A+
- Subject Courses: 80.00% at A or above, 50.00% at A+
Postgraduate period:
- GPA: 3.83/4.0
- Rank Reference: one of 3 students out of 211 awarded the Graduate National Scholarship
- Courses: 90.0% at A level
Selected Awards
- NSFC Youth Student Fundamental Research Program - Doctoral Fellowship (the only recipient in the school)
- CAST Young Talent Development Program - Doctoral Fellowship (the only recipient in the school)
- SJTU Pacemaker to Merit Student Award (top 10 university-wide)
- Graduate National Scholarship (top 1% in CS Dept.)
- Undergraduate National Scholarship (top 0.2% in the nation)
- Outstanding Graduate of Shanghai (top 3%)
- NeurIPS 2024 Top Reviewer Award (top 10%)
- Huawei Fellowship (top 3%)
- HyperGryph Fellowship (top 3%)
- 1st-Class Academic Excellence Scholarship (top 1%)
- Merit Student of Shanghai Jiao Tong University
- 1st-Class Academic Scholarship for Graduate Students
- Special Prize for Social Practice of SJTU
- First Prize for Social Practice of SJTU
- Advanced Individual in Social Practice of SJTU
News
| Jun 3, 2026 | Eight papers were accepted by ICML/CVPR/ICLR 2026. |
|---|---|
| Oct 1, 2025 | Four papers were accepted by NeurIPS 2025. |
| May 1, 2025 | Two papers were accepted by ICML 2025. |
| Jan 23, 2025 | Two papers were accepted by ICLR 2025. |
| Sep 26, 2024 | Two papers were accepted by NeurIPS 2024. |
Publications
- ICML
  How Does Reasoning Flow? Tracing Attention-Induced Information Flow for Targeted RL in LLMs
  In International Conference on Machine Learning, 2026
- ICML
  Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization
  In International Conference on Machine Learning, 2026
- CVPR
  Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs
  In Conference on Computer Vision and Pattern Recognition, 2026
- ICLR
  MaskCO: Masked Generation Drives Effective Representation Learning and Exploiting for Combinatorial Optimization
  In International Conference on Learning Representations, 2026
- ICLR
  ConRep4CO: Contrastive Representation Learning of Combinatorial Optimization Instances across Types
  In International Conference on Learning Representations, 2026
- ICLR
  Native Adaptive Solution Expansion for Diffusion-based Combinatorial Optimization
  In International Conference on Learning Representations, 2026
- arXiv
  Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem
  2026
  Core contributor
- arXiv
- NeurIPS
  Bridging Crypto with ML-based Solvers: the SAT Formulation and Benchmarks
  In Advances in Neural Information Processing Systems, 2025
- SCIENTIA SINICA
  Learning to Solve Combinatorial Optimization under Positive Linear Constraints via Non-Autoregressive Neural Networks
  SCIENTIA SINICA Informationis, 2024
