Yang Li (郦洋)

Ph.D. Student @ ReThinkLab, School of AI, SJTU.

Mail: yanglily@sjtu.edu.cn

City: Shanghai, 200240

About

I am a Ph.D. student at the School of Artificial Intelligence, Shanghai Jiao Tong University, majoring in Computer Science and Technology. I am fortunate to be advised by Academician Weinan E and Prof. Junchi Yan. Before the Ph.D. program, I received my bachelor’s degree from SJTU and then transferred into the Ph.D. program from the master’s program of the Department of Computer Science and Engineering.

My research focuses on broad machine learning methodologies, especially machine learning for combinatorial optimization and decision making, generative models, and reasoning-oriented large language models. I have published 12 first-author/co-first-author papers at CCF-A top-tier conferences, including 10 papers at NeurIPS, ICML, and ICLR, with Spotlight recognitions at NeurIPS and ICML.

I have led multiple open-source projects on machine learning for discrete optimization, contributed to Huawei’s OptVerse-related technology initiatives, and participated in Alibaba’s RL4LLM framework development. My open-source projects have received over 5,000 stars in total, and the toolkits I led have accumulated over 70k downloads. I also serve as a reviewer for top-tier ML conferences (NeurIPS, ICML, ICLR, etc.) and journals (TPAMI, etc.).

Selected Experiences

  • T-Star Talent Program Intern, Alibaba ATH Business Group (June 2025 - Feb. 2026)
    Worked on reinforcement-learning-based post-training for large reasoning models and RL4LLM infrastructure. First/co-first-authored works include Attention Illuminates LLM Reasoning (ICML 2026), FlowTracer (ICML 2026), and Reasoning Palette (CVPR 2026). Contributed to Alibaba’s ALE agent learning ecosystem, ROME agent model, and ROLL Flash asynchronous RL training system.

  • Researcher, ReThinkLab, Shanghai Jiao Tong University (July 2021 - Present)
    Developed generative machine learning paradigms and generative combinatorial optimization solvers, including PCL (ICML 2025), T2T (NeurIPS 2023), FastT2T (NeurIPS 2024), GenSCO (NeurIPS 2025), and Unify ML4TSP (ICLR 2025). Led open-source resources including awesome-ml4co, ML4CO-Kit, and ML4TSPBench.

Academic Performance

Undergraduate period:

  • GPA: 91.03/100 (or 3.93/4.3), Rank: 3/129 (top 2.3%)
  • Foundation Courses: 73.33% above A, 40.00% above A+
  • Subject Courses: 80.00% above A, 50.00% above A+

Postgraduate period:

  • GPA: 3.83/4.0
  • Rank Reference: one of 3 recipients of the Graduate National Scholarship among 211 students
  • Courses: 90.0% at A level

Selected Awards

  • NSFC Youth Student Fundamental Research Program - Doctoral Fellowship (the only recipient in the school)
  • CAST Young Talent Development Program - Doctoral Fellowship (the only recipient in the school)
  • SJTU Pacemaker to Merit Student Award (top 10 university-wide)
  • Graduate National Scholarship (top 1% in CS Dept.)
  • Undergraduate National Scholarship (top 0.2% in the nation)
  • Outstanding Graduate of Shanghai (top 3%)
  • NeurIPS 2024 Top Reviewer Award (top 10%)
  • Huawei Fellowship (top 3%)
  • HyperGryph Fellowship (top 3%)
  • 1st-Class Academic Excellence Scholarship (top 1%)
  • Merit Student of Shanghai Jiao Tong University
  • 1st-Class Academic Scholarship for Graduate Students
  • Special Prize for Social Practice of SJTU
  • First Prize for Social Practice of SJTU
  • Advanced Individuals in Social Practice of SJTU

News

Jun 3, 2026 Eight papers were accepted by ICML/CVPR/ICLR 2026.
Oct 1, 2025 Four papers were accepted by NeurIPS 2025.
May 1, 2025 Two papers were accepted by ICML 2025.
Jan 23, 2025 Two papers were accepted by ICLR 2025.
Sep 26, 2024 Two papers were accepted by NeurIPS 2024.

Publications

  1. ICML
    How Does Reasoning Flow? Tracing Attention-Induced Information Flow for Targeted RL in LLMs
    Zhichen Dong*, Yang Li*, Yuhan Sun, and 7 more authors
    In International Conference on Machine Learning, 2026
  2. ICML
    Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization
    Yang Li, Zhichen Dong, Yuhan Sun, and 9 more authors
    In International Conference on Machine Learning, 2026
  3. CVPR
    Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs
    Rujiao Long*, Yang Li*, Xingyao Zhang, and 7 more authors
    In Conference on Computer Vision and Pattern Recognition, 2026
  4. ICLR
    MaskCO: Masked Generation Drives Effective Representation Learning and Exploiting for Combinatorial Optimization
    Lvda Chen*, Yang Li*, and Junchi Yan
    In International Conference on Learning Representations, 2026
  5. ICLR
    ConRep4CO: Contrastive Representation Learning of Combinatorial Optimization Instances across Types
    Ziao Guo, Yang Li, Shiyue Wang, and 1 more author
    In International Conference on Learning Representations, 2026
  6. ICLR
    Native Adaptive Solution Expansion for Diffusion-based Combinatorial Optimization
    Yu Wang, Yang Li, Jiale Ma, and 2 more authors
    In International Conference on Learning Representations, 2026
  7. arXiv
    Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem
    ROCK and ROLL and IFLOW and DT Joint Team
    2026
    Core contributor
  8. arXiv
    Part II: ROLL Flash–Accelerating RLVR and Agentic Training with Asynchrony
    Han Lu, Zichen Liu, Shaopan Xiong, and 19 more authors
    2025
  9. NeurIPS
    Bridging Crypto with ML-based Solvers: the SAT Formulation and Benchmarks
    Xinhao Zheng, Xinhao Song, Bolin Qiu, and 3 more authors
    In Advances in Neural Information Processing Systems, 2025
  10. NeurIPS
    Generation as Search Operator for Test-Time Scaling of Diffusion-based Combinatorial Optimization
    Yang Li, Lvda Chen, Haonan Wang, and 2 more authors
    In Advances in Neural Information Processing Systems, 2025
  11. NeurIPS
    StruDiCO: Structured Denoising Diffusion with Gradient-free Inference-stage Boosting for Memory and Time Efficient Combinatorial Optimization
    Yu Wang, Yang Li, Junchi Yan, and 1 more author
    In Advances in Neural Information Processing Systems, 2025
  12. NeurIPS
    ML4CO-Bench-101: Benchmark Machine Learning for Classic Combinatorial Problems on Graphs
    Jiale Ma, Wenzheng Pan, Yang Li, and 1 more author
    In The Thirty-ninth Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2025
  13. ICML
    Generative Modeling Reinvents Supervised Learning: Label Repurposing with Predictive Consistency Learning
    Yang Li, Jiale Ma, Yebin Yang, and 3 more authors
    In International Conference on Machine Learning, 2025
  14. ICML
    COExpander: Adaptive Solution Expansion for Combinatorial Optimization
    Jiale Ma, Wenzheng Pan, Yang Li, and 1 more author
    In International Conference on Machine Learning, 2025
  15. ICLR
    Unify ML4TSP: Drawing Methodological Principles for TSP and Beyond from Streamlined Design Space of Learning and Search
    Yang Li, Jiale Ma, Wenzheng Pan, and 4 more authors
    In International Conference on Learning Representations, 2025
  16. ICLR
    UniCO: On Unified Combinatorial Optimization via Problem Reduction to Matrix-Encoded General TSP
    Wenzheng Pan, Hao Xiong, Jiale Ma, and 3 more authors
    In International Conference on Learning Representations, 2025
  17. NeurIPS
    Fast T2T: Optimization Consistency Speeds Up Diffusion-Based Training-to-Testing Solving for Combinatorial Optimization
    Yang Li*, Jinpei Guo*, Runzhong Wang, and 2 more authors
    In Advances in Neural Information Processing Systems, 2024
  18. NeurIPS
    Learning Plaintext-Ciphertext Cryptographic Problems via ANF-based SAT Instance Representation
    Xinhao Zheng, Yang Li, Cunxin Fan, and 3 more authors
    In Advances in Neural Information Processing Systems, 2024
  19. NeurIPS
    Benchmarking PtO and PnO Methods in the Predictive Combinatorial Optimization Regime
    Haoyu Geng, Hang Ruan, Runzhong Wang, and 4 more authors
    In Advances in Neural Information Processing Systems, 2024
  20. SCIENTIA SINICA
    Learning to Solve Combinatorial Optimization under Positive Linear Constraints via Non-Autoregressive Neural Networks
    Runzhong Wang, Yang Li, Junchi Yan, and 1 more author
    In SCIENTIA SINICA Informationis, 2024
  21. ICML Spotlight (3.5%)
    ACM-MILP: Adaptive Constraint Modification via Grouping and Selection for Hardness-Preserving MILP Instance Generation
    Ziao Guo, Yang Li, Chang Liu, and 2 more authors
    In The Forty-first International Conference on Machine Learning, 2024
  22. ICLR
    MixSATGEN: Learning Graph Mixing for SAT Instance Generation
    Xinyan Chen*, Yang Li*, Runzhong Wang, and 1 more author
    In The Twelfth International Conference on Learning Representations, 2024
  23. arXiv
    Machine Learning Insides OptVerse AI Solver: Design Principles and Applications
    Xijun Li, Fangzhou Zhu, Hui-Ling Zhen, and 23 more authors
    arXiv preprint, 2024
  24. arXiv
    Molecule Generation for Drug Design: a Graph Learning Perspective
    Nianzu Yang, Huaijin Wu, Kaipeng Zeng, and 2 more authors
    In Fundamental Research, 2024
  25. NeurIPS
    T2T: From Distribution Learning in Training to Gradient Search in Testing for Combinatorial Optimization
    Yang Li, Jinpei Guo, Runzhong Wang, and 1 more author
    In Advances in Neural Information Processing Systems, 2023
  26. SIGKDD
    HardSATGEN: Understanding the Difficulty of Hard SAT Formula Generation and A Strong Structure-Hardness-Aware Baseline
    Yang Li, Xinyan Chen, Wenxuan Guo, and 6 more authors
    In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023
  27. IJCAI
    IID-GAN: an IID Sampling Perspective for Regularizing Mode Collapse
    Yang Li*, Liangliang Shi*, and Junchi Yan
    In Proceedings of the 32nd International Joint Conference on Artificial Intelligence, 2023
  28. NeurIPS Spotlight (5%)
    Improving Generative Adversarial Networks via Adversarial Learning in Latent Space
    Yang Li, Yichuan Mo, Liangliang Shi, and 1 more author
    In Advances in Neural Information Processing Systems, 2022