Hot Papers

Hot Papers Session Hosts

Host: Hongyu Lin
Affiliation: Institute of Software, Chinese Academy of Sciences
Bio: Hongyu Lin is an associate professor at the Institute of Software, Chinese Academy of Sciences. His research focuses on the knowledge mechanisms and post-training of large language models. In recent years, he has published over 70 papers in top-tier international journals and conferences in natural language processing and artificial intelligence, such as ACL, NeurIPS, ICLR, and AIJ. He has led and participated in numerous national and ministerial-level projects, including National Natural Science Foundation of China grants (General and Youth Programs as well as key projects) and the Strategic Priority Research Program (Class A) of the Chinese Academy of Sciences, along with several industry-academia collaborations such as the CCF-Baidu Songguo Fund and the Tencent WeChat Rhino-Bird Fund. His honors include an ACL 2024 Area Chair Award, the EDBT 2025 Best Paper Runner-Up award, the Special Prize of the President's Award of the Chinese Academy of Sciences, the CIPS Outstanding Doctoral Dissertation Award, and the First Prize of the Qian Weichang Chinese Information Processing Science and Technology Award from the Chinese Information Processing Society of China.

Host: Zhuosheng Zhang
Affiliation: Shanghai Jiao Tong University
Bio: Zhuosheng Zhang is a tenure-track assistant professor and doctoral supervisor at Shanghai Jiao Tong University. His research interests include natural language processing, large model agents, and security. His honors include the CIPS Outstanding Doctoral Dissertation Award, the ACM SIGAI Outstanding Doctoral Dissertation Award, selection to the Top 100 Chinese Young Scholars in AI, and the World Artificial Intelligence Conference (WAIC) SAIL Star Award. He has published 80 papers in top-tier journals and conferences such as TPAMI, CSUR, ICLR, ICML, ACL, and AAAI, with 7,800 citations on Google Scholar and over 20,000 GitHub stars for his open-source projects. He received the 2024 WAIC Outstanding Youth Paper Award, and three of his papers (ICLR/AAAI) were selected as Paper Digest Most Influential Papers. He serves on the CIPS Youth Working Committee and the CIPS Large Model and Generation Technical Committee, as an action editor for ACL Rolling Review, as a (senior) area chair for top international conferences such as ACL, EMNLP, and NAACL, and as a sub-committee co-chair for CCL 2022 and CCL 2024.

Host: Wenqiang Lei
Affiliation: Sichuan University
Bio: Wenqiang Lei is an Assistant Dean, Professor, and doctoral supervisor at the College of Computer Science, Sichuan University. He is a recipient of a national-level young talent program and received his Ph.D. from the National University of Singapore. His research focuses on natural language processing, information retrieval, and human-computer interaction systems. He has led projects such as the National Key R&D Program of China and the National Natural Science Foundation of China (General Program). He has published dozens of CCF-A long papers as the first or corresponding author, several of which have been cited over a hundred times within two years of publication. He received the ACM MM 2020 Best Paper Award and an ACL 2024 Area Chair Award, and has given multiple tutorials at top international conferences such as ACL and SIGIR. He has served as a (senior) program committee member for major international conferences such as ACL, KDD, AAAI, IJCAI, WSDM, and EMNLP, was the Program Committee Chair of the Singapore Symposium on Natural Language Processing (SSNLP 2021), and is a guest editor for the journal ACM Transactions on the Web.

Hot Papers Session

Jiasheng Si: CHECKWHY: Causal Fact Verification via Argument Structure
Speaker: Jiasheng Si
Affiliation: Qilu University of Technology (Shandong Academy of Sciences)
Abstract: Fact-checking is an effective means of combating the spread of misinformation on social networks. Existing fact-checking datasets mostly focus on verifying atomic semantics. However, a single event propagating on a social network tends to trigger derivative events, and the causal relationships across these events have received little attention. This paper therefore proposes a new task of causal fact verification. By introducing macro argument structures from logic, we explicitly represent the reasoning process over multi-hop evidence and construct a corresponding dataset, CheckWhy (now open-sourced), through a human-LLM collaborative approach. Experiments on the four proposed tasks show that large models still have substantial room for improvement in generating human-understandable reasoning processes.
Bio: Jiasheng Si is a lecturer and master's supervisor at Qilu University of Technology (Shandong Academy of Sciences), and a Young Taishan Scholar of Shandong Province. He received his Ph.D. from the PALM Lab at Southeast University. His research interests include misinformation detection, argumentation mining, and medical large models, with a focus on exploring the feasibility of large models for social tasks. He has published over 20 papers in top conferences such as ACL, KDD, and AAAI, and received an ACL 2024 Outstanding Paper Award and a Senior Area Chair Award. He has led and participated in several national and provincial-level projects and was involved in the development of the Biancang Traditional Chinese Medicine large model. He currently serves as an Area Chair for ACL Rolling Review (ARR) and is a member of the secretariat of the CIPS Youth Working Committee.

Le Yu: Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
Speaker: Le Yu
Affiliation: Alibaba, Qwen Team
Abstract: In this work, we analyze the characteristics of the chain-of-thought of large language models in reinforcement learning and find that only a small number of tokens exhibit high entropy. These tokens primarily act as "forks," determining the logical direction of the chain-of-thought. A large number of tokens exhibit low entropy, mainly completing the reasoning within the direction set by the high-entropy tokens. Reinforcement learning largely preserves the entropy characteristics of the base model (i.e., which tokens require high/low entropy) and primarily changes the entropy of the high-entropy tokens, demonstrating their importance in RL. We further trained on only the 20% high-entropy tokens while discarding the 80% low-entropy tokens. The results show that training only on high-entropy tokens significantly improves the reasoning ability and training stability of large language models, with the improvement being more pronounced for larger models, showing a certain scaling property. On the 32B model, we achieved scores of 63.5 and 56.7 on AIME'24 and AIME'25 respectively, which is the current state-of-the-art (SOTA) for models under 600B trained with reinforcement learning from a base model. Extending the maximum response length from 20k to 28k can achieve 68.1 on AIME'24. Additionally, we conducted extensive ablation studies and discuss possible explanations for some cutting-edge issues from the perspective of token entropy.
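To make the token-selection idea concrete, below is a minimal, hypothetical sketch (not the Qwen team's implementation) of how per-token entropy can be computed from policy logits and used to restrict a policy-gradient loss to the top 20% highest-entropy tokens; all function and variable names are illustrative only.

```python
import torch
import torch.nn.functional as F

def entropy_masked_pg_loss(logits, actions, advantages, keep_ratio=0.2):
    """Policy-gradient loss restricted to the highest-entropy tokens.

    logits:     [batch, seq_len, vocab] policy logits for generated tokens
    actions:    [batch, seq_len] sampled token ids (long dtype)
    advantages: [batch, seq_len] per-token advantage estimates
    """
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()

    # Entropy of the policy distribution at each position ("fork" detector).
    token_entropy = -(probs * log_probs).sum(dim=-1)  # [batch, seq_len]

    # Keep only the top `keep_ratio` fraction of tokens by entropy.
    k = max(1, int(keep_ratio * token_entropy.numel()))
    threshold = torch.topk(token_entropy.flatten(), k).values.min()
    mask = (token_entropy >= threshold).float()

    # REINFORCE-style objective applied only to the kept (high-entropy) tokens.
    action_log_probs = log_probs.gather(-1, actions.unsqueeze(-1)).squeeze(-1)
    loss = -(mask * advantages * action_log_probs).sum() / mask.sum().clamp(min=1)
    return loss
```

In a full RL pipeline the entropy threshold would typically be computed per batch or per rollout and combined with the usual clipping terms of the chosen RL algorithm; the sketch only shows where an entropy mask enters the objective.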
Bio: Le Yu is a Senior Algorithm Engineer at Alibaba's Qwen team. He graduated from Beihang University. His research focuses on alignment for large language models. He has published over 20 papers in top international conferences and journals, with more than 4,000 citations. He is currently engaged in post-training work for the Qwen model, including fine-tuning, distillation, and reinforcement learning.

Binghai Wang: Modeling World Preference: The Scaling Law in Preference Modeling (Reward Modeling)
Speaker: Binghai Wang
Affiliation: Fudan University
Abstract: Inspired by the Scaling Laws of language modeling, we discovered that preference modeling (reward modeling) also follows a Scaling Law. We verified this law through large-scale experiments on models ranging from 1.5B to 72B parameters, using 15M preference data points collected from public forums. Specifically, as the training scale and model size increase, the preference modeling loss (BT loss) decreases logarithmically. The continuous scaling of preference modeling suggests that seemingly diverse human preferences may have a unified and transcendent representation. We propose "Modeling World Preference" to emphasize this possibility of unification. The scalability of preference modeling seems unexpected, on one hand, because human preference datasets are very noisy, and on the other, because the modeling objective of the BT loss is overly simple and sparse. In this talk, we will share why preference modeling, or reward modeling in general, is scalable.
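For reference, the BT (Bradley-Terry) loss mentioned above reduces to a pairwise logistic loss over the reward model's scores for the preferred and dispreferred responses of each pair. A minimal sketch, assuming scalar reward scores per response (illustrative, not the authors' training code):

```python
import torch
import torch.nn.functional as F

def bradley_terry_loss(chosen_rewards, rejected_rewards):
    """Pairwise Bradley-Terry loss: -log sigmoid(r_chosen - r_rejected),
    averaged over a batch of preference pairs."""
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Example: reward-model scores for three preference pairs.
chosen = torch.tensor([1.2, 0.3, 2.0])    # scores of preferred responses
rejected = torch.tensor([0.5, 0.7, 1.1])  # scores of dispreferred responses
print(bradley_terry_loss(chosen, rejected))
```

The scaling behavior described in the abstract tracks how this loss decreases as the preference data and model size grow.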
Bio: Binghai Wang is a second-year Ph.D. student at Fudan University, advised by Professor Xuanjing Huang. His research interest is in large model alignment.

Zhiwei He: DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning
Speaker: Zhiwei He
Affiliation: Shanghai Jiao Tong University
Abstract: Although Large Language Models (LLMs) have shown great potential in solving complex reasoning problems through Reinforcement Learning (RL), their progress is limited by the scarcity of open-source, high-quality training data. To address this challenge, we have constructed and released a large-scale mathematics dataset, DeepMath-103K. This dataset has three core features:

1. Highly Challenging: The problems are primarily of high difficulty levels (5-9), designed to push the boundaries of model capabilities.

2. Rigorously Decontaminated: Through a careful decontamination process, we removed overlaps with numerous standard evaluation benchmarks to ensure reliable evaluation.

3. Verifiable Answers: All problems come with deterministic answers that can be used for rule-based reinforcement learning rewards (a minimal sketch of such a reward check follows this abstract).

Furthermore, DeepMath-103K innovatively draws content from more diverse sources like Math StackExchange, enhancing the novelty of the data. Experiments show that models trained on this dataset not only set new state-of-the-art (SOTA) records on several authoritative math benchmarks but also successfully generalize their reasoning abilities to scientific domains such as biology, physics, and chemistry. We have open-sourced the dataset, code, and models to promote the development of the community.
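As an illustration of the "Verifiable Answers" property, a rule-based reward for such data can be as simple as extracting the model's final answer and comparing it with the reference answer. The sketch below is hypothetical (the dataset's actual verification may use stricter normalization or symbolic equivalence checks), and the helper names are illustrative.

```python
import re

def extract_final_answer(response: str) -> str:
    """Return the content of the last \\boxed{...} in a response,
    falling back to the last non-empty line if no boxed answer exists."""
    boxed = re.findall(r"\\boxed\{([^{}]*)\}", response)
    if boxed:
        return boxed[-1].strip()
    lines = [line for line in response.strip().splitlines() if line.strip()]
    return lines[-1].strip() if lines else ""

def rule_based_reward(response: str, reference_answer: str) -> float:
    """Binary reward: 1.0 if the extracted answer matches the reference
    after removing whitespace, else 0.0."""
    prediction = extract_final_answer(response)
    normalize = lambda s: re.sub(r"\s+", "", s)
    return 1.0 if normalize(prediction) == normalize(reference_answer) else 0.0

# Example with a hypothetical model response.
print(rule_based_reward("Thus the result is \\boxed{42}.", "42"))  # 1.0
```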
Bio: Zhiwei He is a fourth-year Ph.D. student at Shanghai Jiao Tong University, advised by Associate Professor Rui Wang. His main research area is large model reasoning. He has published several papers in conferences such as ACL, ICLR, and ICML, with over 1,400 citations on Google Scholar.

Xinnong Zhang: SocioVerse: A Social Simulation World Model Driven by a Pool of Ten Million Real Users and Language Model Agents
Speaker: Xinnong Zhang
Affiliation: Fudan University
Abstract: Social simulation is transforming traditional social science research by simulating interactions between virtual individuals and their environment. With the rapid development of Large Language Models (LLMs), this approach shows increasing potential in portraying individual differences and predicting group behaviors. However, existing methods still face consistency challenges in environmental settings, user goals, interaction mechanisms, and behavioral patterns. To address this, we propose SocioVerse, a social simulation world model driven by LLM agents. The framework includes four powerful alignment modules and constructs a user pool of ten million real users. We conducted large-scale simulation experiments in three domains: politics, news, and economics. The results show that SocioVerse can effectively reflect the dynamics of large populations while ensuring diversity, credibility, and representativeness with standardized processes and minimal manual intervention.
Bio: Xinnong Zhang is a second-year Ph.D. student at Fudan University, advised by Associate Professor Zhongyu Wei. His main research interests are computational social science and agent-based social simulation driven by language models. He has published several related works in natural language processing conferences such as ACL, EMNLP, and NAACL.

Yanxing Huang: AIM: An AI Mathematician Agent System
Speaker: Yanxing Huang
Affiliation: Tsinghua University
Abstract: The mathematical capabilities of language models have advanced rapidly with the popularization and development of reasoning model technologies. Their performance on many cutting-edge mathematical problems is now approaching that of top graduate students in mathematics. Consequently, using language model agent systems to automate mathematical research and exploration is gradually becoming feasible. In this work, we have made preliminary explorations in this direction and have achieved promising results in our experiments.
Bio: Yanxing Huang is an incoming first-year Ph.D. student at Tsinghua University, advised by Professor Yang Liu. His main research area is AI4Math, dedicated to exploring the potential of using AI technology to assist in automated mathematical research. He has already produced several works in this field.

Xinyuan Zhu: Hierarchical progressive learning for zero-shot peptide-HLA binding prediction and automated antigenic peptide design
Speaker: Xinyuan Zhu
Affiliation: University of Science and Technology of China
Abstract: High-affinity binding of peptides to Human Leukocyte Antigen (HLA) molecules is key to initiating an adaptive immune response. Therefore, accurately predicting peptide-HLA (pHLA) binding affinity has significant theoretical and practical value for cancer immunotherapy and understanding the mechanisms of autoimmune diseases. Existing methods struggle to accurately predict peptide binding for zero-shot HLA types, limiting the application of current tools to a broader population, especially in precision medicine scenarios where individual HLA diversity must be considered. We propose an innovative Hierarchical Progressive Learning (HPL) framework. Through a multi-level, progressive learning strategy utilizing protein language models, this framework effectively captures and leverages shared and specific sequence patterns and binding specificities among different HLA types, significantly improving prediction performance.
Bio: Xinyuan Zhu is a Ph.D. student at the University of Science and Technology of China, advised by Professor Xiangnan He and Professor Fuli Feng. His main research area is AI for Life Science, including protein language models and their applications, and protein structure prediction.

Chen Huang: How to Enable Effective Cooperation Between Humans and LLM: A Survey of Principles, Formalizations, and Beyond
Speaker: Chen Huang
Affiliation: National University of Singapore
Abstract: With the advancement of large models, LLMs have evolved from mere tools into autonomous agents with their own goals and strategies, capable of collaborating with humans. This evolution has given rise to a new paradigm in natural language processing: human-LLM cooperation. In recent years, human-LLM cooperation has made significant progress on numerous NLP tasks. This presentation provides a comprehensive review of human-LLM cooperation, covering its principles, formalizations, and open challenges, with the aim of paving the way for further groundbreaking research in this area.
Bio: Chen Huang is a postdoctoral fellow at the National University of Singapore. He will receive his Ph.D. in Computer Science from Sichuan University in 2025. His main research interests are conversational AI and large models. He has published multiple academic papers in top international conferences such as ACL, WWW, AAAI, EMNLP, FSE, COLING, NAACL, and ICDM.