Hot Papers

Hot Paper 1: Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning

Speaker: Lean Wang, PhD Student (Peking University)
Title: Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
Paper Introduction: This paper examines in-context learning from an information-flow perspective, proposing and validating the hypothesis that "label words serve as anchors in in-context learning." Under this hypothesis, in the shallow layers large language models (LLMs) aggregate information from the demonstration texts into the corresponding label words, while in the deeper layers they extract information from these label words to make the final prediction. The paper validates the hypothesis with experiments that measure saliency, block attention, and correlate attention magnitude with classification results. Building on the hypothesis, it proposes three applications: Anchor Re-Weighting, Anchor-Only Context Compression, and Anchor Distances for Error Diagnosis. These applications demonstrate the value of the analysis, improving the performance and efficiency of in-context learning and partially explaining the errors it makes. The paper won the Best Long Paper Award at EMNLP 2023.
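
For a concrete picture of the information-flow measurement, the following is a minimal sketch in Python. It aggregates raw attention flowing from the demonstration text into label-word positions and from label-word positions into the final target position; the paper's actual metric is a saliency score that combines attention with gradients, so raw attention here is only a simplified proxy, and the position lists and toy attention map are illustrative.

```python
# Hedged sketch of the attention-flow quantities discussed above.
# `attn` is a single layer's attention map averaged over heads, shape (seq, seq);
# label_pos / target_pos are illustrative positions of the label words and of
# the position where the model produces its prediction.
import torch

def flow_scores(attn: torch.Tensor, label_pos: list, target_pos: int):
    """Aggregate attention flowing (a) from demonstration text into label
    words and (b) from label words into the final target position."""
    seq_len = attn.size(0)
    text_pos = [i for i in range(seq_len) if i not in label_pos and i != target_pos]

    # (a) text -> label: attention that label-word positions pay to the text.
    text_to_label = attn[label_pos][:, text_pos].sum().item()
    # (b) label -> target: attention the target position pays to label words.
    label_to_target = attn[target_pos, label_pos].sum().item()
    # (c) everything else, usable as a baseline for comparison.
    other = attn.sum().item() - text_to_label - label_to_target
    return text_to_label, label_to_target, other

# Toy demonstration with a random causal (lower-triangular) attention map.
attn = torch.rand(16, 16).tril()
attn = attn / attn.sum(dim=-1, keepdim=True)   # row-normalize like softmax output
print(flow_scores(attn, label_pos=[5, 11], target_pos=15))
```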
Presenter Introduction: Lean Wang, a PhD student at Peking University, Class of 2023. His research interests include the interpretability of large language models, downstream applications, and model architecture improvements.

Hot Paper 2: MiniCPM-Llama3-V2.5: A GPT-4V Level Multimodal LLM on Your Phone

Speaker: Yuan Yao, Postdoctoral Fellow (National University of Singapore)
Title: MiniCPM-Llama3-V2.5: A GPT-4V Level Multimodal LLM on Your Phone
Paper Introduction: The rapid development of multimodal large models has made them a research hotspot in artificial intelligence, but practical deployment still faces many challenges, such as hallucinations, weak multilingual capabilities, limited visual resolution, and high cost. This paper proposes the MiniCPM-V series of efficient edge-side multimodal large models, which focus on usability: leading performance, reliable behavior, multilingual support, high-resolution encoding, and computational efficiency. This report will introduce the core technologies and application results of MiniCPM-V, including: (1) supporting multiple languages through cross-lingual generalization; (2) reducing hallucinations through multimodal preference learning; and (3) improving performance through efficient image encoding that handles high-resolution images at arbitrary resolutions. MiniCPM-Llama3-V2.5 ranked first on HuggingFace Trending for a week and topped GitHub Trending and Papers With Code Trending Research. Since its release in February 2024, the MiniCPM-V series has received over 8,000 stars on GitHub and over 600,000 downloads, with positive feedback from the open-source community.
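
As a rough illustration of what arbitrary-resolution encoding involves, the following is a hedged sketch of an adaptive image-slicing heuristic. It is not the exact MiniCPM-V rule: the encoder resolution, slice budget, and scoring function are assumptions made for the example.

```python
# Hedged sketch: choose a slicing grid for a high-resolution image so that each
# sub-image is close to square and the number of slices matches an area budget.
# ENCODER_RES and MAX_SLICES are illustrative assumptions.
import math

ENCODER_RES = 448   # assumed native input size of the vision encoder
MAX_SLICES = 9      # assumed upper bound on the number of sub-images

def choose_grid(width: int, height: int):
    """Pick a (rows, cols) grid balancing square-ish cells against the budget."""
    ideal = min(MAX_SLICES, max(1, round(width * height / ENCODER_RES ** 2)))
    best, best_err = (1, 1), float("inf")
    for rows in range(1, ideal + 1):
        for cols in range(1, ideal + 1):
            if rows * cols > MAX_SLICES:
                continue
            # Penalize non-square cells and deviation from the slice budget.
            cell_ratio = (width / cols) / (height / rows)
            err = abs(math.log(cell_ratio)) + abs(rows * cols - ideal) / ideal
            if err < best_err:
                best, best_err = (rows, cols), err
    return best

# A 1920x1080 image is split into a 2x4 grid of near-square slices, each of
# which would then be resized to ENCODER_RES x ENCODER_RES for the encoder.
print(choose_grid(1920, 1080))
```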
Presenter Introduction: Yuan Yao, Postdoctoral Fellow at the National University of Singapore. His research interests include multimodal large models and natural language processing.

Hot Paper 3: OneBit: Towards Extremely Low-bit Large Language Models

Speaker: Yuzhuang Xu, PhD Student (Harbin Institute of Technology)
Title: OneBit: Towards Extremely Low-bit Large Language Models
Paper Introduction: Since the advent of ChatGPT, the powerful capabilities of large language models (LLMs) have made a deep impression. However, LLMs demand substantial computational resources, which poses significant challenges for deployment and use. Researchers therefore hope to compress model size or computational load through quantization (representing LLM parameters or intermediate results with low bit-widths), enabling deployment on platforms with ordinary computational resources. Previous research has shown that model weights can be quantized to 2 bits with minimal performance loss. This paper (OneBit) proposes a method for representing model parameters with 1 bit and compresses the model through quantization-aware knowledge distillation, achieving 1-bit quantization of LLM weights for the first time. The paper also discusses the capabilities of ultra-low-bit quantized models. Since its release, OneBit has attracted widespread attention and discussion in academia and industry, and the authors have been invited to present the work at several well-known forums and media outlets. This report will introduce the 1-bit parameter representation method, the capability loss of ultra-low-bit quantized models, and technical insights from the research process.
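
To make the 1-bit representation concrete, the following is a hedged sketch of a sign-value style decomposition: the weight matrix is replaced by a +/-1 sign matrix plus two higher-precision value vectors, and only the +/-1 matrix participates in the large matrix multiplication. This illustrates the representation only, not the full OneBit recipe, which also relies on quantization-aware knowledge distillation during training.

```python
# Hedged sketch of a 1-bit weight decomposition: W ~= sign(W) * (a b^T), where
# a and b are kept in higher precision. Simplified illustration, not the exact
# OneBit algorithm.
import torch

def onebit_decompose(W: torch.Tensor):
    """Return a +/-1 sign matrix and value vectors (a, b) so that
    W ~= sign * (a b^T), using a rank-1 SVD of |W| for the scales."""
    sign = torch.sign(W)
    sign[sign == 0] = 1.0                      # keep entries strictly +/-1
    U, S, Vh = torch.linalg.svd(W.abs(), full_matrices=False)
    a = U[:, 0] * S[0].sqrt()                  # per-output-row scale
    b = Vh[0, :] * S[0].sqrt()                 # per-input-column scale
    return sign, a, b

def onebit_linear(x, sign, a, b):
    """y = ((x * b) @ sign.T) * a -- only the +/-1 matrix enters the big matmul."""
    return ((x * b) @ sign.T) * a

W = torch.randn(64, 128)
x = torch.randn(4, 128)
sign, a, b = onebit_decompose(W)
full = x @ W.T
approx = onebit_linear(x, sign, a, b)
print(torch.nn.functional.cosine_similarity(full.flatten(), approx.flatten(), dim=0))
```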
Presenter Introduction: Yuzhuang Xu, a PhD student at the SCIR Lab, Harbin Institute of Technology, Class of 2024. His research interests include efficient deployment of large language models, intelligent agents based on large language models, and multilingual processing.

Hot Paper 4: Benchmarking Large Language Models in Retrieval-Augmented Generation

Speaker: Jiawei Chen, PhD Student (Institute of Software, Chinese Academy of Sciences)
Title: Benchmarking Large Language Models in Retrieval-Augmented Generation
Paper Introduction: Retrieval-Augmented Generation (RAG) is an effective way to mitigate hallucinations in large language models (LLMs). This paper systematically studies the impact of retrieval augmentation on LLMs and establishes an evaluation framework for their RAG capabilities. It identifies four abilities critical to RAG: noise robustness, negative rejection (declining to answer when the retrieved documents contain no relevant information), information integration, and counterfactual robustness. To this end, the paper constructs the Retrieval-Augmented Generation Benchmark (RGB) for evaluating the RAG capabilities of English and Chinese large language models. RGB is built from up-to-date news corpora and divides its instances into four separate test sets, one per ability. The paper evaluates six representative large models on RGB and analyzes the challenges they face when applying RAG. The results show that while LLMs exhibit a certain degree of noise robustness, they still struggle with negative rejection, information integration, and handling misinformation, indicating that effectively applying retrieval-augmented generation to large models still has a long way to go. The paper was published at AAAI 2024, and the RGB benchmark has been adopted for RAG evaluation by large models such as Alibaba's Tongyi Qianwen.
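
As an illustration of how such an evaluation can be run, the following is a hedged sketch of a noise-robustness check in the spirit of RGB: relevant and noisy documents are mixed at a fixed ratio and the model's answer is scored by whether it contains the gold answer. The `ask_llm` callable, field names, and toy instance are placeholders rather than the benchmark's actual interface.

```python
# Hedged sketch of an RGB-style noise-robustness evaluation loop.
import random

def noise_robustness_eval(instances, ask_llm, noise_ratio=0.34, k=3, seed=0):
    """Mix positive and noisy documents at `noise_ratio`, query the model,
    and score by containment of the gold answer (a simple exact-match proxy)."""
    rng = random.Random(seed)
    correct = 0
    for inst in instances:                         # each: question, answer, docs
        n_noise = int(k * noise_ratio)
        docs = (rng.sample(inst["positive_docs"], k - n_noise)
                + rng.sample(inst["negative_docs"], n_noise))
        rng.shuffle(docs)
        prompt = ("Answer the question based only on the documents.\n"
                  + "\n".join(docs)
                  + f"\nQuestion: {inst['question']}")
        pred = ask_llm(prompt)
        correct += int(inst["answer"].lower() in pred.lower())
    return correct / len(instances)

# Toy usage with a stub "model" that simply echoes its prompt.
toy = [{"question": "Who won the award?",
        "answer": "Alice",
        "positive_docs": ["Alice won the award in 2023.", "The award went to Alice."],
        "negative_docs": ["Bob gave a keynote.", "The venue was in Singapore."]}]
print(noise_robustness_eval(toy, ask_llm=lambda p: p))
```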
Presenter Introduction: Jiawei Chen, a PhD student at the Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences, Class of 2019. His research interests include pre-training, alignment of large language models, and retrieval-augmented generation.

Hot Paper 5: AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

Speaker: Zhiheng Xi, Combined Master's-PhD Student (Fudan University)
Title: AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
Paper Introduction: In this work, we take the first step toward building general large language model (LLM)-based agents capable of self-evolving across diverse environments and tasks. We identify three key elements for this goal: 1) diverse environments for agents to explore and learn in; 2) a set of demonstration trajectories that provide basic abilities and prior knowledge; and 3) an effective and scalable evolution method. Based on these elements, we propose AgentGym, an interactive framework that integrates a variety of environments and tasks for broad, real-time, unified-format, and concurrent agent exploration. AgentGym also comes with a database of expanded instructions, a benchmark suite, and high-quality trajectories (AgentTraj) collected across environments. In addition, we develop AgentEvol, a new method for exploring the self-evolution potential of agents across different environments and tasks.
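
For intuition, the following is a hedged, toy-scale sketch of the explore-then-learn loop that this kind of self-evolution implies: the agent alternates between collecting trajectories across a pool of environments and updating itself on the successful ones. The `Agent` and `Env` classes here are illustrative stubs, not the AgentGym or AgentEvol interfaces.

```python
# Hedged, schematic sketch of an explore-then-learn evolution loop.
import random

class Env:
    """Toy environment: the agent succeeds by emitting the target action."""
    def __init__(self, target):
        self.target = target
    def run(self, action):
        return 1.0 if action == self.target else 0.0

class Agent:
    """Toy agent: samples actions from a preference table it can update."""
    def __init__(self, actions):
        self.prefs = {a: 1.0 for a in actions}
    def act(self, rng):
        r, acc = rng.random() * sum(self.prefs.values()), 0.0
        for a, w in self.prefs.items():
            acc += w
            if r <= acc:
                return a
        return a
    def update(self, trajectories):
        for action, reward in trajectories:
            if reward > 0:
                self.prefs[action] += 1.0          # imitate successful behavior

def evolve(agent, envs, iterations=5, rollouts=20, seed=0):
    rng = random.Random(seed)
    for step in range(iterations):
        # 1) Explore: interact with every environment and record outcomes.
        traj = [(a, env.run(a)) for env in envs
                for a in (agent.act(rng) for _ in range(rollouts))]
        # 2) Learn: update the agent on the collected trajectories.
        agent.update(traj)
        print(f"iteration {step}: avg reward = {sum(r for _, r in traj) / len(traj):.2f}")

evolve(Agent(actions=["search", "click", "answer"]), [Env("answer"), Env("search")])
```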
Presenter Introduction: Zhiheng Xi, a combined Master's-PhD student at the Natural Language Processing Lab, Fudan University, Class of 2022. His research focuses on large model agents, large model reasoning, and language model robustness. He has published several first-author papers at international conferences such as ICML, ACL, and EMNLP.

Hot Paper 6: CCL 2024 Evaluation Hotspot Report

Speaker: Hongye Tan, Professor (Shanxi University)
Title: CCL 2024 Evaluation Hotspot Report
Paper Introduction: The CCL 2024 evaluation track focuses on cutting-edge technologies, practical applications, and social service applications in natural language processing (NLP). It features 10 evaluation tasks covering semantic analysis, intelligent processing of ancient texts, writing instruction, commonsense reasoning, and multimodal understanding, with the aims of advancing NLP technology, fostering academic exchange, validating and disseminating research findings, and cultivating talent. The track attracted wide attention and active participation from industry and academia, with total prize money of 84,000 yuan. A total of 2,197 teams from research institutions and enterprises registered; after intense competition, 43 teams won awards, and 45 high-quality papers were accepted into the proceedings. The successful organization of this evaluation track has significantly enhanced the influence of the China Conference on Computational Linguistics and injected strong momentum into the innovation, application, and international academic exchange of NLP technology in China. We believe such evaluation activities will push NLP research in China to higher levels and broader scopes.
Presenter Introduction: Hongye Tan is a professor and doctoral supervisor at the School of Computer and Information Technology, Shanxi University. She is a member of the Language and Knowledge Computing Committee and the Medical Health and Bioinformatics Processing Committee of the Chinese Information Processing Society of China. Her main research area is natural language processing. She has led three projects funded by the National Natural Science Foundation of China and participated in several national projects, including the New Generation Artificial Intelligence Major Project, the National Key R&D Program, the 863 Program, and key projects of the National Natural Science Foundation of China. As a key contributor, she has co-authored a monograph, received one first prize and one second prize for scientific and technological progress in Shanxi Province as well as a special prize for teaching achievements in Shanxi Province, and has been awarded the title of Teaching Master by the Shanxi Provincial Professors Association.