CCL 2019 Invited Talks




Speaker: Xiaodong He (Executive Deputy Director of the JD AI Research Institute; IEEE Fellow)





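Bio: Dr. Xiaodong He is Executive Deputy Director of the JD AI Research Institute and head of its Deep Learning, Speech and Language Lab. He also holds adjunct and honorary professorships at the University of Washington (Seattle), The Chinese University of Hong Kong (Shenzhen), Tongji University, and the Central Academy of Fine Arts. Before joining JD Group, he was Principal Researcher and head of the Deep Learning Technology Center at Microsoft Research, Redmond. His research focuses on artificial intelligence, including deep learning, natural language processing, speech recognition, computer vision, information retrieval, and multimodal intelligence. He and his collaborators have published more than 100 papers in these areas, with over 13,000 citations according to Google Scholar, and have won multiple best paper awards and major AI competitions. The Deep Structured Semantic Models (DSSM/C-DSSM), Hierarchical Attention Networks (HAN), CaptionBot, SAN, AttnGAN, and BUTD Attention, which he invented with collaborators, are widely used in language, vision, IR, and human-machine dialogue tasks. He was elected an IEEE Fellow in 2018 for his contributions to natural language and vision technologies and to multimodal information processing.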







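Abstract: Understanding and using data is an urgent need in the development of information technology. Data visualization gives people an important means, and an efficient human-computer interface, for gaining insight into data and understanding the patterns it contains. It is an effective complement to methods such as data analysis and data mining, and in some important settings it plays an irreplaceable role. This talk presents visualization works and systems for Song-dynasty cultural data (Song ci poetry, chronological biographies of Song figures, and Harvard University's CBDB database), and introduces, from the three perspectives of technology, design, and culture, the group's works and software for visualization and visual analytics of Tang poetry and Song ci cultural big data. The systems let humanities researchers analyze and compare historical figures by their life trajectories and by the background of the eras in which they lived, and explore how cultural works from different eras and different life experiences relate to and differ from one another in their textual themes. Through text visualization and association analysis of textual imagery, the systems support a multidimensional view of the figures' lives, their relationships, and their historical context.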




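Speaker: Xiaodong Zhang (Nick) (author of 《人工智能简史》, "A Brief History of Artificial Intelligence"; Chairman of the Wuzhen Think Tank (乌镇智库))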












Abstract: Recent advances in deep learning have delivered state-of-the-art performance in medical image analysis, including lesion segmentation and classification. However, deep neural networks (DNNs) require a large amount of training data with high-quality annotation, which is unavailable or expensive to obtain in the field of medical imaging. Moreover, black-box deep learning algorithms lack interpretability, which limits their application in medical diagnostics.
In this talk, a series of our studies will be introduced, including 1) a data-efficient approach that integrates domain knowledge as a strong prior into a deep learning framework for the lesion segmentation task, and 2) a boosting strategy for generating representative data samples for efficient training, in which the training dataset is actively expanded and updated based on the current state of the DNN.

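Bio: Fellow of the Royal Academy of Engineering (UK), Member of the Academia Europaea, Professor at Imperial College London, and Vice-President of Hong Kong Baptist University. He leads the Data Mining Research Group and the Discovery Sciences Group at Imperial College, and is Chief Scientist of the London e-Science Centre, Chairman of the Board and CEO of InforSense Ltd. (UK), and Visiting Professor and Chief Scientist at the Shanghai Center for Bioinformation Technology. Professor Guo's research in cloud computing, data mining, and bioinformatics is among the leading work worldwide. He received his bachelor's degree in 1985 and his master's degree in 1986, both from the Department of Computer Science at Tsinghua University, and his Ph.D. in 1993 from the Department of Computing at Imperial College, where he became a professor after five years on staff. Professor Yike Guo is an IV-VENTURE Visiting Professor in the Department of Computer Science at Tsinghua University, was among the first group selected for Shanghai's Thousand Talents Program, and is one of the first members of the China Computer Federation (CCF) Big Data Expert Committee.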






Title: How to Write a History Book?

Abstract: Understanding events and communicating about them are fundamental human activities. However, it is much more difficult to remember event-related information than entity-related information. For example, most people in China will be able to answer the question "Which city is Tsinghua University located in?", but very few people can give a complete answer to "Who died in the 1937 Nanjing Massacre?". Human-written history books are often incomplete and highly biased because "History is written by the victors". For example, the history textbooks used in Japanese schools barely mention details about the Nanjing Massacre, and even then contain a lot of false information. In this talk I will describe an ambitious ongoing project on automatic history book generation. We propose a new research direction on event-centric knowledge base construction from multimedia multilingual sources. Our minds represent events at various levels of granularity and abstraction, which allows us to quickly access and reason about old and new scenarios. Progress in natural language understanding and computer vision has helped automate some parts of event understanding, but the current, first-generation automated event understanding is overly simplistic since it is local, sequential and flat. Real events are hierarchical and probabilistic. Understanding them requires knowledge in the form of a repository of abstracted event schemas (complex event templates), understanding the progress of time, using background knowledge, and performing global inference. Our approach to second-generation event understanding builds on an incidental supervision approach to inducing an event schema repository that is probabilistic, hierarchically organized and semantically coherent. Low-level primitive components of event schemas are abundant, and can be part of multiple, sparsely occurring, higher-level schemas.
Consequently, we combine bottom-up data-driven approaches across multiple modalities with top-down consolidation of information extracted from a smaller number of encyclopedic resources. This facilitates inducing higher-level event representations that analysts can interact with, and allows them to guide further reasoning and extract events by constructing a novel structured cross-media common semantic space. When complex events unfold in an emergent and dynamic manner, the multimedia multilingual digital data from traditional news media and social media often convey conflicting information. To understand the many facets of such complex, dynamic situations, we have also developed cross-media cross-document event coreference resolution methods for information verification and disinformation detection. We then extract event-event relations and apply knowledge-driven natural language generation techniques to write event-centric history book chapters.

Bio: Heng Ji is a professor in the Computer Science Department of the University of Illinois at Urbana-Champaign. She received her B.A. and M.A. in Computational Linguistics from Tsinghua University, and her M.S. and Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially Information Extraction and Knowledge Base Population. She was selected as a "Young Scientist" and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017. The awards she has received include the "AI's 10 to Watch" Award from IEEE Intelligent Systems in 2013 and the NSF CAREER award in 2009. She has coordinated the NIST TAC Knowledge Base Population task since 2010. She is an associate editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing, and has served as Program Committee Co-Chair of many conferences, including NAACL-HLT 2018.

CCL 2019 Invited Talks: Slides (PPT) Download List

No.   Talk Title                                                Speaker       PPT Download
1     Advances in Language and Vision Multimodal Intelligence   Xiaodong He   语言与视觉多模态智能的进展.pdf
2     Data-Efficient Machine Learning                           Yike Guo      数据高效性机器学习.pdf