Title: Language and the Computer, a personal journey (ppt download)
Language in its spoken form was invented over 50,000 years ago. Around that time, modern humans left their African homeland and colonized the entire planet. Writing was invented some 6,000 years ago, while computers were invented less than 100 years ago. As a student of language, I became aware of how the computer can help my research in the 1960s, when the computer industry was in its infancy. I will recount my various uses of the computer in linguistics over the past half century: in building databases, in exploring theories of language change, in recognizing speech, in translating between languages, and currently in analyzing brain waves to detect cognitive impairment. I will also briefly compare computer intelligence with human intelligence from a personal perspective.
Professor Wang was appointed Full Professor at the University of California at Berkeley in 1966, and taught there for thirty years before returning to China. He taught at the City University of Hong Kong and at the Chinese University of Hong Kong, both appointments in the Department of Electronic Engineering, before joining The Hong Kong Polytechnic University in 2015 as Chair Professor of Language and Cognitive Sciences in the Department of Chinese and Bilingual Studies.
He was elected Inaugural President of the International Association of Chinese Linguistics, and Academician of Academia Sinica of Taiwan. He is Honorary Professor at several universities, including Peking University, Beijing Language and Culture University, and the Chinese University of Hong Kong. His other honors include fellowships from the Fulbright Commission in Washington, DC, and the Guggenheim Foundation in New York City; residencies at Bellagio, Italy, Hyderabad, India, and Kyoto, Japan; and two fellowships at the Center for Advanced Study at Stanford, USA.
Professor Wang has lectured widely: at several national universities in Sweden, at Pavia in Italy, at Osmania in India, at Doshisha University in Japan, at the Collège de France in Paris, and at the Institute for Advanced Study in Princeton. His publications have appeared in Chinese, English, French, German, Italian, and Japanese.
He is a strong believer in multidisciplinary research, having co-authored important papers with anthropologists, computer scientists, geneticists, mathematicians, and psychologists. His central interests are in language and the brain, the critical roles these play in the evolution of our species as well as in our daily lives.
Title: An Overview of Spoken Language Understanding (ppt download)
With recent advances in machine learning, big data, and computing infrastructure, computers will realistically reach human parity in understanding spoken language in the next few years. The computing industry’s progress in spoken language processing is apparent in Microsoft products and services, including Cortana, Skype Translator, and Project Oxford cloud services. Dr. Xuedong Huang will use these examples to illustrate society’s collective efforts in spoken language processing and to enumerate the remaining challenges to reaching human parity, an accomplishment that will have a profound impact on society.
Dr. Xuedong Huang is a Distinguished Engineer/Chief Scientist of Speech R&D at Microsoft Corporation, where he heads the Advanced Technology Group within Microsoft’s Technology and Research organization. Dr. Huang joined Microsoft in 1993 to found the company's speech recognition efforts. As the head of Microsoft's spoken language initiatives for more than a decade, he provided technical, engineering, and business leadership to bring speech recognition to the mass market. His accomplishments include introducing the first speech API for the Windows operating system in 1995, and helping to ship the enterprise-grade Speech Server 2004. Before assuming his current role at Microsoft, Dr. Huang spent five years as the chief architect working to improve the web search relevance of the company’s Bing search engine.
Title: A Study of Machine Reading – Solving Math Word Problems in a Human Comprehensive Way (ppt download)
Since Big Data mainly explores correlations among surface features rather than their underlying causal relationships, the Big Mechanism program was initiated by DARPA (USA) to find the “why” behind the “Big Data”. Its prerequisite, however, is that the machine can read each document and learn the knowledge it contains, which is the task of Machine Reading (MR). As a domain-independent MR system is complicated and difficult to build, the Math Word Problem (MWP) is frequently chosen as the first task for studying MR, for the following reasons: (1) Since the answer to an MWP cannot be directly extracted from the given text, solving MWPs explicitly demonstrates a system's capability for understanding and reasoning. (2) MWPs usually have less complicated syntax and require less domain knowledge, letting researchers focus on understanding and reasoning rather than on building a wide-coverage grammar and acquiring domain knowledge. (3) The body of an MWP (which states the information given for solving the problem) usually consists of only a few sentences, so the understanding and reasoning procedure can be checked more efficiently. (4) An MWP solver has standalone applications of its own (such as a computer tutor); it is not just a toy test case. For these reasons, we too chose MWP as the first task for studying the MR problem.
In this talk, a proposed tag-based statistical framework for solving math word problems (with understanding and reasoning) will be introduced. Under this framework, each sentence in the MWP (both body text and question text) is first transformed into its corresponding Semantic Representation (SR) tree by a Language Analysis module. The sequence of SR trees is then sent to the Solution Type Classifier to determine the associated solution type. Afterwards, the Logic Form Converter (LFC) maps each SR tree into its corresponding math concepts and operations according to the assigned solution type, and represents them as First-Order Logic (FOL) predicates/functions. Subsequently, according to pre-specified inference rules, the Inference Engine (IE) derives new facts from those provided by the LFC and finds the answer. Finally, the Explanation Generation module explains, in natural language, how the answer was obtained from the reasoning chain generated by the IE.
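As a concrete, much-simplified illustration of this pipeline, the sketch below wires the stages together for one-step addition/subtraction problems. All function names and the keyword heuristics are invented for illustration; the actual framework uses full semantic parsing, a statistical solution-type classifier, and a genuine FOL inference engine, whereas here the SR "trees" are reduced to flat quantity facts.

```python
import re

def language_analysis(sentence):
    """Toy stand-in for the Language Analysis module: instead of building
    a full SR tree, just extract the quantities mentioned."""
    return [int(n) for n in re.findall(r'\d+', sentence)]

def solution_type_classifier(body_sentences):
    """Toy stand-in for the Solution Type Classifier: a keyword heuristic
    in place of the statistical model described in the talk."""
    text = ' '.join(body_sentences).lower()
    if any(w in text for w in ('gives', 'gets', 'buys', 'more')):
        return 'Addition'
    if any(w in text for w in ('eats', 'loses', 'fewer')):
        return 'Subtraction'
    return 'Unknown'

def logic_form_converter(quantities, solution_type):
    """Represent each quantity as an FOL-style fact, e.g. quan(q0, 3)."""
    return [('quan', f'q{i}', q) for i, q in enumerate(quantities)]

def inference_engine(facts, solution_type):
    """Derive the answer from the facts via the solution-type rule."""
    values = [v for (_, _, v) in facts]
    if solution_type == 'Addition':
        return sum(values)
    if solution_type == 'Subtraction':
        return values[0] - sum(values[1:])
    raise ValueError('unsupported solution type')

def explanation_generation(facts, solution_type, answer):
    """Verbalize the reasoning chain as a simple arithmetic expression."""
    op = ' + ' if solution_type == 'Addition' else ' - '
    expr = op.join(str(v) for (_, _, v) in facts)
    return f'{expr} = {answer}'

def solve(body_sentences, question):
    # In the full framework the question text is parsed too; this sketch
    # only uses the body quantities.
    quantities = [q for s in body_sentences for q in language_analysis(s)]
    stype = solution_type_classifier(body_sentences)
    facts = logic_form_converter(quantities, stype)
    answer = inference_engine(facts, stype)
    return answer, explanation_generation(facts, stype, answer)
```

For example, `solve(["Tom has 3 apples.", "Mary gives him 2 more apples."], "How many apples does Tom have now?")` returns the answer 5 together with the explanation string "3 + 2 = 5", showing how an answer and its justification both fall out of the derived facts rather than out of surface pattern matching.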
Since the answer is obtained via understanding and inference, the proposed approach is able to: (1) provide the answer more precisely, (2) make the system less sensitive to irrelevant information, and (3) offer the flexibility to handle various possible questions. Furthermore, the proposed approach can automatically learn patterns/parameters from the training set through weakly supervised learning with a proposed statistical model.
Dr. Su received his B.S. from National Tsing-Hua University, Taiwan, and his Ph.D. in Electrical Engineering from the University of Washington, Seattle, in 1984, and then taught at his alma mater in Taiwan. In 1985, he launched an English-Chinese machine translation project, and in 1988 he founded Behavior Design Corporation to commercialize it. From 1989 to 1990, he was a visiting scientist at AT&T Bell Laboratories, NJ, working on speech recognition. Dr. Su left National Tsing-Hua University, where he was a professor, to join Behavior Design Corporation in 1998, and directed the company until May 2014. He has been a Research Fellow at the Institute of Information Science, Academia Sinica, Taiwan, since June 2014.
Dr. Su has served as program chair/co-chair of various international conferences, general chair of ACL-IJCNLP 2009, and editorial board member of several international journals (including Computational Linguistics). In addition, he proposed and co-launched the Association for Computational Linguistics and Chinese Language Processing (ACLCLP, Taiwan; http://www.aclclp.org.tw/) in 1988, and has served as its president and as an advisory board member. He also co-launched the Asian Federation of NLP Associations (AFNLP; http://www.afnlp.org/) in 2003, serving as its vice president (2009-2010) and president (2011-2012).
Title: Short Text Understanding: A Database Approach (ppt download)
Understanding short texts is crucial to many applications, but challenges abound. First, short texts do not always observe the syntax of a written language. As a result, traditional natural language processing methods cannot be easily applied. Second, short texts usually do not contain sufficient statistical signals to support many state-of-the-art approaches for text processing such as topic modeling. Third, short texts are usually more ambiguous. We argue that knowledge is needed in order to better understand short texts. In this work, we use lexical-semantic knowledge provided by a well-known semantic network for short text understanding. Our knowledge-intensive approach disrupts traditional methods for tasks such as text segmentation, part-of-speech tagging, and concept labeling, in the sense that we focus on semantics in all these tasks. We conduct a comprehensive performance evaluation on real-life data. The results show that knowledge is indispensable for short text understanding, and our knowledge-intensive approaches are effective in harvesting semantics of short texts. We also extend our work to handle entity linking in microblogs.
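A minimal sketch of what such knowledge-driven processing looks like, under heavy simplification: segmentation prefers the multi-word terms a knowledge base recognizes, and each segment is then labeled with its concept. The tiny dictionary below is a stand-in for the large semantic network the talk describes; every entry, name, and label is illustrative only.

```python
# Toy knowledge base mapping terms to concept labels (illustrative only;
# the real system draws on a large lexical-semantic network).
KB = {
    'harry potter': 'movie',
    'april in paris': 'song',
    'paris': 'city',
    'april': 'month',
    'watch': 'verb',
    'book': 'verb',
}

def segment(text, kb, max_len=4):
    """Greedy longest-match segmentation: prefer the longest candidate
    term the knowledge base knows, falling back to single words."""
    words = text.lower().split()
    i, terms = 0, []
    while i < len(words):
        for j in range(min(len(words), i + max_len), i, -1):
            cand = ' '.join(words[i:j])
            if cand in kb or j == i + 1:
                terms.append(cand)
                i = j
                break
    return terms

def label(terms, kb):
    """Attach the knowledge base's concept label to each segment."""
    return [(t, kb.get(t, 'unknown')) for t in terms]
```

On "book april in paris", the segmenter keeps "april in paris" together as a song title rather than splitting off "paris" as a city, illustrating how even a crude knowledge lookup resolves ambiguity that purely syntactic segmentation would miss.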
Xiaofang Zhou is Professor of Computer Science at the University of Queensland, Australia, a specially appointed professor at Soochow University under China's national "Thousand Talents Program", head of UQ's Data and Knowledge Engineering research group, director of the Advanced Data Analytics Research Center at Soochow University, and chief scientist of the national "863" key project "Massive Web Data Extraction, Integration, Analysis and Management System Platform and Applications in Open Environments". Professor Zhou has long worked on database systems and information systems; his main research areas include spatial databases, multimedia databases, data quality, and high-performance data processing. He currently chairs the IEEE Technical Committee on Data Engineering (TCDE), has served on the editorial boards of journals including the VLDB Journal, IEEE Transactions on Knowledge and Data Engineering, IEEE Transactions on Cloud Computing, and the World Wide Web Journal, and has been program committee chair of international conferences including ICDE and CIKM. His paper "Short Text Understanding Through Lexical-Semantic Analysis" won the Best Paper Award at ICDE 2015.
Invited Talk 3: Professor Antony John Kunnan
Speaker: Antony John Kunnan (Guangdong University of Foreign Studies)
Title: Automated essay evaluation and feedback systems: Are they useful for ESL test takers and ESL teachers? (ppt download)
Automated essay scoring (AES) has become increasingly popular in the last decade, with many assessment agencies developing and promoting their automated scoring and automated feedback systems. Ware (2011, p. 769) defines automated scoring as “the provision of automated scores derived from mathematical models built on organizational, syntactic, and mechanical aspects of writing” and automated feedback as “computer tools for writing assistance rather than for writing assessment.” While the engines that score essays are being improved with the addition of salient writing features (such as lexical diversity), diagnostic feedback systems are not well equipped to identify off-topic or near-off-topic essays, or essays that are less conventional and less structurally controlled. These two issues are particularly important when scoring essays from English-as-a-second-language (ESL) writers.
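Ware's definition can be made concrete with a deliberately crude sketch: a linear model over a few organizational, syntactic, and lexical surface features. The feature set and weights below are invented for illustration and do not come from MyAccess, WritetoLearn, or any deployed engine; the sketch also makes the abstract's criticism visible, since nothing in it can tell whether an essay is on topic.

```python
import re

def features(essay):
    """Extract simple surface features of the kind Ware's definition names."""
    tokens = re.findall(r"[a-zA-Z']+", essay.lower())
    sentences = [s for s in re.split(r'[.!?]+', essay) if s.strip()]
    return {
        'length': len(tokens),                                        # fluency proxy
        'lexical_diversity': len(set(tokens)) / max(len(tokens), 1),  # type-token ratio
        'avg_sentence_len': len(tokens) / max(len(sentences), 1),     # syntactic proxy
    }

# Illustrative weights only; a real engine fits these to human-scored essays.
WEIGHTS = {'length': 0.01, 'lexical_diversity': 3.0, 'avg_sentence_len': 0.05}

def score(essay, weights=WEIGHTS):
    """Combine features linearly and cap at a 6-point holistic scale."""
    f = features(essay)
    raw = sum(weights[k] * f[k] for k in weights)
    return round(min(raw, 6.0), 2)
```

A repetitive four-word essay scores far below a short fluent paragraph under this model, yet a fluent paragraph that is entirely off topic would score just as well as an on-topic one, which is exactly the weakness the studies below probe.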
Findings from two empirical studies (Hoang and Kunnan, 2016; Liu and Kunnan, 2015) on automated essay scoring, conducted with data from Vietnamese and Chinese ESL students using MyAccess and WritetoLearn respectively, will be presented. The results will include comparisons of human and automated scoring and of human and automated diagnostic feedback. Additional findings from other researchers will be presented; these findings (Bridgeman et al., 2012; Elliot and Klobucar, 2013; Weigle, 2013) call into question the relationship between instruction and learning in automated scoring and feedback contexts and the critical problem of technological determinism in ESL essay writing. The findings will then be mapped onto the Toulmin argumentation model of grounds, warrants, backing, and rebuttals, so that arguments and counterarguments can be posed regarding the claims of automated scoring and feedback systems.
Antony John Kunnan is the author of Test taker characteristics and test performance: A structural modeling approach (Cambridge, 1995), the editor of Validation in language assessment (Lawrence Erlbaum, 1998), and the editor of Fairness and validation in language assessment (Cambridge, 1999). He has published journal articles in the Annual Review of Applied Linguistics, Language Testing, and the Journal of English as a Foreign Language, test reviews in Buros' Mental Measurements Yearbook, and book chapters in well-known collections. Prior to his present position, he was a postdoctoral fellow at the University of Michigan, Ann Arbor, a faculty member at the Regional Institute of English, Bangalore, and an English teacher at St. Germain High School in Bangalore. He is currently the editor of Language Assessment Quarterly, and he welcomes expressions of interest in the area of language assessment, particularly for publication consideration and from thesis students. He is also President of the International Language Testing Association for 2004.
Professor Antony John Kunnan is a renowned expert in language assessment. He served as President of the International Language Testing Association (1998-1999), is an Honorary Professor at the University of Hong Kong, has chaired the Asian Association for Language Assessment, serves on the editorial boards of several SSCI journals and international book series, and was the founding editor-in-chief of the SSCI journal Language Assessment Quarterly. Professor Kunnan enjoys a high reputation in the international language assessment research community; his theory of test validity, "Fairness and Justice in Language Assessment", has been widely cited and adopted.