[Academic Talk] Active Learning with Generalized Queries

Posted: 2009-05-15

5.20 Academic Talk Series, No. 2: Active Learning with Generalized Queries

Speaker: Charles Ling, PhD, Professor, Director, Data Mining and E-Business Lab, Department of Computer Science, University of Western Ontario, Canada

Time: Monday, May 18, 3:00 PM
Venue: Room 109, Mong Man Wai Building (蒙民伟楼)

Abstract:
Active learning can actively select or construct examples (queries) and request their labels, reducing the number of labeled examples needed to build an accurate classifier. However, previous work on active learning can only ask specific queries that fix all attribute values, many of which may be irrelevant. A more natural and powerful approach is to ask "generalized queries" with only relevant attributes, such as "are people over 50 with knee pain likely to have osteoarthritis?" (with only two attributes, age and type of pain, while omitting many irrelevant ones such as fever and blood type). The power of asking such generalized queries is that one generalized query may be equivalent to many specific ones. However, overly general queries may receive uncertain labels from the oracle, which makes learning difficult. We propose a novel active learning algorithm that asks good generalized queries. We demonstrate experimentally that our new method asks significantly fewer queries than previous active learning methods. Our method can be readily deployed in real-world data mining tasks where obtaining labeled examples is costly.
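
As a rough illustration of why one generalized query may be equivalent to many specific ones, the sketch below enumerates the specific queries covered by the knee-pain example. The attribute domains and the `expand` helper are hypothetical, chosen only for this illustration; they are not the algorithm presented in the talk.

```python
from itertools import product

# Hypothetical attribute domains, made up for illustration only.
DOMAINS = {
    "age": ["under_50", "over_50"],
    "pain": ["none", "knee", "back"],
    "fever": ["no", "yes"],
    "blood_type": ["A", "B", "AB", "O"],
}


def expand(generalized_query, domains=DOMAINS):
    """Return every specific query (all attributes fixed) covered by a generalized query.

    Attributes named in the generalized query keep their value;
    omitted attributes range over their whole domain.
    """
    names = list(domains)
    choices = [
        [generalized_query[name]] if name in generalized_query else domains[name]
        for name in names
    ]
    return [dict(zip(names, combo)) for combo in product(*choices)]


# "Are people over 50 with knee pain likely to have osteoarthritis?"
query = {"age": "over_50", "pain": "knee"}
specific_queries = expand(query)
print(len(specific_queries))  # 2 fever values * 4 blood types = 8
```

Under these assumed domains, the single two-attribute query stands in for eight fully specified ones. Leaving out more attributes widens the coverage further, which is also where the risk of uncertain labels from the oracle comes from.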

Bio:
Charles X. Ling earned his dual BSc from Shanghai Jiao Tong University in China, and both his MSc and PhD in Computer and Information Science from the University of Pennsylvania (Ivy League) within four years. Since then he has been a faculty member in Computer Science at the University of Western Ontario, Canada, where he is currently a Professor. His main research areas include machine learning and data mining, cognitive modeling, and child education. He has published over 100 research papers in peer-reviewed journals and conferences.
He is an Associate Editor for IEEE TKDE and the Computational Intelligence journal, and an IEEE Senior Member. He is the Director of the Data Mining and E-Business Lab, leading data mining development in CRM, bioinformatics, and the Internet.