Web Information Processing and Applications
Instructors
Jin Pei-Quan (金培权)
Email: jpq@ustc.edu.cn
Xu Lin-Li (徐林莉)
Email: linlixu@ustc.edu.cn
Teaching Assistants
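Lin Sheng (林盛), Ph.D. student
Phone: 13485728758
Email: linsh@mail.ustc.edu.cn
Room: 1610, West Building, Science and Technology Experiment Building (科技实验楼西楼)
Yu Yongbo (于永波), Master's student
Phone: 13865979122
Email: yyb2012@mail.ustc.edu.cn
Room: 1610, West Building, Science and Technology Experiment Building (科技实验楼西楼)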
Lectures
Time: Class periods 6 to 8
Classroom: 3C221 (West Campus)
Textbook
W. Bruce Croft, Donald Metzler, Trevor Strohman, Search Engines: Information Retrieval in Practice, Pearson Press, 2010
(Chinese edition: translated by Liu Ting et al., 搜索引擎:信息检索实践, China Machine Press, 2012)
References
Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008
(Chinese edition: translated by Wang Bin, 信息检索导论, Posts & Telecom Press, 2010)
Ricardo Baeza-Yates, Berthier Ribeiro-Neto, Modern Information Retrieval, Addison Wesley Longman Publishing Co. Inc., 1999
Bing Liu, Web Data Mining (2nd Edition), Springer, 2011
Some state-of-the-art papers from SIGIR, CIKM, WWW, etc.
Assignments
Homework will be assigned during the course. POLICY: each assignment must be completed and submitted within one week, i.e. before the beginning of the next class. Late submissions will be penalized 20% of the points.
Examination
One final test, scheduled to be taken at the end of the course.
Grading
Homework: 20%
Lab: 20% [Lab #1 Description. Lab time: 18:30-21:30, Monday and Tuesday, starting 8 October. Lab site: 517, E3 Building]
Final: 60%
Course Notes
No.    Date   Contents                                      Homework  Chapter Reading
1      9.3    Introduction to Web Information Processing              Chp.1-2
2      9.10   Web Crawling (updated)                        homework  Chp.3
3      9.17   Text Processing                               homework  Chp.4
4      9.24   Indexing & Lab #1 Description                 homework  Chp.5
5      10.1   (National Day) Lab #1
              Lab time: 18:30-21:30, Monday and Tuesday.
              Location: 517, E3 Building
6      10.8   Queries                                       homework  Chp.6
7      10.15  Ranking                                       homework  Chp.7
8      10.22  Evaluation                                    homework  Chp.8
9      10.29  Named Entity Recognition
10     11.5   Relation Extraction
11-18         Web Data Mining
19            Review
20            Final Exam