Data Mining, Search, and the World Wide Web

http://infolab.stanford.edu/~sergey/349/


CS 349: Data Mining, Search, and the World Wide Web

http://www-db.stanford.edu/~sergey/cs349.html

Tuesdays and Thursdays 4:15 - 5:30 in Bldg 370, Room 370 on the Main Quad

Instructors: Sergey Brin and Lawrence Page
Office hours: Tues and Thurs 5:30 - 7:00 or by appointment.
sergey@cs.stanford.edu and page@cs.stanford.edu

Course Assistant: Diane Tang
Gates 416: Mon - Wed 11:15 - 12:15 or by appointment.
dtang@cs.stanford.edu

Description

Over the past two years there has been a close collaboration between the Data Mining Group (MIDAS) and the Digital Libraries Group at Stanford in the area of Web research. It has culminated in the WebBase project, whose aims are to maintain a local copy of the World Wide Web (or at least a substantial portion thereof) and to use it as a research tool for information retrieval, data mining, and other applications. This has led to the development of the PageRank algorithm, the Google search engine, the DIPRE algorithm, and a number of other works that represent the cutting edge of research on the Web today (see WebBase Publications).

The topics of this class are data mining and information retrieval in the context of the World Wide Web. First, we will cover background material in data mining and information retrieval that is relevant to the class. Second, we will cover recent advances made at Stanford (PageRank, DIPRE, ...) and elsewhere (Kleinberg, Mitchell, ...). Third, and most important, students will get the opportunity to work hands-on with the WebBase, as this will be a project class. We have already modularized a large part of the code so that people can work with it, and we will continue to do so throughout the summer. Several people have already taken advantage of the code. The current WebBase repository consists of roughly 25 million web pages amounting to 150 GB of HTML.

Prerequisites

  • A strong knowledge of C.
  • Working knowledge of C++.
  • Very basic statistics, graph theory and linear algebra.

Very Tentative Syllabus

Mailing List

Subscribe to Stanford CS 349
cs349 Archive
An e-group hosted by FindMail's eGroups.com

Sergey Brin
Last modified: Sat Oct 24 23:18:37 PDT 1998