US government data http://www.data.gov/
Information Network
Proximity DBLP http://kdl.cs.umass.edu/data/dblp/dblp-info.html
DBLP-Citation-Network http://arnetminer.org/citation
CiteSeer (hardly) http://csxstatic.ist.psu.edu/about/data
CiteSeer dumped http://martinharrigan.blogspot.com/2008/07/citeseers-dataset.html
Cora (hardly) http://people.cs.umass.edu/~mccallum/data.html
Social Network
Stanford Large Network Dataset Collection (contains many network datasets): http://snap.stanford.edu/data/
Stanford class resources http://snap.stanford.edu/na09/resources.html
ICWSM Twitter dataset: http://twitter.mpi-sws.org/data-icwsm2010.html
EBSN - Event-based social network dataset: http://www.largenetwork.org/ebsn
Other social network datasets: Slashdot, Enron email, MIT mobile, Epinions reviews.
Sentiment and Opinion Mining
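Many of the network files in the Stanford collection above are distributed as plain-text edge lists: comment lines starting with "#", followed by one whitespace-separated "source target" pair of integer node IDs per line. The following is a minimal sketch of reading such a file into an adjacency map, assuming that layout. The function name read_edge_list and the sample data are hypothetical, and the sample is made up for illustration rather than taken from any real dataset.

```python
import io
from collections import defaultdict


def read_edge_list(lines):
    """Parse a directed edge list, skipping blank lines and '#' comments.

    Assumes each data line starts with two integer node IDs separated by
    whitespace; any extra columns (e.g. a sign or timestamp) are ignored.
    """
    adjacency = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst = line.split()[:2]
        adjacency[int(src)].add(int(dst))
    return adjacency


# Tiny in-memory sample standing in for a downloaded file (made-up data).
sample = io.StringIO("# FromNodeId\tToNodeId\n0\t1\n0\t2\n1\t2\n")
graph = read_edge_list(sample)
print({node: sorted(neighbors) for node, neighbors in graph.items()})
```

For a real download, pass an open file handle instead of the in-memory sample; compressed files would need to be opened with gzip.open in text mode first.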
Bing Liu's homepage
Movie Review http://www.cs.cornell.edu/people/pabo/movie-review-data/
Lee's homepage
Twitter sentiment: http://www.sananalytics.com/lab/twitter-sentiment/
Recommendation
Machine Learning
UCI datasets http://archive.ics.uci.edu/ml/datasets.html
Audio Retrieval
CAL-500: http://twitterdata.org/
Million Song Dataset http://labrosa.ee.columbia.edu/millionsong/
Miscellaneous 1
Many graph datasets, including several cups, Twitter, etc. http://graphlab.org/downloads/datasets/
Several graph datasets http://law.di.unimi.it/datasets.php
Delicious/Flickr/Last.FM, etc. http://www.tagora-project.eu/data/
A small dataset about links http://www.cs.umd.edu/projects/linqs/projects/lbc/index.html
A small dataset including CiteSeerX/IMDb http://komarix.org/ac/ds/
Miscellaneous 2
Only user-object
Amazon
Both user-user and user-object
Single-type user network
Flickr, YouTube, Twitter
Signed user network
Epinions, Slashdot, Ciao
Multi-type user network
Facebook, Google Plus
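The grouping above can be illustrated with a small sketch that sorts a dataset into one of the four categories based on its user-user edges. This is only one possible reading of the categories: the UserEdge type, the +1/-1 sign encoding, the relation labels, and the categorize function are all hypothetical and are not defined by any of the listed datasets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserEdge:
    source: str
    target: str
    sign: int = 1           # assumed encoding: +1 trust/friend, -1 distrust/foe
    relation: str = "link"  # hypothetical relation label, e.g. "friend"


def categorize(user_edges):
    """Map a list of user-user edges to one of the categories listed above."""
    if not user_edges:
        # No user-user links at all: only user-object data is available.
        return "only user-object"
    if any(edge.sign < 0 for edge in user_edges):
        return "signed user network"
    if len({edge.relation for edge in user_edges}) > 1:
        return "multi-type user network"
    return "single-type user network"


print(categorize([]))
print(categorize([UserEdge("a", "b")]))
print(categorize([UserEdge("a", "b"), UserEdge("b", "c", sign=-1)]))
print(categorize([UserEdge("a", "b", relation="friend"),
                  UserEdge("a", "c", relation="family")]))
```

The check for negative signs runs before the relation-type check, so a network with both negative edges and several relation types is reported as signed; that ordering is an arbitrary choice in this sketch.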