Public Datasets for Data Mining
Posted by:
Mark
Date: October 10, 2009 10:51AM
Here are some good links to public datasets for testing data mining algorithms:
All the datasets from the previous SIGKDD KDD-CUP competitions. Some of them are very large!
The UCI KDD Archive, which contains around 40 large datasets of different types.
The UCI Machine Learning repository offers 185 public datasets.
The Frequent Itemsets Mining Datasets Repository provides many classical datasets for frequent itemset mining. It also includes several implementations of classical algorithms, such as Apriori, for mining frequent itemsets, frequent closed itemsets, maximal itemsets, etc.
A few datasets and implementations of data mining algorithms can also be found on the web page of M. Zaki.
Lots of links to free datasets : http://www.kdnuggets.com/datasets/index.html
University of Edinburgh Data Mining datasets : http://www.inf.ed.ac.uk/teaching/courses/dme/html/datasets0405.html
PSLC Datashop, datasets from the learning science community : https://pslcdatashop.web.cmu.edu/
Yale Face database : http://cvc.yale.edu/projects/yalefacesB/yalefacesB.html
A list of datasets : http://www.datawrangling.com/some-datasets-available-on-the-web
Some biomedical datasets:
http://datam.i2r.a-star.edu.sg/datasets/krbd/