General Repositories & Benchmarks

This page links to general-purpose dataset repositories, each hosting datasets of many different types, and to several benchmark datasets.


  1. UCI Repository link
  2. UCI KDD Repository (large datasets) link
  3. WIKIPOSIT Dataset List link
  4. Public Data Sets on Amazon Web Services (large datasets) link
  5. Delve Repository (Classification, Regression) link
  6. Infochimps Open Catalog link
  7. Kevin Chai Dataset Catalog link
  8. STATOO Dataset List link1 link2
  9. Digging Into Data Repository link
  10. Clustering Datasets by Koln University link
  11. Clustering Datasets link
  12. Frequent Itemset Mining link


  1. Gunnar Raetsch's Benchmark Datasets (Classification) link
  2. Meyer Benchmark Datasets (Classification and Regression) link
  3. Fundamental Clustering Problem Suite (Clustering) link