Web and Social Media Datasets

This page contains links to various web mining datasets.


  1. CMU WWW Knowledgebase
  2. Microsoft Learning to Rank Repository (LETOR)
  3. Entire Wikipedia
  4. 44 Million Blog Posts by ICWSM 2009
  5. SourceForge.net Research Data
  6. Computer Science Department Pages
  7. Google Flu Trends
  8. Database of Several Million Human Feelings
  9. BBC, Digg, MySpace Sentiments
  10. Linked Data