This repository collects public datasets for research. If you know of other sources, please contact me or contribute the links to the repository.
- Reddit Comments Corpus;
- Full Reddit Submission Corpus;
- City Record Online;
- TLC Trip Record Data;
- Frequency Word Lists;
- Amazon product data;
- Wikimedia database;
- Airbnb database;
- Driving in the Cloud Dataset;
- Nothink Malware samples;
- SecRepo.com - Samples of Security Related Data;
- lanl.gov Open Data Sets;
- Crime data from the St. Louis Metropolitan Police Department;
- Chronology of Data Breaches: Security Breaches 2005 - Present;
- Malware Sample Sources for Researchers;
- Microsoft Malware Classification Challenge (BIG 2015);
- Android Malware - The Drebin Dataset;
- Social networks : online social networks, edges represent interactions between people
- Networks with ground-truth communities : ground-truth network communities in social and information networks
- Communication networks : email communication networks with edges representing communication
- Citation networks : nodes represent papers, edges represent citations
- Collaboration networks : nodes represent scientists, edges represent collaborations (co-authoring a paper)
- Web graphs : nodes represent webpages and edges are hyperlinks
- Amazon networks : nodes represent products and edges link commonly co-purchased products
- Internet networks : nodes represent computers and edges communication
- Road networks : nodes represent intersections and edges roads connecting the intersections
- Autonomous systems : graphs of the internet
- Signed networks : networks with positive and negative edges (friend/foe, trust/distrust)
- Location-based online social networks : Social networks with geographic check-ins
- Wikipedia networks and metadata : Talk, editing and voting data from Wikipedia
- Twitter and Memetracker : Memetracker phrases, links and 467 million Tweets
- Online communities : Data from online communities such as Reddit and Flickr
- Online reviews : Data from online review systems such as BeerAdvocate and Amazon
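
The network datasets above are described in terms of nodes and edges. As a rough illustration of how such data might be read, here is a minimal sketch that builds an adjacency map from an edge list. It assumes a plain-text format with one whitespace-separated `source target` pair per line and `#` for comment lines; that format is an assumption, so check each dataset's own documentation. The function name `load_edge_list` and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, Set


def load_edge_list(lines: Iterable[str], directed: bool = False) -> Dict[str, Set[str]]:
    """Build an adjacency map from lines of the form 'source target'."""
    adjacency: Dict[str, Set[str]] = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines (assumed to start with '#').
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue  # Ignore malformed rows instead of failing.
        source, target = parts[0], parts[1]
        adjacency.setdefault(source, set()).add(target)
        # Register the target so nodes without outgoing edges still appear.
        neighbours = adjacency.setdefault(target, set())
        if not directed:
            neighbours.add(source)
    return adjacency


# Hypothetical sample rows, not taken from any of the listed datasets.
sample = [
    "# source target",
    "1\t2",
    "1\t3",
    "2\t3",
]
graph = load_edge_list(sample)
```

To read from disk, an open file handle can be passed in place of the list, since the function only iterates over lines. Node identifiers are kept as strings so that non-numeric IDs also work.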