datasets

Various unique "real-world" datasets intended specifically for deep learning. The files are provided in .msgpack format and can also be used separately from Autonomio, for example with Pandas:

# Requires pandas < 1.0: read_msgpack was deprecated in pandas 0.25 and removed in 1.0.
import pandas as pd
df = pd.read_msgpack('https://github.com/autonomio/datasets/raw/master/autonomio-datasets/election_in_twitter')

'election_in_twitter'

Dataset consisting of 10-minute samples of 80 million tweets from the beginning of November 2016 to the end of December 2016. The keywords used to capture tweets are 'Trump' and 'Hillary'.

'tweet_sentiment'

Dataset of tweet text classified for sentiment with NLTK VADER, including word2vec word vectors generated for each tweet with spaCy.

'sites_category_and_vec'

4,000 sites with word vectors and 5 categories.

'programmatic_ad_fraud'

Data from both the buy side and the sell side, plus over 10 other sources.

'parties_and_employment'

9 years of monthly poll and unemployment numbers.

'random_tweets'

20,000 tweets with various data columns related to tweet quality, including whether the tweet is from a bot or not.

'kaggle_titanic_train'

The training dataset provided as part of the hugely popular Kaggle Titanic Survivor prediction challenge.

'sites_and_vec'

20,000 sites with word vectors based on the landing page content.
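
The dataset names listed above can be turned into download URLs with a small helper. This is only a sketch: it assumes every dataset lives under the same base path as the 'election_in_twitter' example, which this README does not confirm for the other files. The names BASE_URL, DATASET_NAMES, dataset_url and load_dataset are hypothetical and not part of Autonomio.

```python
# Base path taken from the 'election_in_twitter' example; assumed (not
# confirmed) to be the same for every other dataset.
BASE_URL = 'https://github.com/autonomio/datasets/raw/master/autonomio-datasets/'

# Dataset names exactly as listed in this README.
DATASET_NAMES = (
    'election_in_twitter',
    'tweet_sentiment',
    'sites_category_and_vec',
    'programmatic_ad_fraud',
    'parties_and_employment',
    'random_tweets',
    'kaggle_titanic_train',
    'sites_and_vec',
)


def dataset_url(name):
    """Build the download URL for one of the listed dataset names."""
    if name not in DATASET_NAMES:
        raise ValueError('unknown dataset: {!r}'.format(name))
    return BASE_URL + name


def load_dataset(name):
    """Load a dataset into a DataFrame, if this pandas can read msgpack."""
    # Imported here so that building URLs does not require pandas.
    import pandas as pd

    # read_msgpack was removed in pandas 1.0, so fail with a clear message
    # instead of an AttributeError on newer versions.
    if not hasattr(pd, 'read_msgpack'):
        raise RuntimeError(
            'pandas.read_msgpack is not available in pandas {}; '
            'use a pandas version below 1.0'.format(pd.__version__)
        )
    return pd.read_msgpack(dataset_url(name))
```

With an older pandas installed, load_dataset('random_tweets') would attempt the download; dataset_url alone works without pandas and without network access.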
