Python code for saving the official AO3 data dump into smaller files, filtered by year.

amecreate/AO3-Data-Dump-By-Year

AO3-Data-Dump-By-Year

Python code for saving the official AO3 data dump into smaller files, filtered by year. (2008-2021)

Original Data

See the official post "Selective data dump for fan statisticians" on the AO3 website.

The data released by AO3 comes in two CSV files.

The first includes information about works:

  • creation date
  • language
  • word count
  • restricted or not
  • complete or not
  • associated tag IDs

The second provides the key to the tag IDs:

  • tag ID
  • tag type (e.g. Warning, Fandom, Relationship)
  • tag name (unless the tag has fewer than 5 uses)
  • canonical or not
  • an approximate number of uses
  • merger ID (i.e. the tag's canonical version, if it has one)
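To illustrate how the two files relate, here is a minimal sketch that maps the tag IDs attached to a work onto canonical tag names by following merger IDs. The sample data, the function names, and the `+` separator between tag IDs are all assumptions made for illustration; check them against the actual CSV files before relying on them.

```python
def canonical_id(tag_id, merger_of):
    """Follow merger links until reaching a tag that has no merger ID."""
    seen = set()
    while merger_of.get(tag_id) and tag_id not in seen:
        seen.add(tag_id)  # guard against accidental cycles
        tag_id = merger_of[tag_id]
    return tag_id


def resolve_tags(tag_field, names, merger_of, sep="+"):
    """Turn a work's tag-ID field into a list of canonical tag names.

    `sep` is an assumed separator. Names may be missing (None) for tags
    with fewer than 5 uses, since the dump withholds those names.
    """
    ids = [t for t in tag_field.split(sep) if t]
    return [names.get(canonical_id(t, merger_of)) for t in ids]


# Hypothetical sample data, not taken from the real dump.
names = {"10": "Fluff", "11": "fluffy", "12": "Angst", "13": None}
merger_of = {"11": "10"}  # tag 11 is merged into canonical tag 10

print(resolve_tags("11+12+13", names, merger_of))
```

Keeping IDs as strings avoids a conversion step when they are read straight from CSV.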

Filtered Data

The Python code Works filters the first file by year and saves each year's rows to a separate, smaller CSV file.
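The by-year split can be sketched roughly as follows. This is not the repository's own code. It assumes the date column is named `creation date` and holds ISO-style dates (`YYYY-MM-DD`); the function name and the `works_<year>.csv` output naming are hypothetical.

```python
import csv
from pathlib import Path


def split_works_by_year(src, out_dir, date_column="creation date"):
    """Stream `src` and write one CSV per year; return row counts per year."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    handles, writers, counts = {}, {}, {}
    try:
        with open(src, newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            header = next(reader, None)
            if header is None:
                return {}  # empty input file
            date_idx = header.index(date_column)  # assumed column name
            for row in reader:
                if len(row) <= date_idx:
                    continue  # malformed or short row
                year = row[date_idx][:4]
                if not year.isdigit():
                    continue  # no usable date
                if year not in writers:
                    # Hypothetical naming scheme for the output files.
                    fh = open(out_dir / f"works_{year}.csv", "w",
                              newline="", encoding="utf-8")
                    handles[year] = fh
                    writers[year] = csv.writer(fh)
                    writers[year].writerow(header)
                    counts[year] = 0
                writers[year].writerow(row)
                counts[year] += 1
    finally:
        for fh in handles.values():
            fh.close()
    return counts
```

Reading row by row keeps memory use flat, which matters because the works file is large.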

The Python code Tags filters the second file by tag type and saves each type's rows to a separate, smaller CSV file.
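A comparable sketch for the by-type split, again not the repository's own code. It assumes the tag type lives in a column named `type`; the function name and the `tags_<type>.csv` output naming are hypothetical. Because the tags file is much smaller than the works file, this version groups rows in memory first.

```python
import csv
import re
from collections import defaultdict
from pathlib import Path


def split_tags_by_type(src, out_dir, type_column="type"):
    """Group tag rows by type and write one CSV per type; return row counts."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    groups = defaultdict(list)
    with open(src, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return {}  # empty input file
        type_idx = header.index(type_column)  # assumed column name
        for row in reader:
            if len(row) > type_idx and row[type_idx]:
                groups[row[type_idx]].append(row)
    for tag_type, rows in groups.items():
        # Replace anything that is unsafe in a file name.
        safe = re.sub(r"[^A-Za-z0-9_-]+", "_", tag_type)
        with open(out_dir / f"tags_{safe}.csv", "w",
                  newline="", encoding="utf-8") as out:
            writer = csv.writer(out)
            writer.writerow(header)
            writer.writerows(rows)
    return {tag_type: len(rows) for tag_type, rows in groups.items()}
```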

You can download these smaller CSV files from MEGA.

More analysis and data visualization can be found in my repo AO3-Data-Vis and on my website A Look Into AO3 Data.
