Skip to content

Latest commit

 

History

History
26 lines (19 loc) · 2.19 KB

README.md

File metadata and controls

26 lines (19 loc) · 2.19 KB

WISeWEBSpider version 0.4 Build StatusPython Version

Original Authors: Jerod Parrent, James Guillochon

###Description wisewebspider is a simple program built to scrape and download all publicly available supernova spectra from the Weizmann Interactive Supernova data REPository (WISeREP); a bulk download option is not available through WISeREP and the number of supernova spectra to download are in the 10,000s. The script creates one main directories, sne-external-WISEREP/, where spectra are stored in individual subdirectories alongside README.json files. The README files detail event metadata for each spectrum collected and keep track of the number of private spectra. Also stored in sne-external-WISEREP/ are log files and a lists.json file to keep track of the scripts progress, as well as non-supernova events to save time.

The script guards against spectra already collected, duplicate files found on WISeREP, and events that are not supernovae. However, no effort has been made to collate spectra for objects with multiple aliases (e.g., SN2011fe and PTF11kly, both the same event, have separate directories), nor does the script determine supernova types for objects that are unspecified on WISeREP.

Without excluding by event type and/or survey program (UCB, CfA, SuSpect, etc), the full runtime for scraping everything is about 18.7 hours. Fortunately this need only be done once. After an initial scrape, the script can be run in update mode, which at most takes a few minutes.

###Usage For the initial scrape, check/edit exlcluded lists in main.py, then run:

python3.5 -m wisewebspider

To run in update mode:

python3.5 -m wisewebspider --update --daysago 30

where the value after daysago can be 1, 2, 7, 14, 30, 180, or 365, i.e., how many days since last you scraped.

Dependencies and Credits