Scrapes WSB and other investing subreddits for top stocks each week, by mentions or score.
- Clone this repo wherever you'd like.
- Create login info:
  - Copy config_template.py to config.py
  - Follow the instructions to retrieve an api_id and api_secret; update api_user_agent as well
- Optional: Create & activate venv
    python -m venv wsb_scraper_venv
    source wsb_scraper_venv/bin/activate
  - Install dependencies
      pip install praw
      pip install pandas
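As a rough illustration of how the three login values map onto PRAW, the sketch below assumes config.py exposes them as module-level names api_id, api_secret, and api_user_agent (the names used in the step above; the real layout of config_template.py may differ). The helper build_reddit_kwargs and the example values are hypothetical.

```python
from types import SimpleNamespace


def build_reddit_kwargs(config):
    """Map the config values to the keyword arguments praw.Reddit expects.

    Assumes `config` exposes api_id, api_secret and api_user_agent
    (hypothetical layout; check config_template.py for the real one).
    """
    kwargs = {
        "client_id": getattr(config, "api_id", ""),
        "client_secret": getattr(config, "api_secret", ""),
        "user_agent": getattr(config, "api_user_agent", ""),
    }
    # Fail early if a value was left empty or is missing altogether.
    missing = [name for name, value in kwargs.items() if not value]
    if missing:
        raise ValueError("config.py is missing values for: " + ", ".join(missing))
    return kwargs


# Stand-in for `import config`; the values are made up.
example_config = SimpleNamespace(
    api_id="my_id",
    api_secret="my_secret",
    api_user_agent="wsb_scraper by u/example",
)
reddit_kwargs = build_reddit_kwargs(example_config)
```

With PRAW installed, a read-only client could then be created with praw.Reddit(**reddit_kwargs).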
- Use ./run_wsb_scraper.sh -h to see the available options and their defaults.
- Single run:
    python main.py [options] > scraper.log 2>&1 &
  or
    ./run_wsb_scraper.sh [options] > scraper.log 2>&1 &
  - ==> activates the venv for you
- Multi-run / historical data: runAll.sh starts 6 runs (day, week, month, for both scores and mentions) and saves the output to output/<date>
  - By default, runAll.sh:
    - tries to load output/<yesterday>/<mention|score>/<day|week|month>/to_buy.txt as the previous buy list
  - All other options keep their defaults unless overridden on the command line:
      ./runAll.sh [options] > runAll_$(date +"%Y%m%dT%H%M").log 2>&1 &
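The default lookup of the previous buy list can be sketched as follows. This is only an illustration, not the script's actual code: the function names are hypothetical, the format of the <date> directory is not stated here (the sketch assumes YYYYMMDD), and the file is assumed to hold one ticker per line.

```python
from datetime import date, timedelta
from pathlib import Path


def previous_buy_list_path(today, method, period, root="output", date_format="%Y%m%d"):
    """Build output/<yesterday>/<mention|score>/<day|week|month>/to_buy.txt.

    date_format is an assumption; adjust it to match the real directory names.
    """
    if method not in ("mention", "score"):
        raise ValueError(f"unknown method: {method}")
    if period not in ("day", "week", "month"):
        raise ValueError(f"unknown period: {period}")
    yesterday = today - timedelta(days=1)
    return Path(root) / yesterday.strftime(date_format) / method / period / "to_buy.txt"


def load_previous_buy_list(path):
    """Return the tickers in the file (assumed one per line), or [] if it is missing."""
    if not path.is_file():
        return []  # "tries to load": a missing file is not treated as an error
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]


path = previous_buy_list_path(date(2021, 3, 1), "score", "week")
print(path.as_posix())  # output/20210228/score/week/to_buy.txt
```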
Lists ambiguous tickers and the keywords to check for in the body of the posts (comma-separated list)
Lists strings that are automatically excluded (for some strings like WSB, graylisting them would waste time when the probability of an actual ticker is very low)
Contains a list of words too generic to act as keywords when checking the graylist
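One way the three lists above could interact is sketched below. The data structures, the function name extract_tickers, the example entries, and the matching rule (an ambiguous ticker counts only if one of its non-generic keywords appears in the post body) are all assumptions for illustration; the scraper's real logic may differ, and the sketch does not validate candidates against a list of real tickers.

```python
import re

# Made-up example data; the real lists live in the files described above.
EXCLUDED = {"WSB", "YOLO"}
GRAYLIST = {"CAT": ["caterpillar", "machinery"], "ALL": ["allstate", "insurance"]}
GENERIC_WORDS = {"stock", "buy", "shares"}

# Candidate tickers: 2 to 5 consecutive uppercase letters as a whole word.
TICKER_RE = re.compile(r"\b[A-Z]{2,5}\b")


def extract_tickers(title, body):
    """Return candidate tickers from the title after exclusion and graylist checks."""
    body_words = set(re.findall(r"[a-z]+", body.lower()))
    accepted = []
    for candidate in TICKER_RE.findall(title):
        if candidate in EXCLUDED:
            continue  # excluded outright, no body check needed
        if candidate in GRAYLIST:
            # Ambiguous: keep only if a non-generic keyword appears in the body.
            keywords = [k for k in GRAYLIST[candidate] if k not in GENERIC_WORDS]
            if not any(k in body_words for k in keywords):
                continue
        if candidate not in accepted:
            accepted.append(candidate)
    return accepted


print(extract_tickers("CAT and GME to the moon WSB", "Caterpillar earnings look strong"))
print(extract_tickers("ALL in on GME", "yolo"))
```

In the first call CAT is kept because "caterpillar" appears in the body; in the second, ALL is dropped because none of its keywords do.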
Contains the top n (default 5) tickers for the specified sub
Contains the merged list from all considered subs
Currently broken. Should contain tickers that dropped in mentions.
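The per-sub top n list and the merged list described above could be produced along these lines. This is a minimal sketch under assumptions: the function names, the second subreddit name, and all counts are made up, and the same code would apply whether the numbers are mentions or scores.

```python
from collections import Counter


def top_n(counts, n=5):
    """Return the n tickers with the highest count (mentions or score)."""
    return [ticker for ticker, _ in Counter(counts).most_common(n)]


def merge_subs(per_sub_counts):
    """Sum the per-sub counts into one Counter covering all considered subs."""
    merged = Counter()
    for counts in per_sub_counts.values():
        merged.update(counts)  # adds counts ticker by ticker
    return merged


# Made-up numbers for two subs.
per_sub = {
    "wallstreetbets": {"GME": 40, "AMC": 25, "TSLA": 10},
    "stocks": {"TSLA": 30, "AAPL": 20, "GME": 5},
}

print(top_n(per_sub["wallstreetbets"], n=2))  # ['GME', 'AMC']
print(top_n(merge_subs(per_sub), n=3))        # ['GME', 'TSLA', 'AMC']
```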