This repository contains a web scraper that collects news headlines from the front page of CNBC. It also includes an analysis of headlines containing the word "coronavirus", including how coronavirus mentions correlate with the VIX. The CNBC news headline dataset contains over 100K headlines collected between 2020 and 2021.
- Python 3.7+
pip3 install -r requirements.txt
- Clone this repository onto your computer
git clone https://github.com/Blauyourmind/cnbc_webspider.git
- Make sure all the above dependencies are installed on your local machine
- Run the cnbc spider from the command line
python3 cnbc_spider.py
- Data will be saved to the cnbc_news.csv file inside the "data" folder, which you can use for further analysis
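
As a starting point for further analysis, the saved CSV could be scanned for headlines that mention a keyword, counting matches per day. This is a minimal sketch using only the standard library, not the analysis code from this repository. The column names `date` and `headline` and the function name `count_daily_mentions` are assumptions, so adjust them to match the actual header row of cnbc_news.csv.

```python
import csv
from collections import Counter


def count_daily_mentions(csv_path, keyword="coronavirus",
                         date_column="date", headline_column="headline"):
    """Count headlines per day whose text contains `keyword` (case-insensitive).

    `date_column` and `headline_column` are assumed names, not confirmed
    against the real file; pass the actual column names if they differ.
    """
    counts = Counter()
    needle = keyword.lower()
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            headline = row.get(headline_column) or ""
            if needle in headline.lower():
                # Keep only the first 10 characters, assuming the date
                # is stored in an ISO-style "YYYY-MM-DD ..." format.
                day = (row.get(date_column) or "")[:10]
                counts[day] += 1
    return counts
```

Calling `count_daily_mentions("data/cnbc_news.csv")` returns a `Counter` keyed by day. Days with no matching headline are simply absent from the result.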
VIX data was collected from Yahoo Finance.
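
One way to relate daily "coronavirus" headline counts to the VIX is a Pearson correlation over the dates present in both series. The sketch below is an illustration under stated assumptions and is not necessarily the method used in this repository's analysis. The function name `pearson_on_shared_dates` is hypothetical, and both inputs are assumed to be plain mappings from a "YYYY-MM-DD" string to a number (a headline count and a VIX closing value, respectively).

```python
from math import sqrt


def pearson_on_shared_dates(mentions_by_day, vix_close_by_day):
    """Pearson correlation computed over dates found in both mappings.

    Both arguments are assumed to map "YYYY-MM-DD" strings to numbers.
    """
    # Align the two series on their common dates only.
    days = sorted(set(mentions_by_day) & set(vix_close_by_day))
    if len(days) < 2:
        raise ValueError("need at least two shared dates")

    xs = [float(mentions_by_day[d]) for d in days]
    ys = [float(vix_close_by_day[d]) for d in days]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")

    return cov / sqrt(var_x * var_y)
```

Dates missing from either mapping are skipped, so trading days with no matching headlines only count as zero if they are filled in explicitly beforehand. Weekends and market holidays have no VIX value and drop out of the comparison automatically.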