CNBC News Headline Web Scraper and COVID-19 Analysis

Work in Progress

Description

This repository contains a web scraper that collects news headlines from the front page of CNBC. It also contains analysis of headlines containing the word "coronavirus", including the correlation between coronavirus headlines and the VIX. The CNBC news headline dataset contains over 100K headlines collected between 2020 and 2021.
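One way a headline-to-VIX correlation could be computed is sketched below. This is not necessarily the method used in this repository: it assumes the signal of interest is the number of headlines per day that mention "coronavirus", that headlines are available as (date, headline) pairs, and that VIX closes are available as a date-to-value mapping. The function names `daily_keyword_counts`, `pearson`, and `keyword_vix_correlation` are hypothetical.

```python
from collections import Counter
from math import sqrt


def daily_keyword_counts(rows, keyword="coronavirus"):
    """Count, per date, the headlines that contain `keyword` (case-insensitive).

    `rows` is an iterable of (date, headline) pairs; this shape is an
    assumption, not the repository's documented format.
    """
    needle = keyword.lower()
    counts = Counter()
    for date, headline in rows:
        if needle in headline.lower():
            counts[date] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def keyword_vix_correlation(rows, vix_close, keyword="coronavirus"):
    """Correlate daily keyword counts with VIX closes on days that have VIX data.

    `vix_close` maps date -> closing value (hypothetical input shape).
    Days with VIX data but no matching headline count as zero.
    """
    counts = daily_keyword_counts(rows, keyword)
    dates = sorted(vix_close)
    xs = [counts.get(d, 0) for d in dates]
    ys = [vix_close[d] for d in dates]
    return pearson(xs, ys)
```

Only dates present in the VIX mapping are used, so headlines published on non-trading days are ignored in this sketch. Whether that is the right choice depends on the analysis.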

Dependencies

  1. Python 3.7+
  2. Python packages listed in requirements.txt:
     pip3 install -r requirements.txt

Getting Started

  1. Clone this repository to your computer:
     git clone https://github.com/Blauyourmind/cnbc_webspider.git
  2. Make sure all the dependencies above are installed on your local machine
  3. Run the CNBC spider from the command line:
     python3 cnbc_spider.py
  4. The data is saved to the cnbc_news.csv file inside the "data" folder, which you can use for further analysis
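As a starting point for further analysis, the saved CSV can be read with Python's standard `csv` module. The column names of cnbc_news.csv are not documented here, so this minimal sketch makes no assumption about them and searches every field of each row. It does assume the file has a header row and is UTF-8 encoded. The helper names `rows_containing` and `load_matching_rows` are hypothetical.

```python
import csv


def rows_containing(lines, keyword):
    """Return CSV rows (as dicts) in which any field contains `keyword`.

    `lines` is any iterable of CSV text lines whose first line is a header,
    such as an open file. Matching is case-insensitive.
    """
    needle = keyword.lower()
    matches = []
    for row in csv.DictReader(lines):
        # Skip non-string values (missing or surplus fields) when matching.
        if any(isinstance(v, str) and needle in v.lower() for v in row.values()):
            matches.append(row)
    return matches


def load_matching_rows(path="data/cnbc_news.csv", keyword="coronavirus",
                       encoding="utf-8"):
    """Open the scraper's output file and return the rows mentioning `keyword`.

    The default path follows the README; the encoding is an assumption.
    """
    with open(path, newline="", encoding=encoding) as f:
        return rows_containing(f, keyword)
```

Calling `load_matching_rows()` from the repository root after a scraper run would return the rows that mention "coronavirus".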

External Data

VIX data collected from Yahoo Finance
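
The file layout of the VIX data is not described here. As a minimal sketch, assuming it is stored as a CSV export with `Date` and `Close` columns (an assumption; the hypothetical `date_col` and `close_col` parameters can be adjusted to the real header), the closing values could be loaded into a date-to-value mapping like this:

```python
import csv


def read_vix_close(lines, date_col="Date", close_col="Close"):
    """Map each date string to its closing value, skipping rows without one.

    `lines` is any iterable of CSV text lines with a header row, such as an
    open file. Column names are assumptions about the export format.
    """
    closes = {}
    for row in csv.DictReader(lines):
        raw = (row.get(close_col) or "").strip()
        try:
            closes[row[date_col]] = float(raw)
        except ValueError:
            continue  # blank or non-numeric cell, e.g. a "null" placeholder
    return closes
```

Dates are kept as the strings found in the file, so they only line up with the headline data if both use the same date format.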