Blauyourmind/cnbc_webspider

CNBC News Headline Web Scraper and COVID-19 Analysis

Work in Progress

Description

This repository contains a web scraper that collects news headlines from the front page of CNBC. It also contains analysis of headlines containing the word "coronavirus", including how those headlines correlate with the VIX. The CNBC news headline dataset contains over 100K headlines collected between 2020 and 2021.

Dependencies

  1. Python 3.7+
  2. The Python packages listed in requirements.txt
pip3 install -r requirements.txt

Getting Started

  1. Clone this repository onto your computer
git clone https://github.com/Blauyourmind/cnbc_webspider.git
  2. Make sure all of the above dependencies are installed on your local machine
  3. Run the CNBC spider from the command line
python3 cnbc_spider.py
  4. Data will be saved to the cnbc_news.csv file inside the "data" folder, which you can use for further analysis
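
As a starting point for further analysis, the sketch below counts how many headlines per day mention a keyword. It is a minimal example and not the repository's own analysis code. The column names "date" and "headline" are assumptions, since the layout of cnbc_news.csv is not documented here, and the sample rows are placeholders invented for illustration.

```python
import csv
import io
from collections import Counter


def count_keyword_by_day(rows, keyword="coronavirus",
                         date_field="date", text_field="headline"):
    """Count rows per day whose text contains the keyword (case-insensitive).

    date_field and text_field are assumed column names; adjust them to match
    the actual header row of cnbc_news.csv.
    """
    counts = Counter()
    needle = keyword.lower()
    for row in rows:
        if needle in row[text_field].lower():
            # Keep only the YYYY-MM-DD prefix in case the field is a timestamp.
            counts[row[date_field][:10]] += 1
    return counts


# Placeholder rows for illustration only; not taken from the real dataset.
sample = io.StringIO(
    "date,headline\n"
    "2020-03-02,Example headline about Coronavirus\n"
    "2020-03-02,Example headline about earnings\n"
    "2020-03-03,Another coronavirus example\n"
)
daily = count_keyword_by_day(csv.DictReader(sample))
print(dict(daily))
```

To run this against the scraped data, open data/cnbc_news.csv with open(..., newline="", encoding="utf-8") and pass csv.DictReader(f) in place of the sample rows.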

External Data

VIX data collected from Yahoo Finance
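
One plausible way to relate headline activity to the VIX is a Pearson correlation between daily keyword counts and daily VIX closing values, computed over the dates both series share. The method used in this repository's analysis is not described here, so treat the following as a sketch under that assumption. The function names and the numbers are hypothetical placeholders, not real counts or real VIX quotes.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def correlate_on_shared_dates(counts_by_day, vix_close_by_day):
    """Align two {date_string: value} mappings on common dates, then correlate."""
    days = sorted(set(counts_by_day) & set(vix_close_by_day))
    return pearson([counts_by_day[d] for d in days],
                   [vix_close_by_day[d] for d in days])


# Placeholder values for illustration only; not real headline counts or VIX data.
counts = {"2020-03-02": 1, "2020-03-03": 2, "2020-03-04": 3, "2020-03-07": 5}
vix = {"2020-03-02": 10.0, "2020-03-03": 20.0, "2020-03-04": 30.0}
r = correlate_on_shared_dates(counts, vix)
print(r)
```

Aligning on shared dates matters because the VIX has no values on weekends and market holidays, while headlines are published every day.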
