This project scrapes quotes and author information from a website, stores the data in JSON files, and then loads that data into a MongoDB database, demonstrating a complete pipeline covering web scraping, data parsing, and database integration.
- Clone the repository.
- Install the required Python packages (Scrapy, plus a MongoDB driver such as pymongo).
- Set up MongoDB and ensure it's running.
Run the Scrapy spider to scrape the quotes and author information:
python main_crawler.py
This will save the scraped quotes to data/quotes.json and the author information to data/authors.json.
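The exact structure of the exported files depends on the spider's item definitions, which are not shown here. A minimal sketch of inspecting such a file with the standard library (the field names below are assumptions, not taken from the project):

```python
import json

# Hypothetical record shape -- the real field names depend on the
# spider's item definitions in this project.
sample_quotes = [
    {"text": "Sample quote.", "author": "Jane Doe", "tags": ["sample"]},
]

# Writing and reading back mirrors how data/quotes.json could be consumed.
with open("quotes_sample.json", "w", encoding="utf-8") as f:
    json.dump(sample_quotes, f, ensure_ascii=False, indent=2)

with open("quotes_sample.json", "r", encoding="utf-8") as f:
    quotes = json.load(f)

print(f"Read {len(quotes)} quote(s); first author: {quotes[0]['author']}")
```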
Run the data loader script to load the scraped data into MongoDB:
python load_data.py
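The internals of load_data.py are not shown here; a minimal sketch of such a loader, assuming the standard pymongo driver and a locally running MongoDB instance (the database and collection names below are hypothetical):

```python
import json


def read_records(path):
    """Read a list of scraped records from a JSON file."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    # The scraper is assumed to export a JSON array of objects.
    return data if isinstance(data, list) else [data]


def load_into_mongo(records, collection_name, db_name="quotes_db",
                    uri="mongodb://localhost:27017"):
    """Insert records into a MongoDB collection (names are hypothetical)."""
    from pymongo import MongoClient  # requires pymongo and a running MongoDB
    client = MongoClient(uri)
    collection = client[db_name][collection_name]
    if records:  # insert_many rejects an empty list
        collection.insert_many(records)
    return len(records)


# Example usage (requires a running MongoDB instance):
#   quotes = read_records("data/quotes.json")
#   authors = read_records("data/authors.json")
#   load_into_mongo(quotes, "quotes")
#   load_into_mongo(authors, "authors")
```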
- main_crawler.py: initializes and runs the Scrapy spider.
- The Scrapy spider: scrapes the quotes and author information.
- load_data.py: reads the scraped data from the JSON files and loads it into MongoDB.
This project showcases building a complete data pipeline, from web scraping through data parsing to database storage in MongoDB.