Asynchronous parser of PEP documents


python Scrapy Regex flake8


Introduction

This project implements a web page parser built with Scrapy. The parser collects information about PEP (Python Enhancement Proposals) documents.


Parsing PEP documents

The parser collects the Number, Name and Status of each PEP document from the website https://peps.python.org/ and saves them to a CSV file in the result directory. File name format: pep(%datetime%).csv
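
One way to get a timestamped CSV like this from Scrapy is its feed export setting. The fragment below is a minimal sketch, not necessarily the configuration this project uses: the field names `number`, `name` and `status` are assumed, and `%(time)s` is Scrapy's own placeholder, which is replaced by a timestamp when the feed is created.

```python
# settings.py fragment (a sketch; field names are assumptions, not taken from the project)
FEEDS = {
    # Scrapy substitutes %(time)s with the timestamp of the crawl,
    # producing a name of the form result/pep(<timestamp>).csv
    "result/pep(%(time)s).csv": {
        "format": "csv",
        "fields": ["number", "name", "status"],  # column order in the CSV
        "overwrite": True,
    },
}
```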

The parser also counts the number of documents with each status and the total number of PEP documents, and saves this summary to a CSV file in the result directory. File name format: status_summary(%datetime%).csv
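
The summary step can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's actual pipeline: the function name `write_status_summary`, the column headers and the timestamp format are all hypothetical, since the README only specifies `%datetime%`.

```python
import csv
from collections import Counter
from datetime import datetime
from pathlib import Path


def write_status_summary(statuses, out_dir="result"):
    """Count PEPs per status and write the counts plus a total to a CSV file."""
    counts = Counter(statuses)
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    # Assumed timestamp format; the README does not define %datetime%.
    stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    file_path = out_path / f"status_summary({stamp}).csv"
    with open(file_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Status", "Count"])
        writer.writerows(sorted(counts.items()))  # one row per status
        writer.writerow(["Total", sum(counts.values())])
    return file_path
```

In a Scrapy project, logic like this would typically live in an item pipeline that collects each item's status as it is processed and writes the file when the spider closes.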


Instructions to start

  1. Clone the repository to the local machine: `git clone git@github.com:Andrey-Kugubaev/scrapy_parser_pep.git`
  2. Create the virtual environment with `python -m venv venv` (or `python3 -m venv venv`), then activate it with `source venv/Scripts/activate` on Windows or `source venv/bin/activate` on Linux and macOS
  3. Install the dependencies: `pip install -r requirements.txt`
  4. Run the parser: `scrapy crawl pep`