Andrey-Kugubaev/scrapy_parser_pep

Asynchronous parser of PEP documents


python Scrapy Regex flake8

Introduction

This project is a web page parser built with Scrapy. It implements a parser that collects information about PEP (Python Enhancement Proposals) documents.


Parsing PEP documents

The parser collects the number, name, and status of each PEP document from https://peps.python.org/ and saves them to a CSV file (directory: result). File name format: pep(%datetime%).csv
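
One way to get a timestamped CSV like this is Scrapy's feed export setting. The fragment below is a minimal sketch of a settings.py entry, not this repository's actual configuration: the directory name, the underscore in the file name, and the field names number, name, and status are assumptions based on the description above.

```python
# settings.py (sketch): feed export configuration for the PEP list.
# The path and the field names are assumptions, not copied from this repository.
FEEDS = {
    # Scrapy replaces %(time)s with a timestamp when the feed file is created.
    "result/pep_%(time)s.csv": {
        "format": "csv",
        # Column order of the CSV; item keys are assumed to use these names.
        "fields": ["number", "name", "status"],
        # Overwrite the file instead of appending if it already exists.
        "overwrite": True,
    },
}
```

With a setting like this, each run of the spider writes a new file, so earlier results are not overwritten.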

The parser also counts the documents in each status and the total number of PEP documents, and saves this summary to a separate CSV file (directory: result). File name format: status_summary(%datetime%).csv
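
The counting step can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's pipeline code: the function names summarize_statuses and write_summary, the column headers, the "Total" row label, and the timestamp format are all hypothetical.

```python
import csv
from collections import Counter
from datetime import datetime
from pathlib import Path


def summarize_statuses(statuses):
    """Return rows: a header, one row per status (sorted), and a total row."""
    counts = Counter(statuses)
    rows = [("Status", "Count")]
    rows.extend(sorted(counts.items()))
    rows.append(("Total", sum(counts.values())))
    return rows


def write_summary(rows, directory, now=None):
    """Write rows to a timestamped CSV in the given directory and return its path."""
    now = now or datetime.now()
    # Assumed timestamp format; the README only specifies %datetime%.
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"status_summary_{stamp}.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
    return path


if __name__ == "__main__":
    # Example statuses; in the parser these would come from the scraped items.
    example = ["Final", "Active", "Final", "Withdrawn"]
    print(write_summary(summarize_statuses(example), "result"))
```

In a Scrapy project, logic like this would typically live in an item pipeline that collects statuses while the spider runs and writes the file when it closes.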


Instructions to start

  1. Clone the repository to your local machine: git clone git@github.com:Andrey-Kugubaev/scrapy_parser_pep.git
  2. Create and activate a virtual environment: python -m venv venv (or python3 -m venv venv), then source venv/Scripts/activate (Windows) or source venv/bin/activate (Linux/macOS)
  3. Install the dependencies: pip install -r requirements.txt
  4. Run the parser: scrapy crawl pep
