Python web crawler

  • This repo stores a personal project with practice code for crawling and processing data in Python.

  • PageRank application steps (a minimal sketch of the algorithm follows this list):

  1. Run crawler.py to retrieve links. Set a page as the starting point of the crawl and the number of pages to crawl; the default starting point is https://www.google.com.
  2. After the crawler finishes, run rank.py to start PageRank. Set the maximum number of iterations; the default is 10.
  3. Reset the database before starting a new run.
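
The internals of rank.py are not shown here; the following is only a minimal sketch of the power-iteration form of PageRank over a crawled link graph, assuming the crawler produces a mapping from each page URL to its outgoing links. Function and variable names are illustrative, not the repo's actual API.

```python
# Illustrative PageRank sketch (not the repo's rank.py).
# Assumes the crawler produced a dict {page_url: [outgoing_link_urls, ...]}.

def pagerank(links, max_iterations=10, damping=0.85, tol=1e-6):
    """Power iteration over the link graph for up to max_iterations."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(max_iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            # keep only links to pages we actually crawled;
            # a dangling page spreads its rank evenly over all pages
            targets = [t for t in outgoing if t in links] or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        # stop early once the ranks stop changing meaningfully
        if max(abs(new_rank[p] - rank[p]) for p in pages) < tol:
            rank = new_rank
            break
        rank = new_rank
    return rank


if __name__ == "__main__":
    toy_graph = {
        "https://www.google.com": ["https://example.com/a", "https://example.com/b"],
        "https://example.com/a": ["https://example.com/b"],
        "https://example.com/b": ["https://www.google.com"],
    }
    for url, score in sorted(pagerank(toy_graph).items(), key=lambda kv: -kv[1]):
        print(f"{score:.4f}  {url}")
```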
