This repo stores personal projects. It contains practice exercises in crawling and processing data with Python.
PageRank application steps:
- Run crawler.py to retrieve links. Set the page to start crawling from and the number of pages to crawl; the default start point is https://www.google.com.
- After the crawler finishes, run rank.py to compute PageRank. Set the maximum number of iterations; the default is 10.
- Reset the database if you want to start a new crawl and ranking run.
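
The ranking step can be illustrated with a minimal sketch of iterative PageRank. This is not the code in rank.py: the function name `pagerank`, the `links` dictionary, and the `damping` value of 0.85 are assumptions made for illustration, and the sketch works on an in-memory dictionary rather than the database. Only the default of 10 iterations comes from the steps above.

```python
def pagerank(links, max_iterations=10, damping=0.85):
    """Return a dict of page -> rank.

    links: dict mapping each page to the list of pages it links to
    (a hypothetical in-memory stand-in for the crawled data).
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    if n == 0:
        return {}

    # Start with rank spread evenly across all pages.
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(max_iterations):
        # Rank held by pages with no outgoing links is shared by everyone.
        dangling = sum(ranks[p] for p in pages if not links.get(p))
        new_ranks = {
            p: (1.0 - damping) / n + damping * dangling / n for p in pages
        }
        # Each page passes an equal share of its rank to every page it links to.
        for page, targets in links.items():
            if not targets:
                continue
            share = damping * ranks[page] / len(targets)
            for target in targets:
                new_ranks[target] += share
        ranks = new_ranks

    return ranks


if __name__ == "__main__":
    example = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for page, rank in sorted(pagerank(example).items()):
        print(page, round(rank, 4))
```

In this sketch the ranks always sum to 1, so raising `max_iterations` only refines the values; it does not change their scale.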