URLs:
The script does three things:
- Produces daman_2015.csv and daman_2016.csv, which contain metadata about the PDFs. Each CSV has the following fields: year, language, poll_station_no, file_name
- Downloads all the PDFs to a directory called daman_201x/
- Renames the PDFs:
  - English-language rolls have the prefix eng and Gujarati-language rolls have the prefix guj.
  - The polling station number is a 3-digit number, so a sample name is eng_001.pdf
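The naming scheme and CSV layout described above can be sketched roughly as follows. This is only an illustration, not the code in daman_archives.py: the helper names (`build_file_name`, `metadata_row`, `write_metadata`), the `PREFIXES` mapping, and the values stored in the language column are all assumptions made for the example.

```python
import csv
import io

# Hypothetical mapping from language name to the file-name prefix described above.
PREFIXES = {"english": "eng", "gujarati": "guj"}

# Field order taken from the CSV description above.
FIELDS = ["year", "language", "poll_station_no", "file_name"]


def build_file_name(language, poll_station_no):
    """Return a name such as eng_001.pdf (prefix plus zero-padded station number)."""
    prefix = PREFIXES[language.lower()]
    return f"{prefix}_{int(poll_station_no):03d}.pdf"


def metadata_row(year, language, poll_station_no):
    """Build one CSV row as a dict keyed by FIELDS."""
    return {
        "year": year,
        "language": language,
        "poll_station_no": poll_station_no,
        "file_name": build_file_name(language, poll_station_no),
    }


def write_metadata(rows, handle):
    """Write a header line followed by the given rows to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # Write one sample row to an in-memory buffer instead of a real file.
    buffer = io.StringIO()
    write_metadata([metadata_row(2015, "english", 1)], buffer)
    print(buffer.getvalue())
```

Zero-padding with the `03d` format keeps the file names sortable by polling station number, which is presumably why the 3-digit convention is used.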
To run the script:
pip install -r requirements.txt
python daman_archives.py
There are no electoral rolls in Gujarati for 2016.