lucasguillermo/hathitrustPDF

 
 


Hathi Trust Digital Library - Complete PDF Download

Download an entire book (or publication) as a PDF from Hathi Trust Digital Library without the "partner login" requirement.

Motivation

Hathi Trust Digital Library is a good place to find old publications digitized from various university libraries. However, it restricts full-PDF downloads to partner universities, most of which are American. This code aims to democratize knowledge by making it possible to download complete public-domain works as PDFs from the Hathi Trust website.

Requirements

How to use it

Copy the Hathi Trust book URL and paste it into the "link" variable on this line of the code:

...
link = "https://babel.hathitrust.org/cgi/pt?id=mdp.39015023320164"
r  = requests.get(link)
...

Note: Keep the same URL pattern as shown above, ending with the book ID (the numbers at the end)!
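As a rough illustration of what "the same pattern" means, the sketch below checks that a link carries a single `id` query parameter, as the example URL does. The helper name `extract_book_id` is hypothetical and not part of this repository; it only uses the Python standard library.

```python
from urllib.parse import urlparse, parse_qs


def extract_book_id(link: str) -> str:
    """Return the `id` query parameter of a Hathi Trust page URL.

    Hypothetical helper for illustration; the script itself simply
    passes `link` to requests.get().
    """
    query = urlparse(link).query
    ids = parse_qs(query).get("id", [])
    if len(ids) != 1 or not ids[0]:
        raise ValueError(f"expected exactly one 'id' parameter in: {link}")
    return ids[0]


link = "https://babel.hathitrust.org/cgi/pt?id=mdp.39015023320164"
print(extract_book_id(link))  # prints mdp.39015023320164
```

A link copied without the `?id=...` part would raise `ValueError` here, which is an easy way to catch a wrong URL before any download starts.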

After that, all pages are downloaded as PDF files and merged into a single file named BOOKNAME_output.pdf in the corresponding folder. The individual page files are not deleted when the process finishes!
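Because the individual page files stay on disk, they can be listed again later, for example to re-run the merge. The sketch below shows one way to collect them in numeric page order and to build the output file name. The `page_N.pdf` naming scheme and both function names are assumptions made for this example; check the names the script actually writes before relying on it.

```python
import re
from pathlib import Path


def output_name(book_name: str) -> str:
    """Build the merged file name in the BOOKNAME_output.pdf form."""
    return f"{book_name}_output.pdf"


def ordered_page_files(folder) -> list:
    """Return page PDFs in `folder` sorted by page number.

    Assumes (hypothetically) that pages are saved as page_1.pdf,
    page_2.pdf, ... A plain alphabetical sort would put page_10.pdf
    before page_2.pdf, so the number is parsed and used as the key.
    """
    pattern = re.compile(r"^page_(\d+)\.pdf$")
    numbered = []
    for path in Path(folder).glob("*.pdf"):
        match = pattern.match(path.name)
        if match:  # skips the merged *_output.pdf file
            numbered.append((int(match.group(1)), path))
    return [path for _, path in sorted(numbered, key=lambda item: item[0])]
```

The ordered list can then be handed to whichever PDF library performs the merge.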

Slice pages

The code also lets you download only a range of pages. To do so, edit the start and end pages on these lines of the code:

...
# Download pdf file
begin_page=1
last_page=pages_book+1

for actual_page in range(begin_page, last_page):
...
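Note that Python's `range` excludes its end value, which is why the default above is `pages_book+1`. A minimal sketch of picking an inclusive page range follows; the helper `page_numbers` and the value 250 are hypothetical, chosen only for illustration.

```python
def page_numbers(pages_book: int, first: int = 1, last=None) -> range:
    """Return the pages from `first` to `last`, both inclusive.

    Hypothetical helper: `last` defaults to the final page of the book.
    """
    if last is None:
        last = pages_book
    if not 1 <= first <= last <= pages_book:
        raise ValueError(f"invalid page range {first}-{last} for {pages_book} pages")
    # range() stops before its end argument, hence the +1
    return range(first, last + 1)


pages_book = 250  # assumed total page count for this example
for actual_page in page_numbers(pages_book, first=10, last=12):
    print(actual_page)  # prints 10, 11, 12 on separate lines
```

In the script itself, the equivalent edit is setting `begin_page=10` and `last_page=13` to download pages 10 through 12.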

Screenshot

[Screenshot image: captura-hait]
