Skip to content

marioseixas/scientific-paper-pdf-rename

 
 

Repository files navigation

Scientific Papers PDF Rename

A simple command-line tool to rename ALL your scientific papers.

renamer

Now you can organize and rename ALL those scientific papers you have downloaded once, but you didn’t have the time to organize and rename one-by-one in your directories.

Introduction

Scientific Paper PDF Rename (sci_rename for short) is a command line tool that helps you to rename all your scientific paper according to their titles. The proposed tool scans the PDF file of your papers in order to find its title and rename the pdf file of the paper accordingly.

Approach

sci_rename looks for the title at two different places. First it tries to find the title using the pdf metadata. When pdf files are carefuly created the title of the paper can be included into the pdf metadata. Unfortunately, this practice not as frequent as we expect. To overcome that we use the following assumption. The title of a scientific paper is usually formatted with the largest font of the text. Therefore, sci_rename also scans the first page of the paper and try to identify a bounded size sentence with the largest font size. If it finds a potential title in both cases (i.e., metadata and larger font approach), it asks the user to choose one between them.

Dependencies

We recommend using Python 3.11.6 and PyMuPDF 1.22.5.

However, we have tested the following versions.

Python version

PyMuPDF required

3.11.6

1.22.5

3.8.18

1.22.5

3.7.17

1.22.5

3.7.0

1.18.14

|:boom: DANGER |: Using unverified versions may lead to a malfunctioning installation !!

If your version is higher than you can use pyenv to install multiple versions of Python in your machine.

If your current version doesn’t align with these requirements, you can utilize pyenv to install multiple Python versions on your machine and execute it within a virtualized environment. Install the dependency as recommended in the following.

Installation

  1. Clone the project in your machine

    $ git clone https://github.com/heversonbr/scientific-paper-pdf-rename.git
  2. Enter in the created directory

    $ cd scientific-paper-pdf-rename
  3. Set a python virtual environment to avoid mixing the required packages with your global installed packages.

    $ python3 -m venv .venv
  4. Activate your virtual environment

    $ source .venv/bin/activate
  5. Check the installed packages and update you pip , if required.

    $ pip list
    $ python3 -m pip install --upgrade pip
  6. Install the dependencies.

    $ python3 -m pip install -r requirements.txt

Usage

You can run the script to rename a simple file or a target directory as follows.

all files in directory
$ python3 sci_rename.py <target_directory>

or

single file
$ python3 sci_rename.py <pdf_filename.pdf>

The output will be in a directory called auto_renamed. Use the examples directory to test the renaming tool if you wish.

Feedback

All feedbacks are welcome. It was only tested on Mac OSx and Linux. I did this simple tool to try avoiding the hell of looking for a specific paper I have downloaded before and got lost among many others papers downloaded. I hope it will help.

Releases

No releases published

Packages

No packages published

Languages

  • Python 94.2%
  • Shell 5.8%