📄 PDFusion

A lightweight Python utility for effortlessly merging multiple PDF files into a single document.

MIT License · Python 3.11 · Contributions Welcome · GitHub · LinkedIn · Code style: black · Ruff


📝 Description

PDFusion is a command-line tool and Python library that combines multiple PDF files into a single document while preserving the original quality. It is well suited to combining reports, consolidating documentation, or organizing digital paperwork.

🚀 Key Features

  • 📁 Merge all PDFs in a directory with a single command
  • 🔄 Automatic alphabetical ordering of files
  • ⏱️ Timestamp-based output naming option
  • 🛠️ Both CLI and Python API support
  • 💡 Clear progress feedback and error handling
  • 🔒 Maintains original PDF quality
  • 📝 Detailed logging of the merge process
  • 🔍 Type hints with full mypy support
  • 🧪 Comprehensive test coverage (>90%)
  • 📊 Performance benchmarks included
  • 🐛 Custom exception handling
  • 🎯 Supports Python 3.11+
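
The alphabetical ordering and timestamp-based naming listed above can be illustrated with a minimal standard-library sketch. This is not PDFusion's actual implementation: the helper names `list_pdfs_sorted` and `timestamped_name`, the case-insensitive sort, and the `merged_YYYYmmdd_HHMMSS.pdf` pattern are all assumptions made for illustration.

```python
from datetime import datetime
from pathlib import Path
from typing import List, Optional, Union


def list_pdfs_sorted(input_dir: Union[str, Path]) -> List[Path]:
    """Return the PDF files directly inside input_dir, sorted by name.

    Hypothetical helper: the sort ignores case, which may differ from
    what PDFusion does.
    """
    directory = Path(input_dir)
    pdfs = [
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    ]
    return sorted(pdfs, key=lambda p: p.name.lower())


def timestamped_name(now: Optional[datetime] = None, prefix: str = "merged") -> str:
    """Build an output filename from a timestamp (assumed naming pattern)."""
    moment = now or datetime.now()
    return f"{prefix}_{moment:%Y%m%d_%H%M%S}.pdf"
```

Sorting on the lowercased name keeps `A.pdf` ahead of `b.pdf`; a plain `sorted()` on paths would place all uppercase names first.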

📂 Repository Structure

graph TD
    A[pdfusion/] --> B[pdfusion/]
    A --> C[tests/]
    A --> D[examples/]
    A --> E[Documentation]
    
    B --> B1[__init__.py]
    B --> B2[exceptions.py]
    B --> B3[logging.py]
    B --> B4[pdfusion.py]
    B --> B5[py.typed]
    
    C --> C1[__init__.py]
    C --> C2[conftest.py]
    C --> C3[test files]
    
    D --> D1[basic_usage.py]
    
    E --> E1[README.md]
    E --> E2[LICENSE]
    E --> E3[CONTRIBUTING.md]
    E --> E4[Configuration Files]

💻 Installation

For Users 🌟

pip install pdfusion

For Developers 🔧

graph LR
    A[Clone Repository] --> B[Create Virtual Environment]
    B --> C[Activate Environment]
    C --> D[Install Dependencies]
    D --> E[Ready to Develop!]

  1. Clone the repository:

    git clone https://github.com/BjornMelin/pdfusion.git
    cd pdfusion
  2. Create a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: .\venv\Scripts\activate

    Note: You can also use virtualenv instead of venv. See the Virtual Environment Setup Guide for more details.

  3. Install development dependencies:

    pip install -r requirements-dev.txt

🎮 Usage

Quick Start Guide

  1. Install PDFusion

    pip install pdfusion
  2. Prepare Your PDFs

    • Create a directory with your PDF files

    • Example structure:

      my_pdfs/
      ├── document1.pdf
      ├── document2.pdf
      └── document3.pdf
      
  3. Run PDFusion from the command line or through the Python API, as described in the sections below

Command Line Interface

graph LR
    A[Input Directory] --> B[PDFusion CLI]
    B --> C[Processing]
    C --> D[Merged PDF]
    style B fill:#f9f,stroke:#333,stroke-width:4px

# Basic usage
pdfusion /path/to/pdfs -o merged.pdf

# With verbose output
pdfusion /path/to/pdfs -v

# Omit -o to get an automatic, timestamp-based output filename
pdfusion /path/to/pdfs

CLI Options

  • -o, --output: Output filename (optional; when omitted, a timestamp-based name is used)
  • -v, --verbose: Enable verbose output
  • --version: Show version number
  • -h, --help: Show help message
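
For readers who want to see how options like these fit together, here is a minimal `argparse` sketch that mirrors the documented flags. It is an illustration only and not PDFusion's real CLI code: the function name `build_parser`, the positional argument name `input_dir`, the help strings, and the placeholder version string are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="pdfusion",
        description="Merge all PDF files in a directory into one document.",
    )
    # Positional argument: the directory that holds the PDFs to merge.
    parser.add_argument("input_dir", help="Directory containing the PDFs to merge")
    parser.add_argument("-o", "--output", default=None,
                        help="Output filename (optional)")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable verbose output")
    # Placeholder version string, not the real package version.
    parser.add_argument("--version", action="version", version="%(prog)s 0.0.0")
    # -h/--help is added automatically by argparse.
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["/path/to/pdfs", "-o", "merged.pdf"])
```

With this parser, `args.output` is `None` when `-o` is omitted, which is the point where a caller could fall back to a timestamp-based name.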

Python API

from pdfusion import merge_pdfs

# Example 1: Basic usage
result = merge_pdfs(
    input_dir="/path/to/pdfs",
    output_file="merged.pdf"
)
print(f"Merged {result.files_merged} files into {result.output_path}")

# Example 2: With verbose output and auto timestamp
result = merge_pdfs(
    input_dir="/path/to/pdfs",
    verbose=True
)
print(f"Total pages in merged PDF: {result.total_pages}")

# Example 3: Full options
result = merge_pdfs(
    input_dir="/path/to/pdfs",
    output_file="merged.pdf",
    verbose=True,
    sort_files=True,  # Sort files alphabetically
    add_bookmarks=True  # Add bookmarks for each merged PDF
)

Example Project Structure

Create a simple script merge_my_pdfs.py:

from pdfusion import merge_pdfs
import logging

# Set up logging (optional)
logging.basicConfig(level=logging.INFO)

# Merge PDFs
try:
    result = merge_pdfs(
        input_dir="./my_pdfs",
        output_file="merged_document.pdf",
        verbose=True
    )
    print(f"Successfully merged {result.files_merged} files!")
    print(f"Output saved to: {result.output_path}")
    print(f"Total pages: {result.total_pages}")

except Exception as e:
    print(f"Error merging PDFs: {e}")

Run your script:

python merge_my_pdfs.py

Output Format

The merge_pdfs function returns a result object with the following attributes:

  • files_merged: Number of files merged
  • output_path: Path to the merged PDF
  • total_pages: Total number of pages in the merged PDF
  • processing_time: Time taken to merge the PDFs
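
The shape of such a result object can be sketched as a dataclass. The class name `MergeResult`, the field types, the assumption that `processing_time` is measured in seconds, and the `summarize` helper are all hypothetical; only the four attribute names come from the list above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class MergeResult:  # hypothetical name; the real class may differ
    files_merged: int       # number of files merged
    output_path: Path       # path to the merged PDF
    total_pages: int        # total pages in the merged PDF
    processing_time: float  # assumed to be seconds


def summarize(result: MergeResult) -> str:
    """Format a one-line summary from the documented attributes."""
    return (
        f"Merged {result.files_merged} files ({result.total_pages} pages) "
        f"into {result.output_path} in {result.processing_time:.2f}s"
    )
```

Because `summarize` only reads the four documented attributes, it should also work with any object that exposes them, provided `processing_time` is numeric.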

🛠️ Development

Running Tests

# Run all tests
pytest

# Run with coverage report
pytest --cov=pdfusion

# Run performance benchmarks
pytest tests/test_pdfusion.py -v -m benchmark

# Run specific test file
pytest tests/test_pdfusion.py -v

🤝 Contributing

graph LR
    A[Fork Repository] --> B[Create Feature Branch]
    B --> C[Make Changes]
    C --> D[Commit Changes]
    D --> E[Push to Branch]
    E --> F[Open Pull Request]
    style F fill:#f96,stroke:#333,stroke-width:4px

  1. Fork the repository
  2. Create your feature branch (git checkout -b feat/version/AmazingFeature)
  3. Commit your changes (git commit -m 'type(scope): Add some AmazingFeature')
  4. Push to the branch (git push origin feat/version/AmazingFeature)
  5. Open a Pull Request titled in the same format as the commit message (feat(scope): Add some AmazingFeature)

👨‍💻 Author

Bjorn Melin

AWS Certified Solutions Architect · AWS Certified Developer · AWS Certified AI Practitioner · AWS Certified Cloud Practitioner

AWS-certified Solutions Architect and Developer with expertise in cloud architecture and modern development practices.

Project Link: https://github.com/BjornMelin/pdfusion

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

⚡ Built with Python 3.11 + PyPDF2 by Bjorn Melin