Scrubs text watermarks and metadata from encrypted, compressed PDF files.
- Decrypts the PDF if it's encrypted
- Uncompresses the PDF
- Removes metadata (Xpacket)
- Naively tries to remove text-based watermarks by matching objects whose number of occurrences equals the PDF page count. If multiple objects match, a separate PDF is produced for each.
- Optionally compresses the PDF again, unless --no-compress is given as a command line argument.
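
The watermark heuristic described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: it assumes the PDF has already been decrypted and uncompressed, and that each page has been reduced to a list of text objects (plain strings here). The function names find_watermark_candidates and scrub_variants are hypothetical.

```python
from collections import Counter


def find_watermark_candidates(pages):
    """Return objects whose total occurrence count equals the page count.

    `pages` is a list with one entry per page; each entry is a list of
    text objects (strings) found on that page. This representation is an
    assumption made for the sketch.
    """
    page_count = len(pages)
    counts = Counter(obj for page in pages for obj in page)
    # Naive on purpose: only the total count is compared, so an object that
    # appears twice on one page and is missing from another still matches.
    return [obj for obj, n in counts.items() if n == page_count]


def scrub_variants(pages):
    """Build one scrubbed copy of the pages per candidate object."""
    variants = {}
    for candidate in find_watermark_candidates(pages):
        variants[candidate] = [
            [obj for obj in page if obj != candidate] for page in pages
        ]
    return variants


pages = [
    ["Chapter 1", "Licensed to Jane Doe"],
    ["Some body text", "Licensed to Jane Doe"],
    ["Licensed to Jane Doe", "The end"],
]
candidates = find_watermark_candidates(pages)
variants = scrub_variants(pages)
```

Because the match is based on counts alone, a legitimate object that happens to occur once per page (a running header, for example) is also a candidate, which is why one output per match is produced and the choice is left to the user.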
$ pdf_scrub --help
Usage: pdf_scrub [OPTIONS] FILES...

Arguments:
  FILES...  [required]

Options:
  --compress / --no-compress  Compress the final pdf to reduce file size greatly  [default: compress]
Requires qpdf and pdftk.
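
As a rough idea of how the two external tools could fit into the steps listed earlier, the sketch below builds the command lines without running them. Which tool handles which step is an assumption here, not something this README states: qpdf is used for decryption (its --decrypt option) and pdftk for the uncompress and compress operations. The function name build_commands and the intermediate file names are hypothetical.

```python
from pathlib import Path


def build_commands(src, work_dir, compress=True):
    """Return (before, after): argv lists to run before and after scrubbing.

    The mapping of steps to tools and all intermediate file names are
    assumptions made for this sketch.
    """
    src = Path(src)
    work = Path(work_dir)
    decrypted = work / f"{src.stem}.decrypted.pdf"
    uncompressed = work / f"{src.stem}.uncompressed.pdf"
    scrubbed = work / f"{src.stem}.scrubbed.pdf"  # written by the scrub step
    final = work / f"{src.stem}.final.pdf"

    before = [
        # Remove encryption so the page streams can be read.
        ["qpdf", "--decrypt", str(src), str(decrypted)],
        # Expand compressed streams so text objects can be matched.
        ["pdftk", str(decrypted), "output", str(uncompressed), "uncompress"],
    ]
    after = []
    if compress:
        # Re-compress the scrubbed file to bring the size back down.
        after.append(["pdftk", str(scrubbed), "output", str(final), "compress"])
    return before, after


before, after = build_commands("book.pdf", "tmp")
```

Each argv list could then be passed to subprocess.run(cmd, check=True), so that a missing qpdf or pdftk binary fails early with a clear error.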
For help getting started with development, see DEVELOPMENT.md.