The goal with PLAYA is just to get objects out of PDF, with no dependencies or further analysis. So, on top of PLAYA there is PAVÉS: "PDF, Analyse et Visualisation ... plus Élaborées", I guess?
Anything that deviates from the core mission of "getting objects out
of PDF" goes here, so, hopefully, more interesting analysis and
extraction that may be useful for all of you AI Bros doing
"Partitioning" and "Retrieval-Augmented Generation" and suchlike
things. But specifically, visualization stuff inspired by the "visual
debugging" features of pdfplumber but not specifically tied to its
data structures and algorithms.
There will be dependencies. Oh, there will be dependencies.
pip install paves
PAVÉS is distributed under the terms of the MIT license.