(Feature Request) PyMuPDF for pdf parsing #262
Labels: dependencies, enhancement, help wanted, python
We currently use PDFMiner to extract text from PDFs. Faster libraries exist, such as PyMuPDF, and we could consider switching. Faster extraction would also shorten report generation in the web app.
Reference:
https://github.com/py-pdf/benchmarks#pdf-library-benchmarks
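To make the comparison concrete, here is a rough sketch of how the two libraries could be timed against each other on one of our own files. It assumes `pdfminer.six` and `PyMuPDF` are both installed; the function names (`extract_with_pymupdf`, `extract_with_pdfminer`, `time_extractor`) are made up for illustration and are not part of the current codebase.

```python
import time
from typing import Callable, Tuple


def extract_with_pymupdf(pdf_path: str) -> str:
    """Extract plain text from every page using PyMuPDF."""
    # Imported lazily so this module still loads if PyMuPDF is missing.
    import fitz  # PyMuPDF

    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)


def extract_with_pdfminer(pdf_path: str) -> str:
    """Extract plain text using pdfminer.six's high-level helper."""
    from pdfminer.high_level import extract_text

    return extract_text(pdf_path)


def time_extractor(extractor: Callable[[str], str], pdf_path: str) -> Tuple[str, float]:
    """Run one extractor and return (text, elapsed seconds)."""
    start = time.perf_counter()
    text = extractor(pdf_path)
    return text, time.perf_counter() - start
```

Calling `time_extractor(extract_with_pymupdf, "sample.pdf")` and `time_extractor(extract_with_pdfminer, "sample.pdf")` on the same file (here a hypothetical `sample.pdf`) would give a quick speed check. The extracted text should also be compared, since the two libraries can differ in whitespace and reading order.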