I am trying to open PDF files, query data from them, and then use that data to rename each PDF file.
On Windows, this code fails at the rename step because the file is locked.
On Linux, the code works.
I cannot tell whether this error comes from pdfquery itself or from another module that pdfquery uses.
import os
import pdfquery

def is_pdf(file):
    if os.path.splitext(file.lower())[1] == '.pdf':
        return True

pdf_files = os.listdir('./pages')
for pdf_file in filter(is_pdf, pdf_files):
    print(pdf_file)
    pdf = pdfquery.PDFQuery(os.path.join('pages', pdf_file))
    pdf.load()
    for e in pdf.tree.iter():
        text = e.text
        if text:
            text = text.replace(' ', '')
            if text[0:7] == '4002629':
                #del pdf
                os.rename(os.path.join('pages', pdf_file),
                          '{}.pdf'.format(text))
                break
Error on Windows:
Traceback (most recent call last):
  File "C:\Users\Administrator\Desktop\PDFs_aufbereiten\pdf_pages_rename.py", line 22, in <module>
    os.rename(os.path.join('pages', pdf_file), '{}.pdf'.format(text))
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'pages\\xxxxxxxxxxxxxxxxxxxx.pdf' -> 'xxxxxxxxxxxxx.pdf'
The same code works on Linux.
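Workaround (thanks to nedbat): open and close the file in your own code and pass the open file object to pdfquery.PDFQuery instead of a path, then rename the files only after they have been closed: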
import os
import pdfquery

def is_pdf(file):
    if os.path.splitext(file.lower())[1] == '.pdf':
        return True

# Collect (old name, new name) pairs first; rename after all files are closed.
rename_files = []
pdf_files = os.listdir('./pages')
for pdf_file in filter(is_pdf, pdf_files):
    print(pdf_file)
    # Open the file ourselves so that it is closed when the with block ends.
    with open(os.path.join('pages', pdf_file), 'rb') as myfile:
        pdf = pdfquery.PDFQuery(myfile)
        pdf.load()
        for e in pdf.tree.iter():
            text = e.text
            if text:
                text = text.replace(' ', '')
                if text[0:7] == '4002629':
                    rename_files.append(
                        (pdf_file, '{}.pdf'.format(text))
                    )
                    break

for oldname, newname in rename_files:
    os.rename(os.path.join('pages', oldname),
              os.path.join('pages', newname))
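
The pattern behind the workaround can be sketched without pdfquery: hold the file handle in your own `with` block, work out the new name while the file is open, and call `os.rename` only after the block has exited. This is a minimal sketch, not pdfquery's API. The helpers `find_new_name` and `rename_after_close` and the `extract_texts` callable are hypothetical names introduced here for illustration; with pdfquery, `extract_texts` would presumably build `pdfquery.PDFQuery(handle)`, call `load()`, and yield `e.text` for each element of `pdf.tree.iter()`, as in the code above.

```python
import os


def find_new_name(texts, prefix='4002629'):
    """Return '<text>.pdf' for the first text starting with prefix, else None."""
    for text in texts:
        if not text:
            continue
        text = text.replace(' ', '')
        if text.startswith(prefix):
            return '{}.pdf'.format(text)
    return None


def rename_after_close(path, extract_texts, prefix='4002629'):
    """Rename path based on its content, but only after our handle is closed.

    extract_texts is a hypothetical callable that takes the open binary
    file object and returns an iterable of text snippets.
    """
    with open(path, 'rb') as handle:
        # All reading happens while the file is open.
        new_name = find_new_name(extract_texts(handle), prefix)
    # The with block has exited, so the handle we opened is closed here.
    if new_name is None:
        return None
    new_path = os.path.join(os.path.dirname(path), new_name)
    os.rename(path, new_path)
    return new_path
```

Unlike the workaround above, this sketch renames each file right after its own handle is closed rather than collecting all renames for the end. Either ordering should avoid the lock, as long as no handle opened on the file is still alive when `os.rename` runs.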