4826 Replicate RECAP PDF uploads to subdockets #4857
This PR adds support for replicating PDF uploads to subdockets, following a similar approach to attachment pages. A step was added in `find_subdocket_pdf_rds`, before `process_recap_pdf`, to look for subdockets where documents should be merged.

The process flow is:
- Look up RECAPDocuments with the same `pacer_doc_id` in the same court. This query was moved to a helper method since it's common to both `find_subdocket_pdf_rds` and `find_subdocket_att_page_rds`.
- `ProcessingQueue` entries are created for each additional RECAPDocument where the PDF needs replication.
- If the upload has no `pacer_case_id` (optional in PDF uploads), one is assigned during the first iteration so the PQ can succeed when processed by `process_recap_pdf`. Otherwise, the lookup will fail with `RECAPDocument.MultipleObjectsReturned`.
- The new objects are created within `transaction.atomic` to roll back any objects if errors occur. This change was also applied to `find_subdocket_att_page_rds`.
- [...] `process_recap_attachment`
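
The flow above can be sketched in plain Python, without Django. This is only an illustration of the idea, not the code in this PR: the `RD` and `PQ` dataclasses and the functions `find_same_doc_rds` and `plan_pdf_replication` are hypothetical stand-ins, and the real tasks work on ORM objects inside a transaction.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class RD:
    """Hypothetical stand-in for a RECAPDocument row."""
    pk: int
    court_id: str
    pacer_case_id: str
    pacer_doc_id: str


@dataclass(frozen=True)
class PQ:
    """Hypothetical stand-in for a ProcessingQueue entry for a PDF upload."""
    court_id: str
    pacer_doc_id: str
    pacer_case_id: Optional[str] = None


def find_same_doc_rds(rds: List[RD], court_id: str, pacer_doc_id: str) -> List[RD]:
    """Shared lookup: documents with the same pacer_doc_id in the same court."""
    return [
        rd for rd in rds
        if rd.court_id == court_id and rd.pacer_doc_id == pacer_doc_id
    ]


def plan_pdf_replication(pq: PQ, rds: List[RD]) -> Tuple[PQ, List[PQ]]:
    """Return the (possibly updated) original PQ plus one new PQ per subdocket."""
    main_pq = pq
    extra_pqs: List[PQ] = []
    for rd in find_same_doc_rds(rds, pq.court_id, pq.pacer_doc_id):
        if main_pq.pacer_case_id is None:
            # First iteration: pin the original upload to one case so a later
            # lookup by (pacer_doc_id, pacer_case_id) matches a single document.
            main_pq = replace(main_pq, pacer_case_id=rd.pacer_case_id)
            continue
        if rd.pacer_case_id == main_pq.pacer_case_id:
            # This document is already covered by the original upload.
            continue
        # Every other matching document gets its own queue entry.
        extra_pqs.append(PQ(pq.court_id, pq.pacer_doc_id, rd.pacer_case_id))
    return main_pq, extra_pqs
```

Under these assumptions, an upload without a `pacer_case_id` that matches three documents yields the original PQ pinned to the first case and two additional PQs for the other two cases.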

When working on this I noticed that within `process_recap_pdf` there is the fallback query:

courtlistener/cl/recap/tasks.py
Line 264 in cb012e8

Is it correct that this query only uses `pacer_doc_id`? I'm not sure whether `pacer_doc_id`s are unique across all courts in PACER. If they're not, I think it would be safer to change it to: `rd = await RECAPDocument.objects.aget(pacer_doc_id=pq.pacer_doc_id, court_id=pq.court_id)`

Let me know what you think.
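
To make the concern concrete, here is a small plain-Python sketch of a `get`-style lookup that fails when more than one row matches. It does not use Django: `get_one`, the exception classes, and the rows are hypothetical, and the collision between two courts is an assumption made for illustration, since whether it can happen in PACER is the open question.

```python
class DoesNotExist(Exception):
    """Raised when no row matches the filters."""


class MultipleObjectsReturned(Exception):
    """Raised when more than one row matches the filters."""


def get_one(rows, **filters):
    """Return the single row matching all filters, mimicking a get() lookup."""
    matches = [
        row for row in rows
        if all(row.get(key) == value for key, value in filters.items())
    ]
    if not matches:
        raise DoesNotExist(filters)
    if len(matches) > 1:
        raise MultipleObjectsReturned(filters)
    return matches[0]


# Made-up rows: two courts that happen to share a pacer_doc_id (assumption).
ROWS = [
    {"pk": 1, "court_id": "canb", "pacer_doc_id": "04505578698"},
    {"pk": 2, "court_id": "nysd", "pacer_doc_id": "04505578698"},
]

# Filtering on the document ID and the court narrows the match to one row.
rd = get_one(ROWS, pacer_doc_id="04505578698", court_id="nysd")
```

With these rows, a lookup on `pacer_doc_id` alone raises `MultipleObjectsReturned`, while adding the court filter returns a single row. Subdockets in the same court would still share the ID, which is why the flow above also relies on `pacer_case_id`.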