fix(extract_from_text): now returns a plain citation string #1298

Merged
merged 1 commit into from
Jan 14, 2025

Conversation

grossir
Contributor

@grossir grossir commented Jan 11, 2025

Solves #1297

We will use eyecite on CourtListener's site to parse and validate the citation string.

@grossir
Contributor Author

grossir commented Jan 11, 2025

I spent some time thinking about whether the "Citation" value should be a dict or a string:

{
    "Citation": {"citation": "2023 VT 3"}
}

or

{
    "Citation": "2023 VT 3"
}

I couldn't settle on an answer; maybe you have an opinion?
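
To make the two options concrete, here is a minimal sketch of a consumer that accepts either shape and reduces it to the plain string. The helper name `citation_as_string` is hypothetical and is not part of juriscraper or CourtListener; only the `"Citation"` key and the two payloads come from the comment above.

```python
from typing import Optional


def citation_as_string(metadata: dict) -> Optional[str]:
    """Return the citation text from either payload shape, or None.

    Hypothetical helper, for illustration only.
    """
    value = metadata.get("Citation")
    if isinstance(value, str):
        # Flat shape: {"Citation": "2023 VT 3"}
        return value
    if isinstance(value, dict):
        # Nested shape: {"Citation": {"citation": "2023 VT 3"}}
        return value.get("citation")
    return None


nested = {"Citation": {"citation": "2023 VT 3"}}
flat = {"Citation": "2023 VT 3"}
print(citation_as_string(nested))
print(citation_as_string(flat))
```

With the flat shape the consumer needs no unwrapping step, which matches the direction named in the PR title.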

@grossir grossir requested a review from flooie January 11, 2025 01:17
@flooie
Contributor

flooie commented Jan 13, 2025

Is there a corresponding PR in CL to review?

@grossir
Contributor Author

grossir commented Jan 13, 2025

Just up here
freelawproject/courtlistener#4913

Comment on lines 116 to +122
      """
      metadata = {}
-     regex = r"(?P<volume>20\d{2})\s(?P<reporter>ND)\s(?P<page>\d+)"
+     regex = r"20\d{2}\sND\s\d+"
      citation_match = re.search(regex, scraped_text[:1000])

      if citation_match:
          # type 8 is a neutral citation in Courtlistener
-         metadata["Citation"] = {**citation_match.groupdict(), "type": 8}
+         metadata["Citation"] = citation_match.group(0)
Contributor

Looks like ND doesn't need this anymore.

Contributor Author

I just saw that ND has changed; it will require further changes to clean the case name.
Do you mind if I merge this and do the ND changes in a different branch?
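
For reference, the behavioral change in the hunk above can be sketched with the two regexes taken from the diff. The function names `extract_old` and `extract_new` and the sample text are made up for illustration; the real method lives in the scraper and may differ in its details.

```python
import re

# Both patterns are copied from the review hunk.
OLD_REGEX = r"(?P<volume>20\d{2})\s(?P<reporter>ND)\s(?P<page>\d+)"
NEW_REGEX = r"20\d{2}\sND\s\d+"


def extract_old(scraped_text: str) -> dict:
    """Previous behavior: a dict of named groups plus a citation type."""
    metadata = {}
    match = re.search(OLD_REGEX, scraped_text[:1000])
    if match:
        metadata["Citation"] = {**match.groupdict(), "type": 8}
    return metadata


def extract_new(scraped_text: str) -> dict:
    """New behavior: the matched text as a plain citation string."""
    metadata = {}
    match = re.search(NEW_REGEX, scraped_text[:1000])
    if match:
        metadata["Citation"] = match.group(0)
    return metadata


# Invented sample text, not taken from a real opinion.
sample = "IN THE SUPREME COURT\n2024 ND 15\nState v. Example"
print(extract_old(sample))
print(extract_new(sample))
```

The new version leaves the splitting into volume, reporter and page, and the choice of citation type, to whatever parses the string downstream.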

@grossir grossir merged commit c742dce into main Jan 14, 2025
12 checks passed
@grossir grossir deleted the extract_from_text_citations branch January 14, 2025 16:38
Projects
Status: Done
2 participants