Refactor to use pymarc instead of custom MARC parser #7969

Open
tfmorris opened this issue Jun 14, 2023 · 1 comment
Labels
Lead: @hornc Issues overseen by Charles (Staff: Data Engineering Lead) [managed] Priority: 3 Issues that we can consider at our leisure. [managed] Theme: MARC records Type: Feature Request Issue describes a feature or enhancement we'd like to implement. [managed]

Comments

@tfmorris
Contributor

It might make sense for OpenLibrary to stop maintaining a custom MARC parser when a well-supported, robust open source MARC parser is available in pymarc. I suggested switching in 2018 and again (twice) in 2020, but perhaps creating a separate issue will drive some discussion and a decision.

Proposal & Constraints

Replace marc_base.py, marc_binary.py, mnemonics.py, and marc_xml.py with pymarc. Review other modules in openlibrary.catalog.marc for other code which can be eliminated.

Additional context

While it would have been better to do this 5 years ago and avoid all the maintenance effort in the intervening years, it's probably still a net win (and, arguably, "the right thing to do" for the ecosystem).

Stakeholders

@hornc @mekarpeles @cclauss

@tfmorris tfmorris added Needs: Lead Needs: Triage This issue needs triage. The team needs to decide who should own it, what to do, by when. [managed] Type: Feature Request Issue describes a feature or enhancement we'd like to implement. [managed] labels Jun 14, 2023
@cdrini cdrini added Priority: 3 Issues that we can consider at our leisure. [managed] Lead: @hornc Issues overseen by Charles (Staff: Data Engineering Lead) [managed] and removed Needs: Triage This issue needs triage. The team needs to decide who should own it, what to do, by when. [managed] Needs: Lead labels Jun 19, 2023
@cdrini
Collaborator

cdrini commented Jun 19, 2023

@hornc thoughts?
