Load Marygrove MARC records into Open Library #2683

Closed
3 tasks done
hornc opened this issue Dec 2, 2019 · 5 comments
Labels
Module: Import (issues related to the configuration or use of importbot and other bulk import systems); Priority: 1 (do this week, receiving emails, time sensitive); Theme: MARC records; Type: Feature Request (issue describes a feature or enhancement we'd like to implement)

Comments


hornc commented Dec 2, 2019

This collection of books from Marygrove College will soon be scanned. To enable metadata lookup during scanning, and eventually to make those books borrowable on OL, we need to create the corresponding data records in advance.

Task List:

hornc added the Type: Feature Request, Priority: 1, Theme: MARC records, and Module: Import labels on Dec 2, 2019
hornc added this to the 2019-Q4 milestone on Dec 2, 2019
hornc self-assigned this on Dec 2, 2019

hornc commented Dec 2, 2019

MARC binary data in https://archive.org/details/marc_marygrove


hornc commented Dec 2, 2019

local_id details: https://openlibrary.org/local_ids/mgc

hornc added the State: Work In Progress label on Dec 2, 2019

hornc commented Dec 5, 2019

I have confirmed that the Marygrove MARC records are MARC-8 encoded. (yaz-marcdump originally fake-converted the results, which made me think they were UTF-8.) This relates to #713, which I want to add test cases for. This import is blocked until I can confirm that we handle MARC-8 encoded records properly.
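
For anyone checking a similar batch, here is a minimal sketch of one way to see which encoding each record declares, without relying on a conversion tool. It reads the binary MARC file record by record and inspects leader position 09, where MARC 21 uses `a` for UCS/Unicode and a blank for MARC-8. The function names are made up for illustration and are not part of the Open Library code base. Note that the flag only reports what a record claims; it does not prove the bytes actually match.

```python
from collections import Counter
from typing import BinaryIO, Iterator


def iter_raw_records(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each raw record from a binary MARC (ISO 2709) stream."""
    while True:
        head = stream.read(5)
        if len(head) < 5:
            return  # end of file
        length = int(head)  # leader/00-04: total record length as ASCII digits
        rest = stream.read(length - 5)
        if len(rest) < length - 5:
            return  # truncated final record, stop quietly
        yield head + rest


def declared_encoding(record: bytes) -> str:
    """Map leader/09 to the encoding the record claims to use."""
    flag = record[9:10]
    if flag == b"a":
        return "utf-8"
    if flag == b" ":
        return "marc-8"
    return "unknown"


def count_declared_encodings(stream: BinaryIO) -> Counter:
    """Tally the declared encodings across a whole MARC file."""
    return Counter(declared_encoding(r) for r in iter_raw_records(stream))
```

To use it, open the `.mrc` file in binary mode and pass the file object to `count_declared_encodings`. A file whose records all report `marc-8` would need a MARC-8 aware decoder before import.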


hornc commented Dec 9, 2019

First test edit:
https://openlibrary.org/books/OL5910281M/Collected_works.?_compare=Compare&b=4&a=3&m=diff

from MARC record at marc_marygrove/metacoll.ERR.new.D20191108.T213022.internetarchive2nd.1.mrc:0:1883

The import did not write a local_id because the source MARC record did not have an 876 field.
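
The record reference in this comment can be split into its parts with a few lines of Python. This is only a sketch: it assumes the two trailing numbers are the byte offset and byte length of the record within the file, and that the first path segment is the archive.org item name. `MarcLocation`, `parse_marc_location`, and `read_record` are hypothetical names used for illustration.

```python
from typing import NamedTuple


class MarcLocation(NamedTuple):
    item: str      # assumed: archive.org item identifier
    filename: str  # MARC file inside that item
    offset: int    # assumed: byte offset of the record in the file
    length: int    # assumed: byte length of the record


def parse_marc_location(ref: str) -> MarcLocation:
    """Split 'item/file.mrc:offset:length' into its components."""
    path, offset, length = ref.rsplit(":", 2)
    item, _, filename = path.partition("/")
    return MarcLocation(item, filename, int(offset), int(length))


def read_record(local_path: str, offset: int, length: int) -> bytes:
    """Read one raw record from a locally downloaded copy of the file."""
    with open(local_path, "rb") as fh:
        fh.seek(offset)
        return fh.read(length)
```

With a local copy of the `.mrc` file, `read_record(path, loc.offset, loc.length)` would return the raw bytes of that single record for inspection.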


hornc commented Dec 16, 2019

To enable this, I had to regenerate the entire MARC collection, adding a custom 976 field to the bibliographic MARC records and converting them to UTF-8 in the process.

I also had to modify the local_id record and the source data:
marygrovecollegelibrary.full.D20191108.T213022.internetarchive2nd_REPACK.mrc

Now all ~70k records' local_ids have been imported.

Example lookup by local_id: https://openlibrary.org/api/books.json?bibkeys=local_id:urn:mgc:31927000333986
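
A small sketch of building the same lookup URL for any identifier, using only the standard library. The `local_id:urn:mgc:` key format is taken from the example URL above; the helper name and its parameters are hypothetical.

```python
from urllib.parse import urlencode

BOOKS_API = "https://openlibrary.org/api/books.json"


def local_id_lookup_url(identifier: str, prefix: str = "mgc") -> str:
    """Build a Books API URL that looks up a record by its local_id."""
    bibkey = f"local_id:urn:{prefix}:{identifier}"
    # Keep ':' unescaped so the URL matches the form used in this issue.
    query = urlencode({"bibkeys": bibkey}, safe=":")
    return f"{BOOKS_API}?{query}"
```

Calling `local_id_lookup_url("31927000333986")` reproduces the example URL, which can then be fetched with any HTTP client.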

hornc closed this as completed on Dec 16, 2019
hornc removed the State: Work In Progress label on Dec 16, 2019