This repository has been archived by the owner on Feb 4, 2020. It is now read-only.

API for working with XML data isn't very intuitive #73

Open
danmichaelo opened this issue Jul 27, 2015 · 5 comments

Comments

@danmichaelo
Contributor

Given some XML data,

data = open('test.xml', 'rb')

I expected from the README example to be able to do something like

from pymarc import MARCReader
for record in MARCReader(data):
    ...

but instead I had to do

import pymarc
for record in pymarc.parse_xml_to_array(data):
    ...

Determining the file type should be quite easy from reading the first characters of the file stream: XML if it starts with "<?xml", JSON if it starts with "{", plain MARC otherwise.
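
The detection idea above could be sketched roughly as follows. This is only an illustration using the standard library, not anything pymarc provides: the function name `sniff_marc_format` and the `peek_size` parameter are hypothetical, and it assumes a seekable binary stream. It also loosens the rule slightly (any leading "<" counts as XML because the XML declaration is optional, and a leading "[" counts as JSON), which is an assumption beyond what the comment proposes.

```python
import io


def sniff_marc_format(stream, peek_size=64):
    """Guess 'xml', 'json' or 'marc' from the first bytes of a seekable binary stream.

    Hypothetical helper for illustration; not part of pymarc.
    """
    start = stream.tell()
    head = stream.read(peek_size)
    stream.seek(start)  # rewind so the real parser still sees the whole stream

    # Ignore a UTF-8 byte order mark and leading whitespace before inspecting.
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    head = head.lstrip()

    if head.startswith(b"<"):
        # Assumption: accept "<" rather than only "<?xml", since the
        # XML declaration may be omitted.
        return "xml"
    if head.startswith((b"{", b"[")):
        # Assumption: a JSON file may hold a single object or an array.
        return "json"
    # Binary MARC starts with a numeric record length, so it falls through here.
    return "marc"


if __name__ == "__main__":
    print(sniff_marc_format(io.BytesIO(b'<?xml version="1.0"?><collection/>')))
```

A reader class could call something like this once and dispatch to the matching parser. Streams that cannot seek (pipes, sockets) would need buffering instead of the rewind shown here.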

Next I wanted to try to serialize a record to XML. The Record object has methods like as_marc(), as_marc21() and hm, even as_json(), but no as_xml()! Instead:

pymarc.record_to_xml(record)

@edsu
Owner

edsu commented Jul 27, 2015

Yeah, that's a fair criticism. Still, you figured it out -- so maybe it's not so bad? Or maybe most people give up before you? I guess I secretly loathe XML, and like keeping it in a corner.

@danmichaelo
Contributor Author

We all do ;) But then there's library systems…

Anyways, I won't be offended if you close this as "wontfix", but consider adding a short xml example to the README first. Might help save others some time.

@edsu
Owner

edsu commented Jul 27, 2015

I'll leave this open until one of those things happens. Thanks @danmichaelo !

@rlskoeser

Seconding that reading xml is not intuitive. Had to look at the test code and this issue before I got it to work.

@edsu
Owner

edsu commented Apr 30, 2019

Just out of curiosity would people be ok with:

for record in MARCReader(open('batch.xml', 'rb')):
    # do something useful with a Record

creating an in-memory array for all the records, and then allowing the iteration to start? I think it would be preferable if it did function as an iterator, but I'm not quite sure how that would combine with the SAX parsing that's going on. I like that MARCReader is an actual iterator, and allows you to process large files. I think this is another subconscious reason why I partitioned the XML functionality off to the side.
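
On the iterator question, one possible direction is an incremental parser in place of the SAX handler. The sketch below is not pymarc's implementation. It swaps in the standard library's `xml.etree.ElementTree.iterparse` and yields raw `<record>` elements one at a time. The name `iter_record_elements` is hypothetical, and turning each element into a pymarc `Record` is left out.

```python
import io
import xml.etree.ElementTree as ET

# The MARCXML ("MARC21 slim") namespace.
MARCXML_NS = "http://www.loc.gov/MARC21/slim"


def iter_record_elements(stream):
    """Lazily yield each <record> element from a MARCXML stream.

    Hypothetical helper for illustration; it yields ElementTree elements,
    not pymarc Record objects.
    """
    record_tag = "{%s}record" % MARCXML_NS
    for _event, elem in ET.iterparse(stream, events=("end",)):
        # An "end" event fires once the element and its children are parsed.
        # Assumption: also accept records written without the namespace.
        if elem.tag in (record_tag, "record"):
            yield elem
            # Runs when the caller asks for the next record: drop this
            # record's children so memory use stays roughly flat.
            elem.clear()


SAMPLE = b"""<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record><controlfield tag="001">a1</controlfield></record>
  <record><controlfield tag="001">a2</controlfield></record>
</collection>"""

if __name__ == "__main__":
    for rec in iter_record_elements(io.BytesIO(SAMPLE)):
        field = rec.find("{%s}controlfield" % MARCXML_NS)
        print(field.get("tag"), field.text)
```

Each element has to be consumed before the loop advances, because it is cleared on the next iteration. That fits the one-record-at-a-time use that MARCReader already encourages, at the cost of not being able to hold on to earlier elements.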


3 participants