-
Notifications
You must be signed in to change notification settings - Fork 769
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Embedded Metadata: Add basic LD-JSON support and some missing meta tags #2765
base: master
Are you sure you want to change the base?
Conversation
@@ -688,7 +724,7 @@ function addLowQualityMetadata(doc, newItem) { | |||
} | |||
} | |||
// fall back to "keywords" | |||
if (!newItem.tags.length) { | |||
if (newItem.tags.length == 0) { // this needs to be tested; previously was !...length, which evaluated to true when zero |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this might need to be tested; previously was !newItem.tags.length
, which evaluated to true when zero
appeared to work fine on https://www.bloomberg.com/news/articles/2019-09-12/peloton-founder-goes-from-kickstarter-to-a-450-million-fortune
newItem.tags = [newItem.tags]; | ||
} | ||
if (newItem.tags && newItem.tags.length && Zotero.parentTranslator) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what Zotero.parentTranslator
was checking for here—I'd think this should run whenever EM is finding a tag string to parse, not only when it's the parent? Unless the point is to only run when it has a child translator so the child translator can choose what to do with the sometimes garbage tags?
@@ -913,11 +954,14 @@ function finalDataCleanup(doc, newItem) { | |||
newItem.tags = tags; | |||
} | |||
} | |||
if (newItem.tags.length > 0) newItem.tags = scrubLowercaseTags(newItem.tags); | |||
|
|||
/* what else is automatically adding tags? if it's bad data, don't import it, but resetting all tags (per below) kills good LD-JSON tags | |||
else { | |||
// Unless called from another translator, don't include automatic tags, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What else is automatically adding tags? Resetting all tags here—the default—kills (now) good LD-JSON tags. Better to just avoid anything that adds garbage tags if we want to be vigilant, rather than adding and immediately wiping it out?
Thanks for working on this. We could consider merging this for now, but this isn't really how we want to support JSON-LD. Zotero already supports RDF import — we ideally would just use that. And we actually wrote a JSON-LD translator already in #1897 that does so, and which uses more accurate JSON-LD parsing, but it hasn't yet been merged. There, it gets called from a new Generic translator, which evaluates the data from various sources, including the EM translator. See this comment for discussion of some of the problems with JSON-LD, including the fact that there can be multiple JSON-LD blocks on a page. This will need to be reviewed in the context of #1092. Since those changes will require a change to core code to be rolled out — zotero/zotero#1656 will need to be moved to zotero/utilities — we could consider adding this in the meantime, but we need to confirm that it wouldn't result in bad data. |
In particular, I'd want to hear @mrtcode's thoughts on this. |
… add tests for deprecated single-site translators
meta
tags to remove need for more one-off translators