Transfer our bibliography data to zbmath, then replace our bibliography with a link to swmath #343
Comments
Overall, it's still like this: our bib data only goes up to 2021. We list 3377 papers in total which cite GAP. In contrast, https://zbmath.org/software/320 has 3782 citations, which of course is more. That said, if I ask for a list only up to 2021, it contains just 3171 documents, so they are "missing" about 206. Well, at least that many -- it is quite possible that they also have papers that we don't have, and thus that there are more than 206 publications we list and they don't...

It would thus still be interesting to write a script, as outlined in the issue description, which downloads the zbMATH list of papers and tries to find papers in our list that they don't have... I didn't do that but just manually looked at the data.

Unfortunately it seems a lot of the "missing" papers are due to publishers having really bad metadata for many older papers. As in, the list of references on the website of a paper may contain many obvious errors caused by bad OCR or whatnot. zbMATH does not want to fix these reference lists manually, saying that the proper way is to contact the publishers. So I attempted that for several papers, but even getting a reply was rare (less than half the cases or so), and getting an actual change even rarer. This just doesn't scale.

However, zbMATH has offered to at least add the "GAP" keyword/tag on request. They have already done it in one case in the past upon my request. I have now sent them another email with two dozen items; let's see if they also process those. If yes, then this might be a path forward.

Interestingly, after line 54243
Counting the bib items there I find 210. This matches up quite closely with the 206 missing ones, but I think it's more of a coincidence -- several of the papers below that mark are actually listed in zbMATH (and MathSciNet) and have dates <= 2021. Still, there are a few items in there which will never be on zbMATH or MathSciNet, e.g.
But I don't know how much this accounts for the "missing" papers.
Some of the "missing" items are also "Diplomarbeiten" or bachelor/master theses, which will probably never be listed in zbMATH. It might be a good idea to move those into a separate "database" (= bib file).
We have mostly removed our bib, but I am leaving this open because I still hope to migrate some of the data to zbMATH, and I am still waiting for them to reply to my email from earlier this week.
It took a couple of weeks, but they replied and added all the corrections I sent them (missing DOIs and missing "GAP" tags on papers). Hence https://zbmath.org/?q=si%3A320+py%3A1980-2021 now lists 3194 documents (up from 3171). So that's still about 183 "missing" papers. But at least it is now plausible for someone to go through, identify the "missing" papers and then submit corrections.
The link: https://zbmath.org/software/320
This list provides basically everything we have at https://www.gap-system.org/Doc/Bib/bib.html and even has additional nice features. And unlike MathSciNet, it is free to use.
While it has more publications overall than we do, it does miss some. In some cases a paper might not be indexed by them at all, but so far every case I found where a paper is in our list but not in theirs was a matter of missing metadata on their part, i.e., the "tag" "sw:gap" is missing on some papers for whatever reason.
I have contacted them and in principle I can send them lists of papers that are missing this tag and they'll add it (presumably after some validation, of course).
That leaves the problem as to how we get that list. Of course we can manually check things but there are thousands. So better to automate it. Here is how one could do that:
Script for getting zbmath data
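A minimal sketch of the comparison step only, not of the download itself. It assumes both our bib entries and the zbMATH entries have already been loaded as lists of dicts with a `title` key and an optional `doi` key; those field names, and the helper names `normalize_title`, `normalize_doi` and `find_missing`, are hypothetical and chosen for illustration. Matching is by DOI first, then by a normalized title, so it will not catch titles mangled by bad OCR.

```python
import re
import unicodedata


def normalize_title(title):
    """Lowercase, strip accents and punctuation so near-identical titles compare equal."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def normalize_doi(doi):
    """Lowercase a DOI and drop a leading resolver URL or 'doi:' prefix."""
    doi = doi.strip().lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi)


def find_missing(ours, theirs):
    """Return the records in `ours` that match nothing in `theirs`.

    Both arguments are lists of dicts with a 'title' key and an optional
    'doi' key (assumed layout, not the real export format of either side).
    """
    their_dois = {normalize_doi(r["doi"]) for r in theirs if r.get("doi")}
    their_titles = {normalize_title(r["title"]) for r in theirs}

    missing = []
    for rec in ours:
        doi = rec.get("doi")
        if doi and normalize_doi(doi) in their_dois:
            continue  # matched by DOI
        if normalize_title(rec["title"]) in their_titles:
            continue  # matched by title
        missing.append(rec)
    return missing
```

The output of `find_missing(ours, theirs)` would then be the candidate list to check by hand before sending it to zbMATH, since a title-only mismatch may just mean the two sides spell the title differently.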