Implement automated storing to db/backend #70
Branch issue-70 created!
The final layer missing here is the automatic update of the backend. One solution would be to hash all papers and let the backend return a list of all hashes, which the crawler can compare against without sending any further requests. The crawler can then decide what to update or write, resulting in only a few requests per update.
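A minimal sketch of that hash-comparison idea, assuming a hypothetical `/papers/hashes` endpoint that returns all known hashes and a bulk upload endpoint (the endpoint names, backend URL, and hashing scheme are illustrative, not the actual API):

```python
import hashlib
import json

import requests

BACKEND_URL = "https://backend.example.org"  # placeholder, not the real deployment


def paper_hash(paper: dict) -> str:
    """Hash the canonical JSON form of a paper so identical records yield identical hashes."""
    canonical = json.dumps(paper, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def sync_papers(papers: list[dict]) -> None:
    # One request to fetch every hash the backend already knows about.
    known = set(requests.get(f"{BACKEND_URL}/papers/hashes").json())

    # Only upload papers whose hash is new or has changed since the last sync.
    to_upload = [p for p in papers if paper_hash(p) not in known]
    if to_upload:
        requests.post(f"{BACKEND_URL}/papers/bulk", json=to_upload).raise_for_status()
```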
So far we have used the code I created on the branch https://github.com/gipplab/cs-insights-crawler/tree/data-upload-full in the file upload/d3_full.py. There might be some helpful things in there that could help with this issue.
@muhammadtalha242 Is the new data ingestion through SemanticScholar ready yet?
Is your feature request related to a problem? Please describe.
We need to store author, venue, and publication data into our backend automatically when the next d3 version is released.
Describe the solution you'd like
Implement a backend class that:
Additional context
https://www.mongodb.com/docs/database-tools/mongoimport/#std-label-ex-mongoimport-merge
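The linked mongoimport docs describe its `--mode=merge` option, which merges incoming fields into existing documents instead of rejecting duplicates. A rough sketch of how the crawler could invoke this from Python follows; the database name, file layout, and match field are assumptions, not the actual setup:

```python
import subprocess


def import_collection(json_file: str, collection: str, db: str = "cs-insights") -> None:
    """Merge new or updated documents into an existing collection via mongoimport.

    Assumes mongoimport is installed and ``json_file`` contains a JSON array of
    documents. The database name and the match field below are placeholders.
    """
    subprocess.run(
        [
            "mongoimport",
            "--db", db,
            "--collection", collection,
            "--file", json_file,
            "--jsonArray",
            "--mode", "merge",           # merge fields into existing documents
            "--upsertFields", "dblpId",  # hypothetical unique key; replace with the real one
        ],
        check=True,
    )
```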