richcontext.scholapi

Rich Context API integrations for federating discovery services and metadata exchange across multiple scholarly infrastructure providers.

Development of the Rich Context knowledge graph uses this library to:

  • identify dataset links to research publications
  • locate open access publications
  • reconcile journal references
  • reconcile author profiles
  • reconcile keyword taxonomy

This library has been guided by collaborative work on community building and metadata exchange to improve Scholarly Infrastructure, held at the 2019 Rich Context Workshop.

Installation

Prerequisites:

To install from PyPI:

pip install richcontext.scholapi

If you install directly from this Git repo, be sure to install the dependencies as well:

pip install -r requirements.txt

Then copy the configuration file template rc_template.cfg to rc.cfg and populate it with your credentials.

NB: be careful not to commit the rc.cfg file in Git since by definition it will contain sensitive data, e.g., your passwords.

Parameters used in the configuration file include:

parameter            value
chrome_exe_path      path/to/chrome.exe
core_apikey          CORE API key
dimensions_password  Dimensions API password
elsevier_api_key     Elsevier API key
email                personal email address
orcid_secret         ORCID API key
repec_token          RePEc API token
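
Before running anything, it can help to confirm that no credential was left blank. The following is a minimal sketch, not part of this library: it assumes that rc.cfg is INI-formatted and keeps its settings in a `[DEFAULT]` section (check rc_template.cfg to confirm), and `missing_parameters` is a hypothetical helper name. The sample values are placeholders.

```python
import configparser

# Parameter names taken from the table above.
EXPECTED_KEYS = [
    "chrome_exe_path",
    "core_apikey",
    "dimensions_password",
    "elsevier_api_key",
    "email",
    "orcid_secret",
    "repec_token",
]


def missing_parameters(cfg_text, section="DEFAULT"):
    """Return the expected parameters that are absent or blank.

    Assumption: the config is INI text with all settings in one
    section; the default section name here is a guess.
    """
    # Interpolation is disabled so that a "%" in a password is read as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)

    if section == parser.default_section:
        values = parser.defaults()
    elif parser.has_section(section):
        values = dict(parser.items(section))
    else:
        values = {}

    return [key for key in EXPECTED_KEYS if not values.get(key, "").strip()]


# Placeholder values only; these are not real credentials.
sample = """
[DEFAULT]
email = someone@example.com
core_apikey = abc123
repec_token =
"""

print(missing_parameters(sample))
```

To check a real file, pass its contents instead of `sample`, for example `pathlib.Path("rc.cfg").read_text()`.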

Download the ChromeDriver webdriver for the Chrome browser to enable use of Selenium. This will be run in "headless" mode.

For a good (though slightly dated) tutorial for installing and testing Selenium on Ubuntu Linux, see: https://christopher.su/2015/selenium-chromedriver-ubuntu/

Usage

from richcontext import scholapi as rc_scholapi

# initialize the federated API access
schol = rc_scholapi.ScholInfraAPI(config_file="rc.cfg", logger=None)
source = schol.openaire

# search parameters for example publications
title = "Deal or no deal? The prevalence and nutritional quality of price promotions among U.S. food and beverage purchases."

# run it...
if source.has_credentials():
    response = source.title_search(title)

    # report results
    if response.message:
        # error case
        print(response.message)
    else:
        print(response.meta)
        source.report_perf(response.timing)
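
The same pattern extends to trying several sources in turn. The sketch below relies only on the calls shown above (`has_credentials()`, `title_search()`, and the `message` and `meta` fields of the response). `first_title_match` is a hypothetical helper, and `FakeSource` is a stand-in used so that the example runs without credentials; neither is part of the library.

```python
from types import SimpleNamespace


def first_title_match(sources, title):
    """Return (source, response) for the first source that has credentials
    and answers without an error message, else (None, None)."""
    for source in sources:
        if not source.has_credentials():
            continue
        response = source.title_search(title)
        if not response.message:
            return source, response
    return None, None


class FakeSource:
    """Hypothetical stand-in that mimics the two methods used above."""

    def __init__(self, name, credentials, message, meta):
        self.name = name
        self._credentials = credentials
        self._message = message
        self._meta = meta

    def has_credentials(self):
        return self._credentials

    def title_search(self, title):
        return SimpleNamespace(message=self._message, meta=self._meta)


sources = [
    FakeSource("no_credentials", False, None, {"title": "unused"}),
    FakeSource("error_case", True, "request timed out", None),
    FakeSource("success", True, None, {"title": "Deal or no deal?"}),
]

source, response = first_title_match(sources, "Deal or no deal?")
print(source.name, response.meta)
```

With real API objects, pass attributes of the `schol` object (such as `schol.openaire`) in place of the stand-ins.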

Testing

First, be sure that you're testing the source code in this repo and not an installed copy of the library.
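
One way to check which copy Python would import is sketched below, using only the standard library. The helper names `module_origin` and `is_from_checkout` are hypothetical, and the check assumes you run it from the root of your checkout.

```python
import importlib.util
from pathlib import Path


def module_origin(name):
    """Return the file a module would be imported from, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised, for example, when a parent package is not importable.
        return None
    if spec is None or not spec.has_location or spec.origin is None:
        # Built-in modules and namespace packages have no file location.
        return None
    return Path(spec.origin).resolve()


def is_from_checkout(name, checkout_dir="."):
    """True if the module's file lives somewhere under checkout_dir."""
    origin = module_origin(name)
    if origin is None:
        return False
    return Path(checkout_dir).resolve() in origin.parents


print(module_origin("richcontext.scholapi"))
print(is_from_checkout("richcontext.scholapi"))
```

If the second line prints `False` while the first shows a path under `site-packages`, the tests would exercise the installed library rather than your working copy.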

Then run the unit tests for the APIs for which you have credentials, collecting coverage data as they run:

coverage run -m unittest discover

Then create GitHub issues among the submodules for any failed tests.

You can also print the coverage report and upload the results via:

coverage report
bash <(curl -s https://codecov.io/bash) -t @.cc_token

Test coverage reports can be viewed at https://codecov.io/gh/Coleridge-Initiative/RCApi

API Integrations

APIs used to retrieve metadata:

See the coding examples in the test.py unit test for usage patterns per supported API.

Troubleshooting

  • ChromeDriver

If you encounter an exception about the ChromeDriver version, for example:

selenium.common.exceptions.SessionNotCreatedException: Message: session not created:
  This version of ChromeDriver only supports Chrome version 78

In that case, check your instance of the Chrome browser to find its release number, then go to https://chromedriver.chromium.org/downloads to download the matching version of ChromeDriver.
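
The mismatch can also be spotted before Selenium starts by comparing major release numbers. This is a minimal sketch under stated assumptions: the two strings below are illustrative examples of the kind of text that the browser and the driver print for their versions, and `major_version` and `versions_match` are hypothetical helpers rather than part of Selenium or this library.

```python
import re


def major_version(version_text):
    """Extract the leading (major) number from a dotted version string."""
    match = re.search(r"(\d+)(?:\.\d+)+", version_text)
    if match is None:
        raise ValueError(f"no version number found in: {version_text!r}")
    return int(match.group(1))


def versions_match(chrome_text, driver_text):
    """True when the browser and the driver share a major release number."""
    return major_version(chrome_text) == major_version(driver_text)


# Illustrative strings; substitute the text reported on your system.
chrome = "Google Chrome 79.0.3945.88"
driver = "ChromeDriver 78.0.3904.70"

print(versions_match(chrome, driver))  # False: 79 vs. 78
```

On many systems the version text can be obtained by running the browser and the driver with a `--version` flag, although the exact command depends on the platform.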

Literature

For more background about open access publications see:

Piwowar H, Priem J, Larivière V, Alperin JP, Matthias L, Norlander B, Farley A, West J, Haustein S. 2017.
The State of OA: A large-scale analysis of the prevalence and impact of Open Access articles
PeerJ Preprints 5:e3119v1
https://doi.org/10.7287/peerj.preprints.3119v1

Contributions

If you'd like to contribute, please see our listings of good first issues.

For info about joining the AI team working on Rich Context, see https://github.com/Coleridge-Initiative/RCGraph/blob/master/SKILLS.md

Kudos

Contributors: @ceteri, @IanMulvany, @srand525, @ernestogimeno, @lobodemonte, plus many thanks for the inspiring 2019 Rich Context Workshop notes by @metasj, and guidance from @claytonrsh, @Juliaingridlane.