[EN] Collect Armenia related data from Europeana #4

Open
ivbeg opened this issue May 26, 2023 · 0 comments

Labels
  • extraction: tasks that require data extraction (scraping) skills
  • parsing: tasks that require data parsing
  • topic-culture: tasks dedicated to Armenian culture, language and history

Goal

The goal is to create a dataset about Armenian cultural heritage held in European museums and available via the Europeana project (https://www.europeana.eu). Europeana provides a public API that can be used to build such datasets.

Tasks

Create a list of Armenia-related words and topics in English, then use the Europeana API to extract the matching records from Europeana and download the images associated with each record. An example of such a search: https://www.europeana.eu/en/search?view=grid&query=Armenia&page=1

Behind this search is an API call: https://api.europeana.eu/record/search.json?wskey=nLbaXYaiH&view=grid&query=Armenia&page=2&qf=contentTier%3A%281%20OR%202%20OR%203%20OR%204%29&profile=minimal&rows=24&start=25

Europeana provides comprehensive API documentation: https://pro.europeana.eu/page/apis
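As a rough illustration of how the call above could be paged through, here is a minimal Python sketch. The endpoint and the `wskey`, `query`, `profile`, `rows` and `start` parameters are taken from the example URL; the function names, the `max_records` cap and the assumption that each response carries its records in an `items` list are illustrative and should be checked against the API documentation. The 1-based `start` offset is inferred from the example (page 2, `rows=24`, `start=25`).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint as it appears in the example call in this issue.
SEARCH_URL = "https://api.europeana.eu/record/search.json"


def build_search_url(query, api_key, start=1, rows=24):
    """Build one search URL; `start` is treated as a 1-based record offset."""
    params = {
        "wskey": api_key,      # your own API key
        "query": query,
        "profile": "minimal",
        "rows": rows,
        "start": start,
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


def fetch_json(url):
    """Fetch a URL and decode the JSON body (plain urllib, no retries)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def iter_items(query, api_key, fetch=fetch_json, rows=24, max_records=1000):
    """Yield records page by page until a page comes back empty.

    Assumes the response keeps its records under an "items" key.
    """
    start = 1
    while start <= max_records:
        page = fetch(build_search_url(query, api_key, start, rows))
        items = page.get("items") or []
        if not items:
            break
        yield from items
        start += len(items)
```

Passing the fetch function as a parameter keeps the paging logic testable without network access. A real run would look like `for record in iter_items("Armenia", "YOUR_API_KEY"): ...`.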

Please save the collected data as JSON or JSON Lines files and deduplicate the records, since different search requests can return the same entries.
Store the extracted image files in temporary storage and provide a link so that the Open Data Armenia team can move them to permanent storage.
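The saving step could be sketched as follows. This is only an illustration under stated assumptions: every record is taken to be a dict with a unique `id` field and, optionally, an `edmPreview` list of image URLs. Both field names should be verified against real API responses. The helper names and the `.jpg` extension are hypothetical.

```python
import json
from pathlib import Path
from urllib.request import urlretrieve


def deduplicate(records, key="id"):
    """Yield each record once, keeping the first occurrence of every key value.

    Records without the key are passed through unchanged.
    """
    seen = set()
    for record in records:
        record_id = record.get(key)
        if record_id is not None:
            if record_id in seen:
                continue
            seen.add(record_id)
        yield record


def write_jsonl(records, path):
    """Write one JSON object per line (JSON Lines); return the number written."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


def download_previews(record, out_dir):
    """Download the images listed under the (assumed) "edmPreview" field."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    safe_id = record["id"].strip("/").replace("/", "_")
    saved = []
    for index, url in enumerate(record.get("edmPreview") or []):
        target = out_dir / f"{safe_id}_{index}.jpg"  # extension is a guess
        urlretrieve(url, str(target))
        saved.append(target)
    return saved
```

Deduplicating on the record identifier while streaming keeps memory use proportional to the number of unique ids rather than the full records, which helps when many keyword searches overlap.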

Context

Europeana provides detailed API documentation, so extracting the data shouldn't be very hard. Better results can be achieved by collecting a broad set of Armenia-related words and key phrases and searching for each of them.
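One possible way to turn a keyword list into search strings is sketched below. The keywords are placeholders rather than a vetted list, `to_queries` is a hypothetical helper, and wrapping multi-word phrases in double quotes assumes the search endpoint treats quoted text as an exact phrase, which should be confirmed in the API documentation.

```python
def to_queries(terms):
    """Normalize keywords into search strings, quoting multi-word phrases."""
    queries = []
    for term in terms:
        term = term.strip()
        if not term:
            continue
        # Quote phrases so they are searched as a unit (assumed syntax).
        query = f'"{term}"' if " " in term else term
        if query not in queries:
            queries.append(query)
    return queries


# Placeholder keywords for illustration only.
KEYWORDS = ["Armenia", "Armenian", "Yerevan", "Armenian manuscript", "Armenia"]

if __name__ == "__main__":
    for query in to_queries(KEYWORDS):
        print(query)
```

Keeping the keyword list as plain data makes it easy to extend later without touching the extraction code.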

Requirements

  • Create a public GitHub repository to store the code and data under a free and open license, such as a Creative Commons license or the MIT License.

Wishes

Please write reusable code that someone else can run later, since we may need to update this dataset.

Resources

Prepared by

This task was prepared by the Open Data Armenia team

ivbeg added the parsing, extraction and topic-culture labels on May 26, 2023