[EN] Collect Armenia-related data from Europeana #4
Labels
extraction
Tasks that require data extraction (scraping) skills
parsing
Tasks that require data parsing
topic-culture
Tasks dedicated to Armenian culture, language and history
Goal
The goal is to create a dataset about Armenian cultural heritage held in European museums and available via the Europeana project (https://www.europeana.eu). Europeana provides a public API that can be used to create datasets.
Tasks
You must create a list of Armenia-related words and topics in English, use the Europeana API to extract the matching records from the Europeana website, and download the images associated with each record. Example of such a search: https://www.europeana.eu/en/search?view=grid&query=Armenia&page=1
Behind this search is an API call such as https://api.europeana.eu/record/search.json?wskey=nLbaXYaiH&view=grid&query=Armenia&page=2&qf=contentTier%3A%281%20OR%202%20OR%203%20OR%204%29&profile=minimal&rows=24&start=25 and Europeana provides comprehensive API documentation at https://pro.europeana.eu/page/apis
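The request above could be assembled in code roughly as follows. This is a minimal sketch, not an official client: the function names `build_search_url` and `fetch_page` are hypothetical, `YOUR_API_KEY` is a placeholder for your own key, and only parameters visible in the example URL (`wskey`, `query`, `profile`, `rows`, `start`) are used. Check the API documentation for the authoritative parameter list and paging limits.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the example API call in this task.
SEARCH_ENDPOINT = "https://api.europeana.eu/record/search.json"


def build_search_url(query, api_key, rows=24, start=1):
    """Build a search URL for one page of results (hypothetical helper)."""
    params = {
        "wskey": api_key,      # API key, as in the example URL
        "query": query,        # one Armenia-related word or phrase
        "profile": "minimal",  # same profile as the example URL
        "rows": rows,          # number of records per page
        "start": start,        # 1-based offset of the first record
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def fetch_page(query, api_key, rows=24, start=1):
    """Download one page of results and return the parsed JSON response."""
    url = build_search_url(query, api_key, rows=rows, start=start)
    with urlopen(url, timeout=30) as response:
        return json.load(response)
```

For example, `build_search_url("Armenia", "YOUR_API_KEY", start=25)` produces a URL for the second page of 24 records; increasing `start` by `rows` on each call walks through the result set.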
Please save the collected data as JSON or JSON Lines files and deduplicate the records, since different search requests could return the same entries.
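One possible way to deduplicate and save the records is sketched below. It assumes each record is a dictionary with a unique identifier under the key `id`; that key name is an assumption and is configurable, so verify it against the actual API response. The helper names `deduplicate` and `write_jsonl` are hypothetical.

```python
import json


def deduplicate(records, key="id"):
    """Return records in their original order, keeping the first copy of each key."""
    seen = set()
    unique = []
    for record in records:
        record_id = record.get(key)
        if record_id is None:
            # No identifier to compare on, so keep the record as-is.
            unique.append(record)
            continue
        if record_id not in seen:
            seen.add(record_id)
            unique.append(record)
    return unique


def write_jsonl(records, path):
    """Write one JSON object per line (JSON Lines), preserving non-ASCII text."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Collecting the results of all keyword searches into one list and passing it through `deduplicate` before `write_jsonl` keeps a single copy of any record returned by several queries.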
Extracted image files can be kept in temporary storage. Please provide the link so that the Open Data Armenia team can move them to permanent storage.
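Downloading the images into a local folder before uploading them could look roughly like this. It is a sketch under assumptions: the record identifier is taken to be a slash-separated string, the image URL is whatever field of the record you decide to use, and the helper names `image_filename` and `download_image` are hypothetical. Files whose URL has no recognizable image extension are saved with a `.bin` suffix rather than a guessed type.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve

# Extensions accepted as-is; anything else is saved as ".bin".
KNOWN_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff"}


def image_filename(record_id, image_url):
    """Derive a stable, filesystem-safe file name from a record identifier."""
    safe_id = record_id.strip("/").replace("/", "_")
    extension = os.path.splitext(urlparse(image_url).path)[1].lower()
    if extension not in KNOWN_EXTENSIONS:
        extension = ".bin"
    return safe_id + extension


def download_image(record_id, image_url, dest_dir):
    """Download one image unless it already exists; return the local path."""
    os.makedirs(dest_dir, exist_ok=True)
    path = os.path.join(dest_dir, image_filename(record_id, image_url))
    if not os.path.exists(path):
        urlretrieve(image_url, path)
    return path
```

Naming files after the record identifier makes it easy to match each image back to its JSON record and lets a rerun skip files that were already downloaded.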
Context
Europeana provides detailed API documentation, so extracting the data shouldn't be very hard. Better results will be achieved by collecting a broad set of Armenia-related words and key phrases and searching for each of them.
Requirements
Wishes
Please write reusable code that someone else can run later, since we may need to update this dataset.
Resources
Prepared by
This task was prepared by the Open Data Armenia team