Data models and scraper for museris.lausanne.ch developed in the context of the 2015 Swiss Open Cultural Data Hackathon.
- Create a new virtualenv and install the requirements: `pip install git+https://github.com/cruncher/museris.git#egg=museris-dev`
- Add `numeris_data` to your `INSTALLED_APPS` setting.
- Import an existing database, or run `python manage.py migrate` and start scraping:
  - `python manage.py scrape_data` to scrape all 180'000+ Objects, or
  - `python manage.py scrape_data <start_id> <end_id>` to scrape a subset of Objects, or
  - `python manage.py scrape_data <ID>` to scrape a single Object
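
The three invocation forms above (no ID, a start/end pair, or a single ID) can be pictured as mapping to an ID range. The following is only a sketch of that mapping in plain Python, not the actual `scrape_data` implementation: the function name `resolve_id_range` is hypothetical, and treating `<end_id>` as inclusive is an assumption.

```python
from typing import Optional, Sequence, Tuple


def resolve_id_range(args: Sequence[str]) -> Optional[Tuple[int, int]]:
    """Map scrape_data-style positional arguments to an ID range.

    Hypothetical helper: returns None when no IDs are given ("scrape
    everything"), otherwise a (start, end) pair. Inclusive bounds are
    an assumption, not documented behaviour.
    """
    ids = [int(a) for a in args]
    if not ids:
        return None  # no arguments: scrape all Objects
    if len(ids) == 1:
        return (ids[0], ids[0])  # single ID: scrape one Object
    if len(ids) == 2:
        start, end = ids
        if start > end:
            raise ValueError("start_id must not exceed end_id")
        return (start, end)  # two IDs: scrape a subset of Objects
    raise ValueError("expected at most two IDs")
```

For example, `resolve_id_range(["10", "20"])` yields the pair `(10, 20)`, while an empty argument list yields `None`.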
Note: Object images inside `DataObjectImage` are not automatically downloaded; only the image URL is recorded. To sync and download the actual images, run `python manage.py get_object_images`.
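
To illustrate the two-step approach described in the note (record the URL first, download later), here is a minimal sketch in plain Python. It is not the code behind `get_object_images`: `ImageRecord`, `fetch_bytes` and `sync_images` are hypothetical names, and the record is a simple stand-in for a `DataObjectImage` row.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional
from urllib.request import urlopen


@dataclass
class ImageRecord:
    # Hypothetical stand-in for a DataObjectImage row.
    url: str
    content: Optional[bytes] = None  # stays None until the file is downloaded


def fetch_bytes(url: str) -> bytes:
    """Download the raw bytes behind a URL."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def sync_images(
    records: Iterable[ImageRecord],
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> int:
    """Download every image that only has a URL so far; return the count."""
    downloaded = 0
    for record in records:
        if record.content is not None:
            continue  # already downloaded on an earlier run
        record.content = fetch(record.url)
        downloaded += 1
    return downloaded
```

Passing the download function in as `fetch` keeps the sketch testable without network access, and skipping records that already have content means the sync can be re-run safely.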
Defined in `models.py`.

- `Institution`s are basically museums.
- `DataObject`s hold information about a single object (paintings, physical objects, …) referenced in the museums. Each has a foreign key to an `Institution`.
- `Person`: represents a person involved with `DataObject`s (authors, artists, photographers, curators, …).
- `DataObjectImage`: holds images (URLs and actual images) related to a single `DataObject`.
- `DataObjectProperty`: holds a single key (e.g. "Creation year") and value (e.g. "2001") related to a single `DataObject`.
- `PersonProperty`: same as `DataObjectProperty` but for `Person`s, e.g. "date of birth", "bio", …
- `DataObjectLatLong`: holds a latitude / longitude pair for a single `DataObject`, if the object has geographic information.
- `DataObjectPerson`: a many-to-many relation between `DataObject`s and `Person`s, including a role (e.g. "Author", "Curator", …).
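
The relationships between these models can be sketched with plain dataclasses. This is only an illustration that runs without a configured Django project; the real definitions live in `models.py`, and every field name and sample value below (`name`, `properties`, `image_urls`, `lat_long`, `people`, "Example Museum", "Jane Doe") is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Institution:
    name: str  # a museum


@dataclass
class Person:
    name: str
    # PersonProperty rows, flattened to key/value pairs
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class DataObjectPerson:
    # Link between a DataObject and a Person, carrying a role
    person: Person
    role: str


@dataclass
class DataObject:
    institution: Institution  # foreign key to an Institution
    # DataObjectProperty rows, flattened to key/value pairs
    properties: Dict[str, str] = field(default_factory=dict)
    # DataObjectImage rows, reduced to their URLs
    image_urls: List[str] = field(default_factory=list)
    # DataObjectLatLong, present only if the object has geographic information
    lat_long: Optional[Tuple[float, float]] = None
    people: List[DataObjectPerson] = field(default_factory=list)


# Made-up sample data, for illustration only
museum = Institution(name="Example Museum")
author = Person(name="Jane Doe", properties={"bio": "placeholder"})
obj = DataObject(
    institution=museum,
    properties={"Creation year": "2001"},
    people=[DataObjectPerson(person=author, role="Author")],
)
```

In the Django models, the key/value pairs, images and coordinates are separate tables pointing at a single `DataObject`; the sketch folds them into attributes purely to keep the example short.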
- Images associated with `Person`s are currently not being scraped.
- Only the first image associated with each `DataObject` is downloaded, even though there are possibly more.
Most objects and images in the Museris database are covered by copyright and may not be re-used.