Mike Caprio edited this page Oct 3, 2017 · 16 revisions

A System to Transcribe Field Notes from Dinosaur Expeditions

Background

Paleontologists throughout the 20th century used field notebooks to keep detailed logs of their expeditions. Previous work at the museum has given us these notes as scanned images and as very imperfect text transcriptions. The text has never been analyzed for potentially relevant pieces of information that could lead to a new understanding of past expeditions. Researchers from around the world request these data very frequently, but their imperfect nature makes them less useful than they could be.

Crowdsourcing systems like Zooniverse have had great success enlisting the general public in the cause of citizen science. We would like to build a web application that allows the public to transcribe field notes (fixing the flawed prior transcriptions), ideally with several people assigned to the same scans so that their transcriptions corroborate one another. We could also potentially combine multiple data sets from collections to locate species and specimens on maps, or otherwise connect scanned pages of expedition notes to images of specimens or their metadata.

  • Tag year
  • Tag geolocation latitude / longitude
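
As a rough illustration of the two tagging tasks above, here is a minimal Python sketch that pulls candidate years and coordinates out of a transcribed page. The function names and the coordinate notation (decimal degrees followed by a hemisphere letter) are assumptions made for this example; the actual notebooks may record locations quite differently, so the output should be treated as suggestions for a human to confirm.

```python
import re

# Four-digit years from 1900 to 1999 (the notebooks are from the 20th century).
YEAR_RE = re.compile(r"\b(19\d{2})\b")

# Assumed notation: decimal degrees plus hemisphere letter, e.g. "47.6 N, 106.9 W".
COORD_RE = re.compile(
    r"(\d{1,3}(?:\.\d+)?)\s*°?\s*([NS])[,\s]+(\d{1,3}(?:\.\d+)?)\s*°?\s*([EW])"
)


def tag_years(text):
    """Return the distinct 20th-century years mentioned in the text, sorted."""
    return sorted({int(y) for y in YEAR_RE.findall(text)})


def tag_coordinates(text):
    """Return (latitude, longitude) pairs in signed decimal degrees."""
    tags = []
    for lat, ns, lon, ew in COORD_RE.findall(text):
        lat_v = float(lat) * (1 if ns == "N" else -1)
        lon_v = float(lon) * (1 if ew == "E" else -1)
        # Discard values that cannot be valid coordinates.
        if abs(lat_v) <= 90 and abs(lon_v) <= 180:
            tags.append((lat_v, lon_v))
    return tags


page = "June 12, 1923. Camp at 47.6 N, 106.9 W near the creek."
print(tag_years(page))        # [1923]
print(tag_coordinates(page))  # [(47.6, -106.9)]
```

The sample sentence is invented for the example and is not taken from any notebook.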

Solutions

  • Devise a user interface that lets individuals dynamically zoom in on areas of text and identify "problem areas" for transcription.
  • Create a system for recreating and visualizing an expedition (a storytelling challenge). What dinos were found on this expedition? What eras are the discovered fossils from? Think of visualization aids that help people understand the expeditions better.
  • Perform NLP / text analysis of the field notes, even though the text data are imperfect.
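
One way text analysis could tolerate imperfect transcriptions is approximate string matching against a list of known names. The sketch below uses `difflib` from the Python standard library; the `KNOWN_TAXA` list, the `find_taxa` function, and the 0.8 similarity cutoff are all hypothetical choices for illustration, not part of any existing museum tool.

```python
import difflib
import re

# Hypothetical vocabulary; a real one might come from collection metadata.
KNOWN_TAXA = ["Tyrannosaurus", "Triceratops", "Edmontosaurus"]


def find_taxa(text, vocabulary=KNOWN_TAXA, cutoff=0.8):
    """Return (token, matched_name) pairs for words that nearly match a known name."""
    lookup = {name.lower(): name for name in vocabulary}
    hits = []
    for token in re.findall(r"[A-Za-z]+", text):
        # get_close_matches ranks candidates by SequenceMatcher similarity ratio.
        match = difflib.get_close_matches(
            token.lower(), list(lookup), n=1, cutoff=cutoff
        )
        if match:
            hits.append((token, lookup[match[0]]))
    return hits


noisy = "Found Tricerotops skull and Tyrannosaurns tooth"
print(find_taxa(noisy))
# [('Tricerotops', 'Triceratops'), ('Tyrannosaurns', 'Tyrannosaurus')]
```

A higher cutoff reduces false matches at the cost of missing badly garbled words, so the threshold would need tuning against real transcriptions.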

Resources