Media Monitoring of the Past

Media Monitoring of the Past - Beyond Borders: Connecting Historical Newspapers and Radio.

About

Hi there 👋!

Impresso - Media Monitoring of the Past is an interdisciplinary research project that uses machine learning to pursue a paradigm shift in the processing, semantic enrichment, representation, exploration, and study of historical media across modalities and across temporal, linguistic, and national borders. The project has received two rounds of funding, for 2017-2020 and 2023-2027, so this organization contains code from both periods.

We design and develop the Impresso Web App and the upcoming Impresso Datalab, while conducting research at the intersection of Natural Language Processing, Design, and History. Find more details on the project website.

Contents

This GitHub organization hosts numerous repositories dedicated to:

  • the code behind the Web App and Datalab. A few repositories are public, but many are still private; we aim to document and release code as it matures;
  • code supporting research efforts;
  • code from student projects.

More information and highlights will be shared as we continue to make progress! In addition to the public repositories listed below, you can also check out our models on the Impresso Hugging Face organisation.

Impresso 2 release history

(to come)

Popular repositories

  1. named-entity-tutorial-dh2019 Public

    Tutorial on NE processing for Digital Humanities - DH Utrecht 2019

    Jupyter Notebook 25 4

  2. CLEF-HIPE-2020 Public

    Identifying Historical People, Places and other Entities: Shared Task on Named Entity Recognition and Linking on Historical Newspapers at CLEF 2020.

    SCSS 22 5

  3. llm-transcript-postcorrection Public

    A repository for preliminary work on HTR/OCR/ASR post-correction based on GPT models.

    Jupyter Notebook 10 1

  4. NZZ-black-letter-ground-truth Public

    8 1

  5. impresso-text-acquisition Public

    🛠️ Python library to import OCR data in various formats into the canonical JSON format defined by the Impresso project.

    Jupyter Notebook 7 2

  6. impresso-frontend Public

    🚀 The frontend application of the Impresso WebApp http://impresso-project.ch/app

    Vue 5

Repositories

Showing 10 of 45 repositories
