Skip to content

openeventdata/PLOVER

Repository files navigation

PLOVER

plover_icon

PLOVER–Political Language Ontology for Verifiable Event Records–is a next generation political event coding specification under development by the Open Event Data Alliance (http://openeventdata.org/) which is intended to replace the earlier [CAMEO] (http://eventdata.parusanalytics.com/data.dir/cameo.html) system.

The full PLOVER codebook is available in the repo above, and a short introduction to PLOVER and event data is below.

There is currently a near-real-time PLOVER-coded global event data set on Dataverse at https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/AJGVIT. Extensive details are available at these two open-access papers:

Halterman, Andrew, Philip A. Schrodt, Andreas Beger, Benjamin E. Bagozzi and Grace I. Scarborough. 2023. “Creating Custom Event Data Without Dictionaries: A Bag-of-Tricks.” Working paper presented at the International Studies Association, Montreal, March-2023. arXiv link

Halterman, Andrew, Benjamin E. Bagozzi, Andreas Beger, Philip A. Schrodt, and Grace I. Scarborough. 2023. PLOVER and POLECAT: A New Political Event Ontology and Dataset.” Working paper presented at the International Studies Association, Montreal, March-2023. socArXiv link

Event Data and Ontologies

Event data in political science is a structured way of recording interactions between political actors described in text. For instance, a researcher encountering the sentence

A town in western Sudan's South Kordofan state has been recaptured by Sudanese government forces from the rebel Sudan People's Liberation Army (SPLA). (AFP_ENG_19970408.0772)

might want to represent it as ACTOR = "Sudanese military", EVENT = "capture territory", TARGET = "SPLA", a process that can be applied over hundreds of thousands of reports to create datasets of thousands of such events for answering research questions.

The software to extract and categorize these events exists (see OEDA's Petrarch2, for instance), but all event data systems require an ontology defining what actors events will be recorded and how they will be defined.

Ontologies face difficult tradeoffs: broader groupings of event types and actors are easier to define and implement, are easier to work with, and provide data at a useful level of aggregation, especially analyzed globally. More specialized and specific ontologies will sometimes be necessary for answering certain research questions and allow more granular and subnational study, but are more difficult to implement and may provide distinctions that are not useful to most researchers. In the example above, should the event be categorized as a general "seize" event? Or a more specific "capture territory"? Should the target be the group the land was taken from, or should we think about the territory itself as the target. If the group it was taken from, should the target here be coded as a general "Sudanese rebel" or the very specific "SPLA"?

The best level of detail will depend on the question and resources of the researcher. PLOVER is a new event data ontology that choses to be more general, easier to implement (including across languages), and at the level of detail demanded by most existing users of event data. At the same time, PLOVER defines and provides guidance on making "PLOVER-compliant extensions" that will fit into the ecosystem of tools for creating and analyzing PLOVER event data.

PLOVER event categories

PLOVER defines 18 event types, many of which are aggregations of older CAMEO codes:

CAMEO code CAMEO text PLOVER category
01 MAKE PUBLIC STATEMENT dropped
02 APPEAL dropped
03 EXPRESS INTENT TO COOPERATE AGREE
04 CONSULT CONSULT
05 ENGAGE IN DIPLOMATIC COOPERATION SUPPORT
06 ENGAGE IN MATERIAL COOPERATION COOPERATE
07 PROVIDE AID AID
08 YIELD (081 to 083) CONCEDE
08 YIELD (084 to 087) RETREAT
10 DEMAND DEMAND
11 DISAPPROVE DISAPPROVE
12 REJECT REJECT
13 THREATEN THREATEN
14 PROTEST PROTEST
15 EXHIBIT FORCE POSTURE MOBILIZE
16 REDUCE RELATIONS SANCTION
17 COERCE COERCE
18,19,20 ASSAULT, FIGHT ASSAULT

PLOVER quad categories

Many users of event data aggregate events into four categories in a 2x2 of cooperation--conflict and verbal--material. Those categories are defined in terms of their constituent categories here:

Quad category PLOVER categories
Verbal cooperation AGREE, CONSULT, SUPPORT, CONCEDE
Material cooperation COOPERATE, AID, RETREAT, INVESTIGATE
Verbal conflict DEMAND, DISAPPROVE, REJECT, THREATEN, SANCTION
Material conflict PROTEST, CRIME, MOBILIZE, COERCE, ASSAULT

Migrating From CAMEO

The existing standard ontology for event data is CAMEO. Users who are familiar with CAMEO may be interested in the differences between CAMEO and PLOVER.

  • A set of standardized names ("fields") for JSON records are specified for both the core event data fields and for extended information such as geolocation and extracted texts. Most of these fields are optional but standardized field names will allow for the development of common utilities, which cannot be done with the current proliferation of incompatible CSV and tab-delimited formats.
  • Only the 2-digit event 'cue categories' have been retained from CAMEO: our hope is that these are sufficiently broad and distinct that it will be possible to achieve a reasonably high level of human inter-coder agreement— hence "verifiable"—on the coding categories, and that can be consistently implemented, across all categories, in automated systems. These are defined in much greater detail than they were in WEIS and CAMEO.
  • Much of the detail previously incorporated in the 3- and 4-digit CAMEO categories is now reflected in category-specific "mode" fields and a general "context" field: in effect, this "category-mode-context" scheme covers the "what-how-why" of the event. We anticipate these will be much easier to systematically code than the 250 or so hierarchically-arranged numerical codes of CAMEO. The "context" field also to handles issues such as refugees, disease, natural disaster, elections, parliamentary processes and cyber-security.
  • The CAMEO 01 and 02 categories dealing with comments have been eliminated.
  • A new category has been added for criminal behavior.
  • The WEIS/CAMEO YIELD category has been split into verbal (CONCEDE) and material (RETREAT) components.
  • The complexity of substate actor codes has been limited, and the allowable substate modifiers have been substantially simplified.
  • The "target" is optional in some categories.
  • Both the source and target fields can have compound actors, rather than dealing with compounds by duplicating event.
  • 'dead', 'injured' and 'size' fields are available for recording information on the magnitude of acts of violence and protests.
  • In the near future, we are hoping to make available a large corpus of "gold standard records" for validation purposes: these will include Spanish and Arabic cases as well as English. The current release has a file of English-language gold standard records derived from from the CAMEO manual.

Why "PLOVER"?

Plovers (Charadriidae) are a globally-distributed family of short-billed gregarious wading birds who spend their lives frantically poking through endless stretches of sand and muck trying to find something of interest. It is difficult to imagine a better analogy to the process of coding event data.

Description of files in repository

  • PLOVER_Manual.pdf/.tex

    Version 0.8 draft which is being implemented by the Political Instability Task Force

  • PLOVER.bib

    BibTex entries for the manual

  • plover_reference.html

    Basic reference for the PLOVER event categories in HTML format

  • gold_standard_records/PLOVER_GSR_CAMEO.txt

    This is a JSON file of the example sentences from the CAMEO 1.1b3 manual classified by PLOVER categories. The CAMEO LaTeX markup indicating source, target and event texts has been converted to a simple in-line markup; some locations have been added manually. The details of the coding are found in the files PLOVER_GSR_CAMEO_readme.tex/pdf and the version of PLOVER being coded is in the older file plover-manual_draft.0.6b1.pdf.

  • CAMEO-PLOV.txt

    Translation table for CAMEO to PLOVER: this is still something of a draft but will get you started

Acknowledgments

This program was developed as part of research funded by a U.S. National Science Foundation "Resource Implementations for Data Intensive Research in the Social Behavioral and Economic Sciences (RIDIR)" project: Modernizing Political Event Data for Big Data Social Science Research (Award 1539302)

Any opinions, findings, conclusions or recommendations in this document are [only probably still] those of [at least one of] the authors and do not necessarily reflect the views of the National Science Foundation, or any company or government agency employing or funding the authors or otherwise contributing to the document.

This work was partially supported by the Political Instability Task Force (PITF). The PITF is funded by the Central Intelligence Agency. The views expressed in this codebook are the authors' alone and do not represent the views of the US Government.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.