Get Event Information

Gregor Leban edited this page Mar 20, 2019 · 11 revisions

To find detailed information about a specific event, use the QueryEventArticlesIter and QueryEvent classes. They can be used to obtain all the information that Event Registry shows on the event page (e.g. http://eventregistry.org/event/eng-2940883).

QueryEventArticlesIter

The QueryEventArticlesIter class is a helper class that allows one to quickly obtain the list of articles that are associated with a particular event.

Example of usage

A simple example that will list all English articles about event eng-2940883 is as follows:

from eventregistry import *
er = EventRegistry(apiKey = YOUR_API_KEY)
# iterate over the English articles about the event, sorted by publication date
articleIter = QueryEventArticlesIter("eng-2940883", lang = "eng")
for art in articleIter.execQuery(er, sortBy = "date"):
    print(art)

QueryEventArticlesIter constructor accepts the following arguments:

QueryEventArticlesIter(eventUri,
    lang = None,
    keywords = None,
    conceptUri = None,
    categoryUri = None,
    sourceUri = None,
    sourceLocationUri = None,
    sourceGroupUri = None,
    authorUri = None,
    locationUri = None,
    dateStart = None,
    dateEnd = None,
    dateMentionStart = None,
    dateMentionEnd = None,
    keywordsLoc = "body",

    startSourceRankPercentile = 0,
    endSourceRankPercentile = 100)
  • eventUri: the event URI from which we want to obtain news articles.

  • lang: return articles that are written in the specified language. If more than one language is specified, resulting articles should be written in any of the languages.

  • keywords: limit the event articles to those that mention the specified keywords. A single keyword/phrase can be provided as a string, multiple keywords/phrases can be provided as a list of strings. Use QueryItems.AND() if all provided keywords/phrases should be mentioned, or QueryItems.OR() if any of the keywords/phrases should be mentioned.

  • conceptUri: limit the event articles to those where the concept with concept URI is mentioned. A single concept URI can be provided as a string, multiple concept URIs can be provided as a list of strings. Use QueryItems.AND() if all provided concepts should be mentioned, or QueryItems.OR() if any of the concepts should be mentioned. To obtain a concept URI using a concept label use EventRegistry.getConceptUri().

  • categoryUri: limit the event articles to those that are assigned into a particular category. A single category can be provided as a string, while multiple categories can be provided as a list in QueryItems.AND() or QueryItems.OR(). A category URI can be obtained from a category name using EventRegistry.getCategoryUri().

  • sourceUri: limit the event articles to those that were written by a news source sourceUri. If multiple sources should be considered, use QueryItems.OR() to provide a list of sources. Source URI for a given news source name can be obtained using EventRegistry.getNewsSourceUri().

  • sourceLocationUri: limit the event articles to those that were written by news sources located in the given geographic location. If multiple source locations are provided, put them into a list inside QueryItems.OR(). The location URI can be either a city or a country. The location URI for a given name can be obtained using EventRegistry.getLocationUri().

  • sourceGroupUri: limit the event articles to those that were written by news sources that are assigned to the specified source group. If multiple source groups are provided, put them into a list inside QueryItems.OR(). The source group URI for a given name can be obtained using EventRegistry.getSourceGroupUri().

  • authorUri: find articles that were written by a specific author. If multiple authors should be considered, use QueryItems.OR() to provide a list of authors. Author URI for a given author name can be obtained using EventRegistry.getAuthorUri().

  • locationUri: find articles that describe something that occurred at a particular location. The value can be a string or a list of strings provided in QueryItems.OR(). Location URI can either be a city or a country. Location URI for a given name can be obtained using EventRegistry.getLocationUri().

  • dateStart: find articles that were written on or after dateStart. The date should be provided as a string in YYYY-MM-DD format, or as a datetime.date or datetime.datetime instance.

  • dateEnd: find articles that were written on or before dateEnd. The date should be provided as a string in YYYY-MM-DD format, or as a datetime.date or datetime.datetime instance.

  • dateMentionStart: limit the event articles to those that explicitly mention a date that is equal or greater than dateMentionStart.

  • dateMentionEnd: limit the event articles to those that explicitly mention a date that is lower or equal to dateMentionEnd.

  • keywordsLoc: where to look for the keywords provided in the keywords parameter. Options: "body" (default), "title", or "body,title".

  • startSourceRankPercentile: starting percentile of the sources to consider in the results (default: 0). The value should be in range 0-100 and divisible by 10.

  • endSourceRankPercentile: ending percentile of the sources to consider in the results (default: 100). The value should be in range 0-100 and divisible by 10.

  • sortBy: how the articles should be sorted before deciding which ones to return. This value does not appear in the constructor signature above; it is set in the execQuery() call. Options: id (internal id), date (published date), cosSim (closeness to event centroid), socialScore (total shares in social media).

  • returnInfo: sets the properties of various types of data that is returned (articles, concepts, categories, news sources, ...)
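
The date and percentile constraints described above can be checked before a query is built. The following is a minimal sketch, not part of the library: normalize_date, check_percentile and build_article_filters are hypothetical helper names, and the only facts taken from this page are the accepted date formats and the rule that percentiles are in the range 0-100 and divisible by 10.

```python
import datetime
import re


def normalize_date(value):
    """Return the date as a YYYY-MM-DD string, or None if no date was given."""
    if value is None:
        return None
    # datetime.datetime is a subclass of datetime.date, so test it first
    if isinstance(value, datetime.datetime):
        return value.date().isoformat()
    if isinstance(value, datetime.date):
        return value.isoformat()
    if isinstance(value, str) and re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return value
    raise ValueError("expected YYYY-MM-DD, datetime.date or datetime.datetime")


def check_percentile(value):
    """Raise ValueError unless value is one of 0, 10, 20, ..., 100."""
    if value not in range(0, 101, 10):
        raise ValueError("percentile must be in 0-100 and divisible by 10")
    return value


def build_article_filters(date_start=None, date_end=None,
                          start_percentile=0, end_percentile=100):
    """Build keyword arguments named after the constructor parameters above."""
    return {
        "dateStart": normalize_date(date_start),
        "dateEnd": normalize_date(date_end),
        "startSourceRankPercentile": check_percentile(start_percentile),
        "endSourceRankPercentile": check_percentile(end_percentile),
    }


filters = build_article_filters(date_start=datetime.date(2017, 5, 1),
                                date_end="2017-05-31",
                                end_percentile=30)
```

The resulting dictionary could then be passed along with the event URI, for example QueryEventArticlesIter("eng-2940883", lang = "eng", **filters).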

Methods

The class has two main methods: count() and execQuery().

count(er)

The count(er) method returns the number of articles assigned to the event that are in the specified language(s). Expected arguments are:

  • er: an instance of the EventRegistry class.

execQuery(er,
    sortBy = "cosSim", sortByAsc = False,
    returnInfo = ReturnInfo(articleInfo = ArticleInfoFlags(bodyLen = -1)),
    maxItems = -1)

The execQuery() method returns an iterator over the articles in the event. Most commonly, the parameters set are er, sortBy, returnInfo and potentially maxItems. To limit the results to a subset of the articles about the event, use the constructor arguments described above, which are the same filters that are available when searching for articles in general. The parameters accepted by execQuery() are:

  • er: an instance of EventRegistry class that should be used to obtain the necessary data.
  • sortBy: the order in which event articles are sorted. Options: id (internal id), date (published date), cosSim (closeness to event centroid), sourceImportance (importance of the news source), socialScore (total shares in social media).
  • sortByAsc: should the results be sorted in ascending order (True) or descending (False).
  • returnInfo: what details should be included in the returned information (see the description of the ReturnInfo class).
  • maxItems: maximum number of articles about the event to return by the iterator. Use the default (-1) to return all the articles.
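
As an illustration of consuming the iterator, here is a small sketch that fetches the most recent articles and counts them per language. It assumes each returned article is a dict with a "lang" key; count_by_language and newest_articles are hypothetical helper names, and the eventregistry import is done lazily so the counting helper can be tried on plain dicts.

```python
from collections import Counter


def count_by_language(articles):
    """Count articles per language code for any iterable of article dicts."""
    counts = Counter()
    for art in articles:
        # assumption: the language code is stored under the "lang" key
        counts[art.get("lang", "unknown")] += 1
    return dict(counts)


def newest_articles(er, event_uri, max_items=20):
    """Return an iterator over the newest articles of the event."""
    # imported here so that count_by_language works without the package
    from eventregistry import QueryEventArticlesIter
    article_iter = QueryEventArticlesIter(event_uri)
    return article_iter.execQuery(er, sortBy="date", sortByAsc=False,
                                  maxItems=max_items)


# stand-in data used only to show the shape of the result
sample = [{"lang": "eng"}, {"lang": "deu"}, {"lang": "eng"}, {}]
print(count_by_language(sample))
```

With a real EventRegistry instance er, the two helpers combine as count_by_language(newest_articles(er, "eng-2940883")).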

QueryEvent

The QueryEvent class provides a more extended set of functionalities for a given event. The class can be used to obtain not only the list of associated articles but also core event information, a timeline of reporting about the event, list of top news sources reporting about the event, related events, etc.

Example of usage

To start, let us look at a simple example of usage of the QueryEvent() class to obtain information about event with URI eng-2940883:

from eventregistry import *
er = EventRegistry(apiKey = YOUR_API_KEY)
# we are interested in event with URI eng-2940883
q = QueryEvent("eng-2940883")
# get core event information (location, date, top concepts, ...)
q.setRequestedResult(RequestEventInfo())
res = er.execQuery(q)

The resulting JSON object contained in res will contain:

{
    "eng-2940883": {
        "info": { ... },    // details about the event
    }
}

The returned information about articles in the event follows the Article data model.
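
Since the result is keyed by the event URI, the details can be read with ordinary dictionary access. The sketch below is a hypothetical helper (get_event_info is not part of the library) that returns None instead of raising when the event or its "info" entry is missing; the fields inside "info" are left untouched because they are not listed here.

```python
def get_event_info(res, event_uri):
    """Return the "info" dict for event_uri, or None if it is not present."""
    entry = res.get(event_uri)
    if not isinstance(entry, dict):
        return None
    return entry.get("info")


# stand-in result with the same outer shape as the JSON shown above
example_res = {"eng-2940883": {"info": {}}}
info = get_event_info(example_res, "eng-2940883")
missing = get_event_info(example_res, "eng-0000000")
```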

The QueryEvent constructor accepts the following arguments:

QueryEvent(eventUriOrList,
    requestedResult = None)
  • eventUriOrList: can be a string representing a single event URI or it can be a list of event URIs (at most 50). For all requested results except RequestEventInfo(), only a single event URI can be provided.
  • requestedResult: the information about the event to return. Can be any of the RequestEvent* classes described below. If None, then the RequestEventInfo() instance will be set.
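
Because at most 50 event URIs can be given in one query, a longer list has to be split into batches. The following sketch shows one way to do that; batch_event_uris and fetch_event_infos are hypothetical helpers, and merging the responses assumes that a multi-event result is keyed by event URI in the same way as the single-event result.

```python
def batch_event_uris(event_uris, batch_size=50):
    """Split event URIs into consecutive lists of at most batch_size items."""
    if not 1 <= batch_size <= 50:
        raise ValueError("batch_size must be between 1 and 50")
    uris = list(event_uris)
    return [uris[i:i + batch_size] for i in range(0, len(uris), batch_size)]


def fetch_event_infos(er, event_uris):
    """Request core event information for all URIs, one batch at a time."""
    # imported here so that batch_event_uris works without the package
    from eventregistry import QueryEvent, RequestEventInfo
    merged = {}
    for batch in batch_event_uris(event_uris):
        q = QueryEvent(batch)
        q.setRequestedResult(RequestEventInfo())
        # assumption: the response is a dict keyed by event URI
        merged.update(er.execQuery(q))
    return merged


batches = batch_event_uris(["eng-%d" % i for i in range(120)])
```

Only RequestEventInfo() is used here, since the other requested results accept a single event URI.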

Returned information

The QueryEvent class provides the setRequestedResult() method, which specifies which details about the event you wish to obtain. The argument has to be an instance of a class derived from RequestEvent. The classes that can be passed to setRequestedResult() are listed below:

RequestEventInfo

RequestEventInfo(returnInfo = ReturnInfo())

RequestEventInfo class can provide the core information about the event - the title, summary, location, date, concepts, categories and the number of articles reporting about the event.

  • returnInfo: sets the properties of various types of data that is returned (event details, concepts, categories, news sources, ...)

RequestEventArticles

RequestEventArticles(page = 1,
    count = 100,

    lang = None,
    keywords = None,
    conceptUri = None,
    categoryUri = None,
    sourceUri = None,
    sourceLocationUri = None,
    sourceGroupUri = None,
    authorUri = None,
    locationUri = None,
    dateStart = None,
    dateEnd = None,
    dateMentionStart = None,
    dateMentionEnd = None,
    keywordsLoc = "body",

    startSourceRankPercentile = 0,
    endSourceRankPercentile = 100,

    sortBy = "cosSim", sortByAsc = False,
    returnInfo = ReturnInfo())

RequestEventArticles returns details about the articles assigned to the event. Most commonly, you only need to set the page, count, sortBy and returnInfo parameters, but the class also supports limiting results to a subset of articles by specifying parameters that are otherwise available when searching for articles in general. Full list of parameters is described below:

  • page: which page of the articles to return (starting from 1).

  • count: number of articles to return (max 100).

  • lang: return articles that are written in the specified language. If more than one language is specified, resulting articles should be written in any of the languages.

  • keywords: limit the event articles to those that mention the specified keywords. A single keyword/phrase can be provided as a string, multiple keywords/phrases can be provided as a list of strings. Use QueryItems.AND() if all provided keywords/phrases should be mentioned, or QueryItems.OR() if any of the keywords/phrases should be mentioned.

  • conceptUri: limit the event articles to those where the concept with concept URI is mentioned. A single concept URI can be provided as a string, multiple concept URIs can be provided as a list of strings. Use QueryItems.AND() if all provided concepts should be mentioned, or QueryItems.OR() if any of the concepts should be mentioned. To obtain a concept URI using a concept label use EventRegistry.getConceptUri().

  • categoryUri: limit the event articles to those that are assigned into a particular category. A single category can be provided as a string, while multiple categories can be provided as a list in QueryItems.AND() or QueryItems.OR(). A category URI can be obtained from a category name using EventRegistry.getCategoryUri().

  • sourceUri: limit the event articles to those that were written by a news source sourceUri. If multiple sources should be considered, use QueryItems.OR() to provide a list of sources. Source URI for a given news source name can be obtained using EventRegistry.getNewsSourceUri().

  • sourceLocationUri: limit the event articles to those that were written by news sources located in the given geographic location. If multiple source locations are provided, put them into a list inside QueryItems.OR(). The location URI can be either a city or a country. The location URI for a given name can be obtained using EventRegistry.getLocationUri().

  • sourceGroupUri: limit the event articles to those that were written by news sources that are assigned to the specified source group. If multiple source groups are provided, put them into a list inside QueryItems.OR(). The source group URI for a given name can be obtained using EventRegistry.getSourceGroupUri().

  • authorUri: find articles that were written by a specific author. If multiple authors should be considered, use QueryItems.OR() to provide a list of authors. Author URI for a given author name can be obtained using EventRegistry.getAuthorUri().

  • locationUri: find articles that describe something that occurred at a particular location. The value can be a string or a list of strings provided in QueryItems.OR(). Location URI can either be a city or a country. Location URI for a given name can be obtained using EventRegistry.getLocationUri().

  • dateStart: find articles that were written on or after dateStart. The date should be provided as a string in YYYY-MM-DD format, or as a datetime.date or datetime.datetime instance.

  • dateEnd: find articles that were written on or before dateEnd. The date should be provided as a string in YYYY-MM-DD format, or as a datetime.date or datetime.datetime instance.

  • dateMentionStart: limit the event articles to those that explicitly mention a date that is equal or greater than dateMentionStart.

  • dateMentionEnd: limit the event articles to those that explicitly mention a date that is lower or equal to dateMentionEnd.

  • keywordsLoc: where to look for the keywords provided in the keywords parameter. Options: "body" (default), "title", or "body,title".

  • startSourceRankPercentile: starting percentile of the sources to consider in the results (default: 0). The value should be in range 0-100 and divisible by 10.

  • endSourceRankPercentile: ending percentile of the sources to consider in the results (default: 100). The value should be in range 0-100 and divisible by 10.

  • sortBy: how should the articles be sorted before we decide which ones to return. Options: id (internal id), date (published date), cosSim (closeness to event centroid), socialScore (total shares in social media).

  • returnInfo: sets the properties of various types of data that is returned (articles, concepts, categories, news sources, ...)
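
With count limited to 100, retrieving all articles of a larger event means requesting several pages. The sketch below computes the page numbers to request and issues one query per page. page_numbers and fetch_article_pages are hypothetical helpers; the total number of articles is assumed to be known beforehand, for example from QueryEventArticlesIter.count(er), and each raw response is yielded as-is.

```python
import math


def page_numbers(total_articles, per_page=100):
    """Return the 1-based page numbers needed to cover total_articles."""
    if not 1 <= per_page <= 100:
        raise ValueError("per_page must be between 1 and 100")
    return list(range(1, math.ceil(total_articles / per_page) + 1))


def fetch_article_pages(er, event_uri, total_articles, per_page=100):
    """Yield the raw response for every page of articles about the event."""
    # imported here so that page_numbers works without the package
    from eventregistry import QueryEvent, RequestEventArticles
    for page in page_numbers(total_articles, per_page):
        q = QueryEvent(event_uri)
        q.setRequestedResult(
            RequestEventArticles(page=page, count=per_page, sortBy="date"))
        yield er.execQuery(q)


pages_needed = page_numbers(250)
```

For plain iteration over all articles, QueryEventArticlesIter is usually the simpler choice; explicit paging is mainly useful when only a specific page is needed.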

RequestEventArticleUriWgts

RequestEventArticleUriWgts(lang = None,
    sortBy = "cosSim", sortByAsc = False)

RequestEventArticleUriWgts returns a simple list of article URIs for articles that are assigned to the event.

  • lang: the language(s) of the articles to return. If None is used, the articles in the event will not be filtered by language.
  • sortBy and sortByAsc: determine the order in which the URIs are returned.

RequestEventKeywordAggr

RequestEventKeywordAggr(lang = "eng")

RequestEventKeywordAggr returns the top keywords extracted from the articles in the event.

  • lang: if not None then the top keywords will only be computed from the articles in the specified language.

RequestEventSourceAggr

RequestEventSourceAggr returns the information about the news sources that reported about the event. The class does not accept any additional arguments.

RequestEventDateMentionAggr

RequestEventDateMentionAggr returns information about the dates that were mentioned in the articles about the event. The class does not accept any additional arguments.

RequestEventArticleTrend

RequestEventArticleTrend provides a list of core article information that can be used to display how the intensity of reporting about the event has been changing over time.

RequestEventSimilarEvents

RequestEventSimilarEvents(conceptInfoList,
    count = 50,
    maxDayDiff = sys.maxsize,
    addArticleTrendInfo = False,
    aggrHours = 6,
    includeSelf = False,
    returnInfo = ReturnInfo()
)

RequestEventSimilarEvents returns a list of events related to the given event.

  • conceptInfoList: array of concepts and their importance, e.g. [{ "uri": "http://en.wikipedia.org/wiki/Barack_Obama", "wgt": 100 }, ...]. The list of at most 20 concepts is used to identify the related events.
  • count: the number of similar events to return (max 50).
  • maxDayDiff: the maximum time difference, in days, between the similar events and this one.
  • addArticleTrendInfo: add information about how the articles in the similar events are distributed over time.
  • aggrHours: if addArticleTrendInfo is True, this is the aggregating time window in hours.
  • includeSelf: should the information about the event itself also be included among the results?
  • returnInfo: what details should be included in the returned information (see the description of the ReturnInfo class).
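
The conceptInfoList argument has to be assembled by the caller. Here is a minimal sketch of building it from a mapping of concept URIs to weights. build_concept_info_list is a hypothetical helper, and keeping the 20 highest-weighted concepts is one possible way to respect the limit, not something the API prescribes. The second URI and both weights are made-up illustration values.

```python
def build_concept_info_list(weights, max_concepts=20):
    """Turn {concept URI: weight} into a list of {"uri": ..., "wgt": ...}."""
    # highest weights first, so that trimming drops the least important ones
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return [{"uri": uri, "wgt": wgt} for uri, wgt in ranked[:max_concepts]]


concept_info_list = build_concept_info_list({
    "http://en.wikipedia.org/wiki/Barack_Obama": 100,
    "http://en.wikipedia.org/wiki/White_House": 60,  # illustrative value
})
```

The list can then be passed on, for example q.setRequestedResult(RequestEventSimilarEvents(concept_info_list, count = 10)).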