
Releases: gwu-libraries/TweetSets

Version 2.2.0

22 Sep 12:41
7c4f064
  • Upgrades Python to 3.8 (#126, #131)
  • Upgrades Spark & pyspark to 3.1 (#128, #117)
  • Uses the Spark DataFrame API to create full extracts at time of load: Tweet IDs, full Tweet CSV, Tweet mentions, Tweet users (#128)
  • Re-purposes the original (gzipped) JSONL from SFM to create the full Tweet JSON extract, concatenating the files by date of harvest (#152)
  • Adds an environment variable for specifying maximum file size for full extracts (#128)
  • Updates the TweetSets data model to align with twarc v. 1.12 (#150, #128)
  • Improves the indexing and extraction of full text and hashtags from extended Tweets (#150, #128)
  • Updates tests to test the Spark schema for creating extracts (#135)
  • Prevents access to full dataset files by those not authorized (#148)
  • Clarifies the installation documentation and docker-compose.yml (#119, #95, #90)
  • Updates pinning of Elasticsearch dependencies (#141)
  • Fixes bugs in the flask create-extract command (#125) and in checking whether a user should be directed to the full dataset (#120)
  • Prevents an incorrect date format from being submitted in the form (#87)
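
The JSON extract item above (#152) relies on a property of the gzip format: several gzip files appended byte for byte form a valid multi-member gzip stream, so the JSONL never has to be decompressed and recompressed. The following is a minimal sketch of that idea, not the TweetSets implementation. The function name and the assumption that each file name begins with its harvest date (so that sorting by name sorts by date) are hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def concatenate_gzip_files(sources, destination):
    """Append gzipped files to `destination` without recompressing them.

    Assumption (hypothetical naming scheme): each file name starts with an
    ISO harvest date, e.g. 2021-09-01-tweets.jsonl.gz, so sorting by name
    orders the files by date of harvest.
    """
    ordered = sorted(sources, key=lambda p: Path(p).name)
    with open(destination, "wb") as out:
        for source in ordered:
            # Copy the raw compressed bytes; each file becomes one gzip member.
            with open(source, "rb") as part:
                shutil.copyfileobj(part, out)


def read_jsonl_lines(path):
    """Read every line of a (possibly multi-member) gzipped JSONL file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.read().splitlines()
```

Python's `gzip` module reads all members of such a file in sequence, so `read_jsonl_lines` returns the lines of every source file in harvest order.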

Version 2.1.0

17 Mar 19:00

Changes in this release:

  • Remediates accessibility issues in the UI (#9, #91)
  • Changes source dataset selection to allow only one dataset to be selected (#85)
  • Updates the help guide to reflect UI changes and improve accessibility (#93)
  • Set up Google Analytics for usage tracking (#55)
  • User email address required for custom extract; user notified upon extract completion (#94)
  • Extract options for top users, top mentions, and mentions disabled (#89)
  • "Top 1000" analytics provided as CSV from dataset statistics page (#88)
  • Requesting a full extract now redirects to download prepared files (#83)
  • Prepared full extracts may be created by command-line utility or by requesting them in the UI (if they don't already exist) (#83)
  • Added an update loader command, which re-reads dataset.json to update the dataset's descriptive metadata (and statistics) (#41)
  • Added GW footer and cookie consent popup (#82)
  • Wording improvements in the UI (#80, #81)
  • Added lang field to indexing of newly-loaded datasets (#39). UI changes (#114) were not done in this release and should only be done after (most?) data sets have been reloaded.

Version 2.0

24 Nov 19:31
4c8700a

Changes in this release:

  • Major version upgrade of Elasticsearch from 6.2.2 to 7.9.2, along with upgrades to the elasticsearch-dsl and elasticsearch-py Python dependencies (#47, #52, #63)
  • Upgrades to other dependencies, including Flask and its dependencies, pyspark, and requests (#54, #60, #31)
  • Alerts user and prevents dataset request if their dataset parameters would produce a dataset of zero tweets (#5)
  • Replaces date picker with jQuery UI date picker for more complete browser support (#4)
  • Makes link to dataset zip files more visible by moving to top of page (#3)
  • Add notice for GW users to use VPN for enhanced access (#48)
  • Clarify terminology in statistics page (#51)
  • Indexes tweet language (#39) but only for tweet dataset loads that occur using v2.1. Datasets loaded previously would need to be reloaded. UI and subsetting functionality will be added later in #114.
  • spark-submit update command to update an existing dataset now also updates metadata (from dataset.json) and statistics (#41)
  • Compliance: Added GW footer including cookie consent (#82)
  • Clarified labels for date/time fields to specify UTC (#80)
  • Clarified wording when subset yields no tweets (#81)
  • Bug fix for statistics showing zero uses instead of actual number (#49)
  • Bug fix for error when pressing Enter on dataset name modal input (#35)
  • Bug fix for unit tests for stats (#58)

Version 1.1.1

24 Mar 18:27
da621cf

Updates to dependencies.

Version 1.1.0

05 Dec 14:44
d00a377

Updates to dependencies, Docker base images, and tests.

Version 1.0.0

14 Jun 12:58

An initial release of TweetSets, for the purposes of registering with Zenodo.