Releases: openzim/zimit

2.0.4

15 Jul 08:55
fbd01a7

Changed

  • Upgraded Browsertrix Crawler to 1.2.4 (fixes automatic retrieval of assets present in a data-xxx tag, #316)

2.0.3

24 Jun 07:51
e8995a9

Changed

  • Upgraded Browsertrix Crawler to 1.2.0 (fixes YouTube videos issue #323)

2.0.2

18 Jun 14:00
b73a3e0

Changed

  • Upgrade dependencies (mainly warc2zim 2.0.2)

2.0.1

13 Jun 11:33
2835c7b

Changed

  • Upgrade dependencies (especially warc2zim 2.0.1 and Browsertrix Crawler 1.2.0-beta.0) (#318)

Fixed

  • Crawler is not correctly checking disk size / usage (#305)

2.0.0

04 Jun 07:35
d8e6d55

Added

  • New --version flag to display Zimit version (#234)
  • New --logging flag to adjust Browsertrix Crawler logging (#273)
  • Use new --scraper-suffix flag of warc2zim to enhance ZIM "Scraper" metadata (#275)
  • New --noMobileDevice CLI argument
  • Publish Docker image for linux/arm64 (in addition to linux/amd64) (#178)

Changed

  • Use warc2zim version 2, which no longer requires a Service Worker (#193)
  • Upgraded Browsertrix Crawler to 1.1.3
  • Adopt Python bootstrap conventions
  • Upgrade to Python 3.12 + upgrade dependencies
  • Removed handling of redirects by zimit; they are now handled by Browsertrix Crawler and detected properly by warc2zim (#284)
  • Drop initial check of URL in Python (#256)
  • --userAgent CLI argument once again overrides the --userAgentSuffix and --adminEmail values
  • --userAgent CLI argument is no longer mandatory

1.6.3

18 Jan 08:14
19b4898

Changed

  • Adapt to new warc2zim code structure
  • Using browsertrix-crawler 0.12.4
  • Using warc2zim 1.5.5

Added

  • New --build parameter (optional) to specify the directory holding Browsertrix files; if not set, the --output
    directory is used. zimit creates one subdirectory of this folder per invocation to isolate datasets; the
    subdirectory is kept only if --keep is set.
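
The directory behaviour described for --build can be sketched roughly as follows. This is only an illustration under stated assumptions, not zimit's actual code: the function name `run_with_build_dir`, its parameters, and the `.tmp` prefix are hypothetical, and `work` stands in for whatever the crawl step does with the directory.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_build_dir(output, build=None, keep=False, work=lambda d: None):
    """Run `work` in a fresh per-invocation subdirectory (hypothetical helper)."""
    # Fall back to the output directory when no build directory is given.
    base = Path(build) if build else Path(output)
    base.mkdir(parents=True, exist_ok=True)
    # One new subdirectory per invocation keeps datasets isolated.
    temp_dir = Path(tempfile.mkdtemp(dir=base, prefix=".tmp"))
    try:
        work(temp_dir)
    finally:
        # The subdirectory survives only when the caller asked to keep it.
        if not keep:
            shutil.rmtree(temp_dir, ignore_errors=True)
    return temp_dir
```

Using `tempfile.mkdtemp` here is a design choice of the sketch: it guarantees a unique name, so two invocations sharing the same base directory cannot collide.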

Fixed

  • --collection parameter was not working (#252)

1.6.2

17 Nov 10:25
6e6c0e8

Changed

  • Using browsertrix-crawler 0.12.3

Fixed

  • Fix logic passing args to crawler to support value '0' (#245)
  • Fix documentation about Chrome and headless (#248)
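
The '0' fix above (#245) concerns a common pitfall when forwarding parsed options to a subprocess: a plain truthiness test drops a legitimate value of 0. The sketch below shows one way to avoid it. It is an assumption about the general shape of such logic, not zimit's code; `build_crawler_args` and the option names in the usage line are illustrative only.

```python
def build_crawler_args(options):
    """Flatten a dict of option values into CLI arguments (hypothetical helper)."""
    args = []
    for name, value in options.items():
        # Skip unset options and disabled flags by identity, not truthiness:
        # `if not value` would also discard 0, which is a valid value.
        if value is None or value is False:
            continue
        args.append(f"--{name}")
        # Boolean flags take no value; everything else is passed as a string.
        if value is not True:
            args.append(str(value))
    return args
```

For example, `build_crawler_args({"limit": 0, "headless": True, "depth": None})` keeps `--limit 0` and `--headless` while omitting the unset `depth`.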

1.6.1

06 Nov 09:05
a73114d

Changed

  • Using browsertrix-crawler 0.12.1

1.6.0

02 Nov 19:57
9e91406

Changed

  • Scraper now fails for any HTTP error code returned when checking the URL at startup (#223)
  • User-Agent now has a default value (#228)
  • Manipulation of spaces with UA suffix and adminEmail has been modified
  • Same User-Agent is used for check_url (Python) and Browsertrix crawler (#227)
  • Using browsertrix-crawler 0.12.0

1.5.3

04 Oct 08:52
0005145

Changed

  • Using browsertrix-crawler 0.11.2