Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

[Unreleased]

Added

Fixed

  • Fix a bug where the timeout option could not be set per request.
  • Fix a bug where a page was crawled twice when one URL had a trailing slash on the root folder and the other did not.
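
The trailing-slash fix above can be illustrated with a minimal sketch. It is not the crawler's actual implementation; `normalizeUrl` and `shouldCrawl` are hypothetical names, and the sketch assumes that parsing with the standard WHATWG `URL` class is enough to treat `https://example.com` and `https://example.com/` as the same page.

```javascript
// Hypothetical helper, not part of the crawler's API.
// The WHATWG URL parser serializes an empty root path as "/",
// so both spellings of the root folder produce the same string.
function normalizeUrl(rawUrl) {
  return new URL(rawUrl).href;
}

// Hypothetical duplicate check keyed on the normalized URL.
const seen = new Set();
function shouldCrawl(rawUrl) {
  const key = normalizeUrl(rawUrl);
  if (seen.has(key)) return false; // already requested under another spelling
  seen.add(key);
  return true;
}

console.log(normalizeUrl('https://example.com'));  // https://example.com/
console.log(normalizeUrl('https://example.com/')); // https://example.com/
```

Only the root folder is affected in this sketch: `https://example.com/docs` and `https://example.com/docs/` stay distinct, because a server may serve different content for them.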

[1.4.0] - 2018-02-24

Added

  • Support browserCache for crawler.queue()'s options.
  • Support depthPriority option again.

[1.3.4] - 2018-02-22

Changed

[1.3.3] - 2018-02-21

Added

  • Emit newpage event.
  • Support deniedDomains and depthPriority for crawler.queue()'s options.

Changed

  • Allow allowedDomains option to accept a list of regular expressions.
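
As a rough illustration of what accepting regular expressions in allowedDomains could look like, here is a minimal sketch. `isDomainAllowed` is a hypothetical helper, and the matching rule (exact comparison for strings, `test()` for regular expressions) is an assumption, not a statement of how the crawler resolves domains.

```javascript
// Hypothetical helper: entries may be exact hostnames (strings)
// or regular expressions tested against the hostname.
function isDomainAllowed(hostname, allowedDomains) {
  return allowedDomains.some((domain) =>
    domain instanceof RegExp ? domain.test(hostname) : domain === hostname
  );
}

// One exact hostname and one pattern covering every subdomain of example.org.
const allowedDomains = ['example.com', /\.example\.org$/];

console.log(isDomainAllowed('example.com', allowedDomains));      // true
console.log(isDomainAllowed('docs.example.org', allowedDomains)); // true
console.log(isDomainAllowed('example.net', allowedDomains));      // false
```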

[1.3.2] - 2018-01-19

Added

Fixed

  • Fix a bug where console messages were not shown properly.

[1.3.1] - 2018-01-14

Fixed

  • Fix a bug where response properties were listed as methods.
  • Fix a bug where robots.txt was not obeyed.

[1.3.0] - 2018-01-12

Added

Changed

[1.2.5] - 2018-01-03

Added

Changed

  • Require the cache option in HCCrawler.connect() and HCCrawler.launch()'s options.
  • Provide the skipDuplicates option to remember and skip duplicate URLs, instead of passing null to the cache option.
  • Modify BaseCache interface.

[1.2.4] - 2017-12-25

Added

  • Support CSV and JSON Lines formats for exporting results.
  • Emit requeststarted, requestskipped, requestfinished, requestfailed, maxdepthreached, maxrequestreached and disconnected events.
  • Improve debug logs by tracing public APIs and events.
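
For readers unfamiliar with the two export formats named above, the following sketch shows how a list of result records could be serialized to each. `toJsonLines` and `toCsv` are hypothetical helpers written for illustration, and the record fields (`url`, `title`) are assumed; the crawler's own exporters may differ.

```javascript
// JSON Lines: one JSON document per line.
function toJsonLines(records) {
  return records.map((record) => JSON.stringify(record)).join('\n') + '\n';
}

// CSV: a header row followed by one row per record.
// Values containing quotes, commas, or line breaks are quoted,
// and embedded quotes are doubled.
function toCsv(records, fields) {
  const escape = (value) => {
    const text = value === undefined || value === null ? '' : String(value);
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const header = fields.map(escape).join(',');
  const rows = records.map((record) =>
    fields.map((field) => escape(record[field])).join(',')
  );
  return [header, ...rows].join('\n') + '\n';
}

// Assumed sample data, for illustration only.
const results = [{ url: 'https://example.com/', title: 'Example, Inc.' }];

console.log(toJsonLines(results));
console.log(toCsv(results, ['url', 'title']));
```

JSON Lines keeps nested values intact and can be appended to as results arrive, while CSV is flat and needs an agreed list of fields up front.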

Changed

  • Allow the onSuccess and evaluatePage options to be null.
  • Change crawler.isPaused, crawler.queueSize, crawler.pendingQueueSize and crawler.requestedCount from read-only properties to methods.

Fixed

  • Fix a bug where the maxDepth option was ignored.

[1.2.3] - 2017-12-17

Changed

  • Refactor by changing the style of requiring the cache directory.

Fixed

  • Fix a bug where more crawlers than maxConcurrency were started when requests failed.

[1.2.2] - 2017-12-16

Added

  • Automatically collect and follow links found in the requested page.
  • Support maxDepth for crawler.queue()'s options.
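
The interaction between link following and a depth limit can be sketched without a browser. The example below walks a hypothetical in-memory link graph breadth-first; the `site` object, the `crawl` function, and the convention that the starting page counts as depth 1 are all assumptions made for illustration, not the crawler's implementation.

```javascript
// Hypothetical in-memory "site": each URL maps to the links found on that page.
const site = {
  'https://example.com/': ['https://example.com/a', 'https://example.com/b'],
  'https://example.com/a': ['https://example.com/c'],
  'https://example.com/b': [],
  'https://example.com/c': ['https://example.com/'],
};

// Visit pages breadth-first, following collected links until maxDepth is reached.
// Assumption: the starting page counts as depth 1.
function crawl(startUrl, maxDepth) {
  const visited = new Set([startUrl]);
  const queue = [{ url: startUrl, depth: 1 }];
  const order = [];
  while (queue.length > 0) {
    const { url, depth } = queue.shift();
    order.push(url);
    if (depth >= maxDepth) continue; // do not follow links past the limit
    for (const link of site[url] || []) {
      if (!visited.has(link)) {
        visited.add(link);
        queue.push({ url: link, depth: depth + 1 });
      }
    }
  }
  return order;
}

console.log(crawl('https://example.com/', 1)); // only the starting page
console.log(crawl('https://example.com/', 2)); // the starting page plus /a and /b
```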

[1.2.1] - 2017-12-13

Added

[1.2.0] - 2017-12-11

Changed

[1.1.2] - 2017-12-10

Added

[1.1.1] - 2017-12-09

Added

Changed

  • Automatically dismiss dialogs.
  • Improve performance by setting up a page in parallel.

[1.1.0] - 2017-12-08

Added

Changed

  • The public API to launch a browser has changed: you can now launch a browser with HCCrawler.launch().
  • Rename shouldRequest to preRequest for crawler.queue()'s options.
  • Refactor by separating HCCrawler and Crawler classes.
  • Refactor handlers for options.

[1.0.0] - 2017-12-05

Added

Changed

  • Migrate from NPM to Yarn.
  • Refactor the helper into a class with static methods.