Releases: salsadigitalauorg/merlin-framework

Merlin 1.1.0

28 Oct 04:41

Full Changelog: 1.0.0...1.1.0

Merlin 1.0.0

28 Oct 03:56
16dcc65

What's Changed

  • Improved relative link handling
  • New Group type, docs, and tests that didn't make it in before the namespace change
  • Introduces allowed_classes filtering. Fixes encoding issues
  • UnwrapLinks processor
  • Add option for referer to fetcher
  • Unescape slashes in JSON output
  • Support for uuidv3 on group item content and JSON output, to be consumed as paragraphs in Drupal
  • Allow any generic name for output
  • Optionally use Guzzle redirect info for speed
  • Group crawl by query string
  • Track redirects on crawl in Guzzle
  • General group_uuid instead of paragraph
  • Support for extra media attributes
  • New sub_fetch processor to fetch and process a URL. Nested Merls.
  • Proper check for config and rename entity based on config
  • Use IPv4 resolution in cURL options, which is a lot faster
  • Resolve robots.txt ignore.
  • Fixed ordered type to emit the field name as well.

Co-authored-by: Andrew Rowlands
Co-authored-by: Stuart Rowlands
Co-authored-by: Suchi Garg
Co-authored-by: Sonny Kieu

Merlin 0.4.3

28 Oct 03:33
4a329d7

Full Changelog: 0.4.2...0.4.3

Merlin 0.4.2

27 May 09:11
  • #131: Improved relative link handling in Link and MenuLink
  • #130: Support custom request headers in crawler
  • #133: Various bugfixes when including via composer, improved docs + onboarding (credit: @jannakha)

Merlin 0.4.1

09 Apr 21:47

0.4.0 Release notes:

  • Added caching support to spider #85
  • Allow cache_dir override
  • Added support for multiple selectors #40
  • Multiple selectors error reporting #110
  • Vastly improve crawler performance at scale #115
  • Added support for optional starting routes for spider #83
  • Added additional crawler options (timeout, connect_timeout, verify, cookies)
  • Added support for separate menu selector #35
  • Better relative URL resolution
  • Exclude external media assets #112
  • Added support for spider inclusion by regex
  • [DRUPAL] Added support for linkit link conversion
  • Ensures uniqueness in logging output files #89
  • Adds runtime --limit flag to limit total number of results
  • Split exported media results by type
  • Added PR template, CI badge, various documentation improvements
  • Various bugfixes #81 #87 #104 #105

0.4.1 bugfix release:

  • Re-added missing binary
  • Fixed broken links in docs

Contributors

  • Stuart Rowlands
  • Andy Rowlands
  • Steve Worley
  • Nick Georgiou
  • Alex Skrypnyk

Merlin 0.3.0

03 Sep 03:32
d944c25
  • Feature: Adds crawler to generate URL list and group by criteria (DOM or path regex)
  • Feature: Local caching layer on runs
  • Feature: JavaScript support
  • Feature: Sub-field processing
  • Feature: Mandatory fields
  • Feature: Support loading URL list from separate file
  • Feature: Crawled URLs can be merged directly into config files
  • Feature: Default value support
  • Feature: Support for query parameters and fragments in URLs
  • Bugfix: Validate file permissions on output files
  • Bugfix: Blank attributes caused media processor to fail
  • Bugfix: Ensure error arrays are merged appropriately for logging
  • Bugfix: Ensure same output for both DOM and XPath selectors (long_text)
  • Bugfix: Resolve issue with raw selectors
  • Bugfix: Media configuration for data_embed_button and data_entity_embed_display resolved

Contributors

  • Stuart Rowlands
  • Andy Rowlands
  • Steve Worley
  • Nick Georgiou

Merlin 0.2.0

28 Jun 06:54
ec923ed

Welcome to the Merlin framework! A configurable and repeatable way to build structured representations of content to assist with migrating content between content management systems.

This initial release provides the base framework to build configuration files and run the program.

Contributors

  • Steve Worley
  • Stuart Rowlands
  • Andy Rowlands

Merlin 0.1.0

23 Jun 10:38

Welcome to the Merlin framework! A configurable and repeatable way to build structured representations of content to assist with migrating content between content management systems.

This initial release provides the base framework to build configuration files and run the program.

Contributors

  • Steve Worley
  • Stuart Rowlands
  • Andy Rowlands