Releases: snowplow/snowplow-rdb-loader

3.0.0

01 Apr 15:16

Loader

  • Add Snowflake support (#792)
  • Support loading wide row (#791)
  • Extract redshift loader into a separate module (#790)
  • Modularize configuration to support multiple destinations (#789)

Transformer

  • Rename shredders to transformers (#793)
  • Batch: add invalid timestamp check (#652)
  • Batch: transform events to wide row (#649)
  • Kinesis: add invalid timestamp check (#659)
  • Kinesis: transform events to wide row (#650)
  • Batch: make it possible to disable spark caching via config (#808)
  • Batch: remove event validation (#805)

2.2.0

11 Mar 11:49

New configuration options, improvements to observability

RDB Loader

  • Stop consuming SQS messages when Loader is busy (#746)
  • Don't enqueue folders into retry queue if the queue is non-empty (#744)
  • Mention amount of retries in retry logic (#745)
  • Make retry behaviour configurable (#742)
  • Expose an actual AWS exception in readKey (#740)
  • Fix not respecting no-op schedule if app starts in a window (#724)
  • Send health check data to statsd (#700)
  • Add webhook setting to reference config file (#713)
  • Make all timeouts configurable (#624)
  • Add loading timeout (#668)
  • Send total attempts in load_succeeded (#717)
  • Fix Retry Queue dropping after first attempt (#716)
  • Make sure health check sends only one alarm (#733)

RDB Shredder

  • Introduce since and until options (#570)

Common

  • Bump aws-java-sdk to 1.12.161 (#736)
  • Bump jackson-dataformat-cbor to 2.12.6 (#550)
  • Bump sbt to 1.6.2 (#735)

Stream Shredder

  • Bump kafka-clients to 2.7.2 (#732)
  • Bump commons-io to 2.7 (#731)
  • Bump protobuf-java to 3.16.1 (#730)

2.1.0

19 Jan 19:38

Multiple stability and observability improvements in RDB Loader and performance optimizations in RDB Shredder.

Loader

  • Track when a load succeeded (#574)
  • Add DB health monitoring (#656)
  • Add retry queue (#655)
  • Switch to HikariCP (#654)
  • Handle the whole loading within the same transaction (#646)
  • Use statsd counter instead of gauge for event counts (#523)
  • Remove 'steps' setting (#626)
  • Add no-op schedule (#599)
  • Unify monitoring (#576)

Shredder

  • Allow configuring deduplication (#583)
  • Optimize DAG by excluding count (#582)

2.0.0

01 Dec 13:55

This release adds the ability for Batch Shredder and Stream Shredder to send a shredding_complete.json message to SNS. We also refactored the config structure to split the configs of RDB Loader and RDB Shredder.

Additionally, we bumped the versions of some of the libraries and deprecated the additional steps feature of RDB Loader, since better alternative mechanisms exist and the steps are no longer useful.

Common

  • Common: split shredder and loader config (close #596)

Shredder

  • Batch Shredder: send shredding_complete to SNS (#595)
  • Stream Shredder: send shredding_complete to SNS (#616)

Loader

  • RDB Loader: deprecate steps (close #625)

Dependency bumps

  • Common: bump aws-java-sdk from 1.11.1019 to 1.12.31 (#613)
  • Common: bump jackson-scala-module to 2.12.3 (#566)
  • Common: bump jackson-databind to 2.12.3 (#614)
  • Common: bump aws-java-sdk-v2 from 2.16.23 to 2.17.59 (#615)

Under the hood

  • Common: use sbt-dynver plugin (close #610)

1.2.3

22 Nov 23:49

Stability improvement release.

Bug fixes

  • RDB Loader: don't allow folder monitoring to crash the loader (#628)
  • RDB Loader: make sure folder monitoring cannot execute concurrently (#627)

Enhancements

  • Common: bump schema-ddl to 0.14.3 (#632)
  • RDB Loader: notify about failure outside of loading (#636)

1.2.2

12 Nov 00:35

Bugfix release, improving SQS message handling.

RDB Loader

  • Remove logging class from a message (#623)
  • Add until option to folder monitoring (#620)
  • Extend SQS messages visibility timeout during loading (#608)

1.2.1

20 Oct 14:43

Bugfix release.

RDB Loader

  • Add since option to folder monitoring (#600)
  • Drop a root folder from folders monitoring (#602)

1.2.0

03 Sep 08:38

Major monitoring improvements.

Common

  • Bump sbt-scoverage from 1.6.1 to 1.8.2 (#487)
  • Bump scala-library from 2.12.12 to 2.12.14 (#486)
  • Bump sbt-coveralls from 1.2.7 to 1.3.1 (#516)
  • Bump sbt-tpolecat from 0.1.14 to 0.1.20 (#490)
  • Bump decline from 1.4.0 to 2.1.0 (#529)
  • Bump sbt from 1.5.2 to 1.5.5 (#534)
  • Bump http4s-blaze-client from 0.21.21 to 0.21.25 (#538)
  • Bump slf4j-simple from 1.7.30 to 1.7.32 (#540)
  • Bump iglu-scala-client to 1.1.1 (#541)
  • Bump schema-ddl to 0.14.1 (#536)

RDB Loader

  • Split migration into pre-transaction and in-transaction statements (#548)
  • Change docker base image to adoptopenjdk:11-jre-hotspot-focal (#543)
  • Bump doobie-core from 0.12.1 to 0.13.4 (#474)
  • Bump redshift-jdbc42-no-awssdk from 1.2.54.1082 to 1.2.55.1083 (#535)
  • Clarify error message when connection acquisition has failed (#525)
  • Add monitoring for unloaded and corrupted runs (#457)
  • Add webhook-based alarming (#458)
  • Manage several parallel connections (#537)

RDB Shredder

  • Bump spark-core to 3.1.1 (#544)
  • Integrate Sentry (#510)
  • Skip CrossBatchDeduplicationSpec (#462)

1.1.1

05 Aug 22:07

UX improvements

  • RDB Loader: don't raise duplicate SQS message as an error (#542)

1.1.0

08 Jun 13:33

Metrics

This release adds metrics to the loader.

Each time the loader performs a load into Redshift (after reading a message from SQS), these metrics are exported:

  • latency_collector_to_load_min: delay between maximum collector timestamp and load timestamp in seconds
  • latency_collector_to_load_max: delay between minimum collector timestamp and load timestamp in seconds
  • latency_shredder_start_to_load: delay between start of shredder and load timestamp in seconds
  • latency_shredder_end_to_load: delay between end of shredder and load timestamp in seconds
  • count_good: number of good events loaded
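The definitions above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the loader's actual code (the loader is not written in Python); the function name `load_metrics` and its parameters are hypothetical. Note that the minimum latency is measured from the latest collector timestamp and the maximum latency from the earliest one.

```python
from datetime import datetime, timedelta, timezone


def load_metrics(collector_min, collector_max, shredder_start,
                 shredder_end, loaded_at, count_good):
    """Compute the five load metrics, with latencies in whole seconds.

    All timestamp arguments are timezone-aware datetimes. The names are
    illustrative and do not come from the loader's code base.
    """
    def seconds_until_load(since):
        return int((loaded_at - since).total_seconds())

    return {
        # The latest collector timestamp gives the smallest delay
        "latency_collector_to_load_min": seconds_until_load(collector_max),
        # The earliest collector timestamp gives the largest delay
        "latency_collector_to_load_max": seconds_until_load(collector_min),
        "latency_shredder_start_to_load": seconds_until_load(shredder_start),
        "latency_shredder_end_to_load": seconds_until_load(shredder_end),
        "count_good": count_good,
    }


# Made-up timestamps, purely for illustration
base = datetime(2021, 6, 8, 13, 0, 0, tzinfo=timezone.utc)
example = load_metrics(
    collector_min=base - timedelta(seconds=300),
    collector_max=base,
    shredder_start=base + timedelta(seconds=100),
    shredder_end=base + timedelta(seconds=400),
    loaded_at=base + timedelta(seconds=600),
    count_good=42,
)
```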

These metrics can be exported to two destinations (possibly both at the same time):

  • A statsd server
  • The logs, with one line per metric
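As a rough sketch of what the two destinations involve, the Python snippet below formats a set of metrics as statsd datagrams and as log lines. It relies on assumptions throughout: the `snowplow.rdbloader` prefix, the `name = value` log format, the default host and port, and the choice of the gauge type (`g`) are all illustrative, and none of the function names come from the loader.

```python
import socket


def to_statsd_lines(metrics, prefix="snowplow.rdbloader"):
    """Render each metric in the statsd line format: <name>:<value>|<type>.

    The prefix and the gauge type ("g") are assumptions for this sketch.
    """
    return [f"{prefix}.{name}:{value}|g" for name, value in metrics.items()]


def to_log_lines(metrics):
    """Render one log line per metric (the exact format is hypothetical)."""
    return [f"{name} = {value}" for name, value in metrics.items()]


def send_to_statsd(lines, host="localhost", port=8125):
    """Send each line as its own UDP datagram to a statsd server."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for line in lines:
            sock.sendto(line.encode("utf-8"), (host, port))


# Made-up values, purely for illustration
sample = {"count_good": 42, "latency_shredder_end_to_load": 200}
statsd_lines = to_statsd_lines(sample)
log_lines = to_log_lines(sample)
```

Calling `send_to_statsd(statsd_lines)` would emit the datagrams over UDP; since UDP is connectionless, nothing is reported back if no server is listening.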

This is configured in a new section of the configuration file.

Under the hood

  • The loader now uses slf4j for logging. For instance, it's now possible to set the log level by adding -Dorg.slf4j.simpleLogger.defaultLogLevel=ERROR to JAVA_OPTS when running the Docker image.
  • The jars are now automatically attached to a release on GitHub.