loader: testing with cloud #1285

spenes · 2023-07-10T08:55:44Z

This PR contains automated tests for Snowflake Loader on Azure.

It also brings the building blocks needed to add tests for other destinations and cloud types.

The test classes are structured similarly to the transformer automated tests.


pondzix · 2023-07-10T13:57:00Z

...est/scala/com.snowplowanalytics.snowplow.rdbloader.common/integrationtestutils/ItUtils.scala

+    queueConsumer.read
+      .map(_.content)
+      .map(parseShreddingCompleteMessage)
+      .evalMap(getWindowOutput(_))


So the type param A here represents the output of a window produced by the transformer. For the transformer test scenarios we need more details read from the output, whereas for the loaders we only need the shredding_complete messages.

How about accumulating shredding_complete messages instead of window outputs? Then:

- for the transformer: convert the list of accumulated messages to an output
- for the loader: do nothing, we already have everything we need

Also, the shredding_complete message already contains all the counts needed to terminate the test! I think we then don't need any type params.
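A minimal sketch of this suggestion, using a plain Scala `Iterator` in place of the real queue consumer stream. `ShreddingComplete` here is a simplified stand-in, and `goodCount`, `accumulateUntil` and `expectedGood` are hypothetical names chosen for illustration, not the types or helpers used in the PR:

```scala
object AccumulateSketch {
  // Hypothetical, simplified stand-in for a parsed shredding_complete message.
  final case class ShreddingComplete(base: String, goodCount: Long)

  // Accumulate messages until the summed good count reaches the expected value
  // (or the source runs dry). The Iterator stands in for the queue consumer.
  def accumulateUntil(messages: Iterator[ShreddingComplete], expectedGood: Long): List[ShreddingComplete] = {
    @annotation.tailrec
    def go(acc: List[ShreddingComplete], total: Long): List[ShreddingComplete] =
      if (total >= expectedGood || !messages.hasNext) acc.reverse
      else {
        val next = messages.next()
        go(next :: acc, total + next.goodCount)
      }
    go(Nil, 0L)
  }

  // Three windows are available, but the first two already cover the expected count.
  val accumulated: List[ShreddingComplete] = accumulateUntil(
    Iterator(
      ShreddingComplete("run=1/", 40L),
      ShreddingComplete("run=2/", 60L),
      ShreddingComplete("run=3/", 10L)
    ),
    expectedGood = 100L
  )
}
```

With this shape no type parameter is needed: the loader tests can use the accumulated messages directly, while the transformer tests can convert the same list into their richer output.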

I like this idea! I will make the necessary changes.

pondzix · 2023-07-10T13:59:56Z

...c/test/scala/com/snowplowanalytics/snowplow/rdbloader/experimental/LoaderSpecification.scala

+
+import org.specs2.mutable.Specification
+
+abstract class LoaderSpecification extends Specification with TestDAO.Provider with StorageTargetProvider with AzureTestResources {


Shouldn't that be CloudResources instead of AzureTestResources?

Yep, good spot, I made a mistake there.

pondzix · 2023-07-10T14:02:19Z

...c/test/scala/com/snowplowanalytics/snowplow/rdbloader/experimental/LoaderSpecification.scala

+  def run[A](
+    inputBatches: List[InputBatch],
+    countExpectations: CountExpectations,
+    dbActions: TestDAO => IO[A]


As it's mostly about querying stuff from the DB... can we name it that way? Actions sounds pretty generic ;) Something like: queryDbOutput? And instead of just an A type param I would probably go with DB_OUTPUT. WDYT?
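For illustration, the renamed signature could look roughly like the sketch below. To keep it dependency-free, `IO` is dropped and `InputBatch`, `CountExpectations` and `TestDAO` are hypothetical simplified stand-ins for the types in the PR; the stub DAO exists only so the sketch runs without a database:

```scala
object RunSignatureSketch {
  // Hypothetical stand-ins for the types used in the PR.
  final case class InputBatch(count: Long)
  final case class CountExpectations(good: Long, bad: Long)
  trait TestDAO { def queryManifest(): List[String] }

  // Both the parameter name and the type parameter now say what the callback
  // is for: querying the database once loading has finished.
  def run[DB_OUTPUT](
    inputBatches: List[InputBatch],
    countExpectations: CountExpectations,
    queryDbOutput: TestDAO => DB_OUTPUT
  ): DB_OUTPUT = {
    // A real test would send the batches and wait for the loader here.
    val stubDAO = new TestDAO { def queryManifest(): List[String] = List("run=1/") }
    queryDbOutput(stubDAO)
  }

  val manifest: List[String] = run(
    inputBatches = List(InputBatch(10L)),
    countExpectations = CountExpectations(good = 10L, bad = 0L),
    queryDbOutput = dao => dao.queryManifest()
  )
}
```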

pondzix · 2023-07-10T14:06:30Z

...c/test/scala/com/snowplowanalytics/snowplow/rdbloader/experimental/LoaderSpecification.scala

+      testDAO = createDAO(transaction)
+    } yield TestResources(queueConsumer = consumer, producer = producer, testDAO = testDAO)
+
+  def createDbTransaction(implicit secretStore: SecretStore[IO]): Resource[IO, Transaction[IO, ConnectionIO]] = {


I'm wondering whether we need the Transaction[IO, ConnectionIO] type here in tests, with all the pooling, retries, transaction handling etc. Wouldn't a simple Transactor.fromDriverManager from doobie be sufficient?

I tried to reuse the existing approach so as not to duplicate the logic in the tests, but you are right: it is quite complicated to initialize Transaction, and we don't use most of its features in the tests. I will try to use Transactor directly 👍

pondzix · 2023-07-10T14:11:08Z

...est/scala/com/snowplowanalytics/snowplow/loader/snowflake/it/AzureSnowflakeLoaderSpecs.scala

+                                       countExpectations,
+                                       dbActions = testDAO =>
+                                         for {
+                                           manifestItems <- retryUntilNonEmpty(testDAO.queryManifest)


So we wait until any manifest item is present in the DB. Is that condition correct when the transformer produces multiple windows?

How about we modify it slightly: at this point we already have all the messages produced by the transformer. Could we query the DB and wait until we find matching rows for all the produced output folders? That is, match the base field from each shredding_complete message against the base field of the manifest items, for all accumulated messages. Would that be doable?

Then I think we don't have to clean up the table before the tests!
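A rough sketch of that matching condition in plain Scala. `ShreddingComplete` and `ManifestItem` are simplified stand-ins, and `waitForAllBases`, `queryManifest` and `maxAttempts` are hypothetical names; a real implementation would presumably run in `IO` and sleep between attempts:

```scala
object ManifestMatchSketch {
  // Hypothetical, simplified stand-ins carrying only the `base` field.
  final case class ShreddingComplete(base: String)
  final case class ManifestItem(base: String)

  // Poll `queryManifest` until every base from the accumulated messages has a
  // matching manifest row, or give up after `maxAttempts` queries.
  @annotation.tailrec
  def waitForAllBases(
    messages: List[ShreddingComplete],
    queryManifest: () => List[ManifestItem],
    maxAttempts: Int
  ): Option[List[ManifestItem]] = {
    val expected = messages.map(_.base).toSet
    // Rows left over from earlier runs are filtered out rather than deleted.
    val matching = queryManifest().filter(item => expected.contains(item.base))
    if (matching.map(_.base).toSet == expected) Some(matching)
    else if (maxAttempts <= 1) None
    else waitForAllBases(messages, queryManifest, maxAttempts - 1)
  }

  // Simulated manifest table: it already holds a row from an older run, and
  // one new row shows up on each subsequent query.
  private val snapshots = Iterator(
    List(ManifestItem("old-run/")),
    List(ManifestItem("old-run/"), ManifestItem("run=1/")),
    List(ManifestItem("old-run/"), ManifestItem("run=1/"), ManifestItem("run=2/"))
  )

  val result: Option[List[ManifestItem]] = waitForAllBases(
    List(ShreddingComplete("run=1/"), ShreddingComplete("run=2/")),
    () => snapshots.next(),
    maxAttempts = 5
  )
}
```

Because only rows whose base matches an accumulated message are considered, leftover rows from previous runs do not affect the result, which is why the table would not need cleaning before each test.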

Oh, good catch! I guess it should be possible to implement it like that. I will give it a shot 👍

pondzix and others added 12 commits June 26, 2023 11:14

Add azure module with kafka + blob storage implementation

87636e9

Add transformer-kafka module

d7b3ecf

transfomer-kafka: blob storage improvements

7cadd05

transformer-kafka: add auth for writing parquet to Azure Data Lake

8ef1f2d

loader: add azure

b889692

Loader: Add temp creds for Azure

ee9bac5

Add transformer-kafka to CI

8a6a995

Fix azure token provider scope

92807bb

Loader: add postProcess to Kafka consumer

7b9cb44

Loader: integrate Azure Key Vault

c336d8c

Path related fixes on Azure Blob Storage

329bf31

Loader: add tests for Azure configs

0ddbe80

snowplowcla added the cla:yes label Jul 10, 2023

transformer-kafka: add semi-automatic test scenerios using cloud resources

e6ced35

pondzix force-pushed the azure/experiment_tests branch from 64b5ec2 to e6ced35 Compare July 10, 2023 09:18

spenes force-pushed the azure/loader_experiment_tests branch from 6ba622e to e3232f3 Compare July 10, 2023 09:34

spenes changed the title ~~Automated tests for loader~~ loader: testing with cloud Jul 10, 2023

Snowflake Loader: add semi-automatic test scenerios using cloud resources

bdd6cd5

spenes force-pushed the azure/loader_experiment_tests branch from e3232f3 to bdd6cd5 Compare July 10, 2023 09:41

pondzix reviewed Jul 10, 2023


pondzix force-pushed the azure/experiment_tests branch 2 times, most recently from 66fba37 to f574a77 Compare July 12, 2023 10:11

pondzix force-pushed the azure/experiment_tests branch 2 times, most recently from ab5a0f7 to df6372d Compare July 26, 2023 08:20

pondzix force-pushed the azure/experiment_tests branch 2 times, most recently from 21270eb to b28c19b Compare July 31, 2023 09:56
