Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loader: testing with cloud #1285

Open
wants to merge 14 commits into
base: azure/experiment_tests
Choose a base branch
from

Conversation

spenes
Copy link
Contributor

@spenes spenes commented Jul 10, 2023

This PR contains automated tests for Snowflake Loader on Azure.

It brings necessary building blocks to add tests for other destinations and cloud types as well.

Test class structures are constructed similar to transformer automated tests.

@pondzix pondzix force-pushed the azure/experiment_tests branch from 64b5ec2 to e6ced35 Compare July 10, 2023 09:18
@spenes spenes force-pushed the azure/loader_experiment_tests branch from 6ba622e to e3232f3 Compare July 10, 2023 09:34
@spenes spenes changed the title Automated tests for loader loader: testing with cloud Jul 10, 2023
@spenes spenes force-pushed the azure/loader_experiment_tests branch from e3232f3 to bdd6cd5 Compare July 10, 2023 09:41
queueConsumer.read
.map(_.content)
.map(parseShreddingCompleteMessage)
.evalMap(getWindowOutput(_))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So type param A represents here an output of a window produced by transformer. For transformer test scenarios we need more details read from output, whereas for loaders we just need shredding_complete messages.

How about instead of accumulating window outputs, we accumulate shredding_complete messages? Then:

  • for transformer - convert list of accumulated message to output
  • for loader - do nothing, we already have everything we need

Also in the shredding_complete message we already have all the counts needed to terminate test! I think then we don't have to use any type params.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea! I will make necessary changes.


import org.specs2.mutable.Specification

abstract class LoaderSpecification extends Specification with TestDAO.Provider with StorageTargetProvider with AzureTestResources {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't that be CloudResources instead of AzureTestResources?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good spot, I did mistake in there.

def run[A](
inputBatches: List[InputBatch],
countExpectations: CountExpectations,
dbActions: TestDAO => IO[A]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it's mostly about quering stuff from DB... can we name it that way? Actions sounds pretty generic ;) Something like: queryDbOutput? And instead of just A type param I would probably go with DB_OUTPUT. WDYT?

testDAO = createDAO(transaction)
} yield TestResources(queueConsumer = consumer, producer = producer, testDAO = testDAO)

def createDbTransaction(implicit secretStore: SecretStore[IO]): Resource[IO, Transaction[IO, ConnectionIO]] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering whether we need Transaction[IO, ConnectionIO] type here in tests, with all the pooling, retries, transactions handling etc. Wouldn't simple Transactor.fromDriverManager from doobie be sufficient?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to use existing way in order to not duplicate the logic in the tests as well but you are right, it is quite complicated to initialize Transaction and we don't use most of the features in the tests. I will try to use Transactor directly 👍

countExpectations,
dbActions = testDAO =>
for {
manifestItems <- retryUntilNonEmpty(testDAO.queryManifest)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we wait until there is any manifest item present in the DB. Is that condition correct when we have multiple windows produced by transformer?

How about we slightly modify it: we already have here all the messages produced by the transformer. Could we query DB and wait until we find all matching rows for all produced output folders? Try to match base field from shredding_complete message and base field from manifest item. For all accumulated messages. Would that be doable?

Then I think we don't have clean up the table before tests!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good catch! I guess it should be possible to implement it like that. I will give it a shot 👍

@pondzix pondzix force-pushed the azure/experiment_tests branch 2 times, most recently from 66fba37 to f574a77 Compare July 12, 2023 10:11
@pondzix pondzix force-pushed the azure/experiment_tests branch 2 times, most recently from ab5a0f7 to df6372d Compare July 26, 2023 08:20
@pondzix pondzix force-pushed the azure/experiment_tests branch 2 times, most recently from 21270eb to b28c19b Compare July 31, 2023 09:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants