Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

materialize/transactor: cancel load before Destroy() if commit fails #1287

Merged
merged 2 commits into from
Nov 21, 2023

Conversation

mdibaiee
Copy link
Member

@mdibaiee mdibaiee commented Nov 16, 2023

Description:

  • Currently there is a potential for deadlock if the transactor.Destroy() of a connector depends on cancellation of the Load phase. For example, this is the case for sqlserver and snowflake which wait for the database connection established for the load phase to be closed, and this closing of the connection blocks until existing operations are terminated. This can be reproduced by having a connector error during commit phase, while still processing its load phase (See this commit which emulates this situation).
  • This happens as we do not cancel the load phase's context when there is a failure in the commit stage, this means any pending work in the load phase will never be terminated.
  • This pull-request addresses this by allowing us to cancel the context used by the LoadIterator, so we can cancel it when an error is raised during commit phase

Test steps:

This has been tested by:

  1. First reproducing the original issue by running a modified version of sqlserver which raises an error during commit phase, and ensures the Load phase is not terminated before this error is raised. This reliably reproduces the deadlock.
  2. Then, the modified connector was updated to use this branch's transactor implementation, and the deadlock no longer happens. The load phase's context is cancelled and the error propagates up as expected, terminating the connector.

Documentation links affected:

(list any documentation links that you created, or existing ones that you've identified as needing updates, along with a brief description)

Notes for reviewers:

(anything that might help someone review this PR)


This change is Reviewable

@mdibaiee mdibaiee force-pushed the mahdi/transactor-cancel-load branch 3 times, most recently from 6c435b0 to eb4a322 Compare November 17, 2023 12:23
@mdibaiee mdibaiee force-pushed the mahdi/transactor-cancel-load branch from eb4a322 to 9a1bb2b Compare November 17, 2023 12:26
@mdibaiee mdibaiee marked this pull request as ready for review November 17, 2023 12:29
Copy link
Member

@psFried psFried left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % one nit.
I really appreciated the well-written PR description 👍

func (it *LoadIterator) Context() context.Context { return it.stream.Context() }
func (it *LoadIterator) Context() context.Context {
if it.ctx == nil {
it.ctx, it.cancel = context.WithCancel(it.stream.Context())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It feels a bit "tricky" to me to lazily construct the context. IMO it would be better to initialize it when the LoadIterator is constructed in transactor.go. Technically, someone could call Cancel() before calling Context(), and it would be preferable for Context() to return an already-cancelled context in that case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@psFried refactored it

Copy link
Member

@jgraettinger jgraettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too

@mdibaiee mdibaiee merged commit f8ecef8 into master Nov 21, 2023
3 checks passed
@mdibaiee mdibaiee deleted the mahdi/transactor-cancel-load branch November 21, 2023 11:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants