Discuss: blocking DDL with sink_decouple might have bad UX #18765

Open
xxchan opened this issue Sep 30, 2024 · 4 comments

@xxchan
Member

xxchan commented Sep 30, 2024

Recently we made sink_decouple default to true for all sinks (#17095). Before that, we also made blocking DDL the default for sinks (#15587).

These are orthogonal features:

  • sink_decouple means writing to the log store in SinkExecutor
  • blocking DDL means waiting for backfill in StreamScanExecutor

However, the combination feels a little strange to me: with sink decouple and blocking DDL both enabled, the DDL only waits until all data has been backfilled into the log store. When it returns, users may not yet see the results in the sink.
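
To make the concern concrete, here is a toy model in Python. It is only a sketch and does not reflect the actual executor code: `ToySink`, `blocking_create_sink`, and the returned status string are hypothetical names made up for illustration. It assumes "blocking" means nothing more than "backfill has finished writing into the log store".

```python
from collections import deque


class ToySink:
    """Hypothetical stand-in for a decoupled sink (not real executor code)."""

    def __init__(self):
        self.log_store = deque()  # buffer between the stream and the external sink
        self.delivered = []       # rows that actually reached the external system

    def write(self, row):
        # With decoupling, the upstream side only appends to the log store.
        self.log_store.append(row)

    def drain_one(self):
        # A separate consumer delivers rows to the external system later.
        if self.log_store:
            self.delivered.append(self.log_store.popleft())


def blocking_create_sink(sink, snapshot):
    # "Blocking" here only waits until backfill has written every row.
    for row in snapshot:
        sink.write(row)
    return "CREATE_SINK"


sink = ToySink()
status = blocking_create_sink(sink, [1, 2, 3])

# Snapshot of the state at the moment the DDL returns.
pending_at_return = len(sink.log_store)
delivered_at_return = len(sink.delivered)
print(status, pending_at_return, delivered_at_return)
```

In this model the DDL reports success while every row is still pending in the log store and nothing has been delivered yet.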

@github-actions github-actions bot added this to the release-2.1 milestone Sep 30, 2024
@fuyufjh
Member

fuyufjh commented Oct 17, 2024

What's the current behavior of create sink? Blocking or non-blocking?

@xxchan
Member Author

xxchan commented Oct 17, 2024

I think it's controlled by background_ddl, same as for MVs, so it's blocking.

@fuyufjh
Member

fuyufjh commented Oct 17, 2024

I see. IMO, sink_decouple is mostly for decoupling from failures or from the downstream, and should not affect the behavior of background_ddl.

May post this to Slack for wider discussion.

@fuyufjh fuyufjh removed this from the release-2.1 milestone Oct 17, 2024
@xxchan xxchan changed the title from "Discuss: should we use background_ddl when sink_decouple is enabled?" to "Discuss: blocking DDL with sink_decouple might have bad UX" Oct 18, 2024
@fuyufjh
Member

fuyufjh commented Oct 23, 2024

Notes from Slack discussions:

  • Target 1: If an error happens, return the error immediately.
    • Should disable rewind during backfilling.
  • Target 2: If there is no error, wait until all data has been written to the sink before returning success to users.
    • (Preferred) Option 1. Emit a barrier when backfilling is over, and wait for that barrier to complete before returning success to users.
      • Bonus: This idea can also be used to implement a FLUSH for sinks.
    • Option 2. Disable the log store, or set its capacity to 0 or a small number, while backfilling.
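
Option 1 can be sketched with a toy model in Python. This is an illustration under assumptions, not the actual design: `BACKFILL_DONE`, `ToyDecoupledSink`, and `create_sink_and_wait` are hypothetical names, the marker object stands in for the barrier, and the consumer is driven inline instead of running concurrently.

```python
from collections import deque

# Hypothetical marker standing in for the barrier emitted after backfill.
BACKFILL_DONE = object()


class ToyDecoupledSink:
    """Toy model of a sink that reads from a log store (not real code)."""

    def __init__(self):
        self.log_store = deque()
        self.delivered = []        # rows that reached the external system
        self.barrier_seen = False  # set once the consumer reaches the marker

    def write(self, item):
        self.log_store.append(item)

    def drain_one(self):
        item = self.log_store.popleft()
        if item is BACKFILL_DONE:
            self.barrier_seen = True
        else:
            self.delivered.append(item)


def create_sink_and_wait(sink, snapshot):
    # Backfill: write every snapshot row into the log store.
    for row in snapshot:
        sink.write(row)
    # Emit the marker after the last backfilled row.
    sink.write(BACKFILL_DONE)
    # Only report success once the consumer has passed the marker, which
    # implies every earlier row was delivered. A real system would wait on
    # a concurrent consumer; here it is driven inline for simplicity.
    while not sink.barrier_seen:
        sink.drain_one()
    return "CREATE_SINK"


sink = ToyDecoupledSink()
status = create_sink_and_wait(sink, ["a", "b", "c"])
print(status, sink.delivered)
```

Because the marker is queued behind all backfilled rows, waiting on it is enough to know the rows ahead of it were delivered. The same wait could back a FLUSH for sinks, as noted in the bonus above.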
