Make state table visible for creating MV #19031

Open
Tracked by #19084
xxchan opened this issue Oct 21, 2024 · 4 comments

@xxchan
Member

xxchan commented Oct 21, 2024

Currently we hide both MVs and their state tables during backfilling. However, I think there's no need to hide the state tables:

  1. Hiding state tables doesn't bring much benefit compared with hiding the MV (e.g., for consistency), since a state table is an internal thing.
  2. Backfill executors' state tables are only meaningful during the backfill stage. Their content can also serve as an observability tool for backfilling, so it's beneficial to expose them.

related:

@BugenZhao
Member

Since #17503, we already broadcast all table catalogs (both the internal tables and the MV) to the frontend as soon as meta starts the creating procedure. You can verify this by running SHOW INTERNAL TABLES.

Selecting from these internal tables is intentionally disabled: binding only resolves "created" tables.

} else if let Some(table_catalog) =
schema.get_created_table_by_name(table_name)
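
To illustrate why a catalog can show up in SHOW INTERNAL TABLES and still be rejected by a SELECT, here is a minimal sketch of the lookup behaviour described above. Only the method name get_created_table_by_name comes from the quoted snippet; the types StreamJobStatus, TableCatalog and SchemaCatalog, the helper get_any_table_by_name, and the table names in the usage note are simplified, hypothetical stand-ins, not the actual RisingWave definitions.

```rust
use std::collections::HashMap;

/// Hypothetical status flag: whether the stream job has finished creating.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StreamJobStatus {
    Creating,
    Created,
}

/// Hypothetical, heavily simplified catalog entry.
#[derive(Debug, Clone)]
pub struct TableCatalog {
    pub name: String,
    pub status: StreamJobStatus,
}

/// Hypothetical schema-level catalog keyed by table name.
#[derive(Debug, Default)]
pub struct SchemaCatalog {
    tables: HashMap<String, TableCatalog>,
}

impl SchemaCatalog {
    pub fn insert(&mut self, table: TableCatalog) {
        self.tables.insert(table.name.clone(), table);
    }

    /// Assumed helper: sees every catalog, including ones still being
    /// created (roughly what a listing statement would need).
    pub fn get_any_table_by_name(&self, name: &str) -> Option<&TableCatalog> {
        self.tables.get(name)
    }

    /// Same name as in the quoted snippet, but the body is an assumption:
    /// only tables whose job has finished creating are resolved.
    pub fn get_created_table_by_name(&self, name: &str) -> Option<&TableCatalog> {
        self.tables
            .get(name)
            .filter(|t| t.status == StreamJobStatus::Created)
    }
}
```

With this shape, a state table registered as Creating is visible through get_any_table_by_name but not through get_created_table_by_name, so a binder that only uses the latter cannot select from it until the job is created.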

@BugenZhao
Member

However, as described in #18944, the catalogs received by the frontend during creation are incomplete: fields like fragment_id and vnode_count are not correctly filled. Performing a batch scan during this period may lead to problems.

Perhaps this is the real motivation for the refactor of only notifying the complete catalogs once to the frontends, which is the original idea of #18944:

image

@kwannoel
Contributor

> However, as described in #18944, the catalogs received by the frontend during creation are incomplete, with fields like fragment_id or vnode_count not correctly filled. Performing batch scan during this period may lead to problem.
>
> Perhaps this is the real motivation for the refactor of only notifying the complete catalogs once to the frontends, which is the original idea of #18944:
> image

I suppose your proposal is to notify the frontend of the catalogs once the TableFragments are built, since only then will fragment_id and vnode_count be correct. Sounds reasonable to me. Wdyt, @yezizp2012?

This would allow us to expose the backfill's internal state tables for querying.
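
The idea of notifying only complete catalogs could be sketched roughly as follows. This is an illustration under stated assumptions, not the meta service's actual code: PendingTableCatalog, CompleteTableCatalog, fill_from_fragments and catalogs_to_notify are all hypothetical names, and only the fields fragment_id and vnode_count are taken from the discussion.

```rust
/// Hypothetical catalog as it exists before the fragments are built:
/// the fragment-dependent fields are not known yet.
#[derive(Debug, Clone, Default)]
pub struct PendingTableCatalog {
    pub name: String,
    pub fragment_id: Option<u32>,
    pub vnode_count: Option<usize>,
}

/// Hypothetical catalog that is safe to send to frontends.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CompleteTableCatalog {
    pub name: String,
    pub fragment_id: u32,
    pub vnode_count: usize,
}

impl PendingTableCatalog {
    /// Assumed step: called once the fragments for the job are built.
    pub fn fill_from_fragments(&mut self, fragment_id: u32, vnode_count: usize) {
        self.fragment_id = Some(fragment_id);
        self.vnode_count = Some(vnode_count);
    }

    /// Returns a complete catalog only if every required field is filled.
    pub fn try_complete(&self) -> Option<CompleteTableCatalog> {
        Some(CompleteTableCatalog {
            name: self.name.clone(),
            fragment_id: self.fragment_id?,
            vnode_count: self.vnode_count?,
        })
    }
}

/// Returns the batch to notify, or None if any catalog is still incomplete,
/// so frontends never observe a half-filled catalog.
pub fn catalogs_to_notify(pending: &[PendingTableCatalog]) -> Option<Vec<CompleteTableCatalog>> {
    pending.iter().map(PendingTableCatalog::try_complete).collect()
}
```

Encoding the missing fields as Option makes the "not yet correct" state explicit in the type, so a batch scan could not be planned against a catalog that was never completed.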

@kwannoel
Contributor

Tracking this issue as part of #19084, to allow better management of stream job creation.
