[Enhancement] Try refresh Iceberg table on file not found #49551
Signed-off-by: Samrose Ahmed <[email protected]>
Quality Gate failed. Failed conditions: see analysis details on SonarCloud.
[FE Incremental Coverage Report] ❌ fail: 1 / 20 (5.00%)
[BE Incremental Coverage Report] ❌ fail: 3 / 4 (75.00%)
@before-Sunrise @Smith-Cruise please take a look at this PR at your convenience.
@Samrose-Ahmed The code coverage on FE side doesn't look good. Please take a look and check if more cases or sql-tester can be added to increase the coverage.
@stephen-shelby PTAL
Hi, thanks for your contribution. An Iceberg refresh costs far more than a Hive refresh. A Hive table only needs to refresh its cached partitions or files, but an Iceberg catalog needs to refresh all table files if you don't pass a predicate. We provide a background refresh every 10 minutes. If you have strict requirements for timeliness and accuracy, you can disable the Iceberg metadata cache in the catalog properties. The property key is "enable_iceberg_metadata_cache".
That's fine. I solved this for us in a different way, by not physically deleting files so quickly, so it's not critical. You can close this.
So, how did you solve this issue?
We just don't delete so aggressively; it's OK for us to have some stale results.
Why I'm doing:
Currently, if a compaction or deletion happens in Iceberg and a file is physically deleted from S3 while the Iceberg metadata is still cached, queries can fail repeatedly with "Remote file not found".
What I'm doing:
Make sure the RemoteFileNotFound exception is propagated to the FE, and refresh the Iceberg metadata when it is encountered. Also try to replan and retry the query.
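The refresh-and-retry idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `RemoteFileNotFoundException`, `RefreshOnMissingFile`, `runWithRefresh`, and the `maxRetries` parameter are hypothetical names chosen for the example, and the planner and metadata cache are represented by plain callbacks rather than real StarRocks classes.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the error raised when a cached data file no longer exists. */
class RemoteFileNotFoundException extends RuntimeException {
    RemoteFileNotFoundException(String message) {
        super(message);
    }
}

/** Hypothetical helper: retry a query after refreshing stale table metadata. */
final class RefreshOnMissingFile {
    private RefreshOnMissingFile() {}

    /**
     * Runs planAndExecute. If it fails with RemoteFileNotFoundException,
     * calls refreshMetadata and plans again, for at most maxRetries extra attempts.
     */
    static <T> T runWithRefresh(Supplier<T> planAndExecute,
                                Runnable refreshMetadata,
                                int maxRetries) {
        int attempt = 0;
        while (true) {
            try {
                // Planning happens inside the supplier, so each retry replans
                // against whatever metadata is current at that point.
                return planAndExecute.get();
            } catch (RemoteFileNotFoundException e) {
                if (attempt >= maxRetries) {
                    throw e; // refreshing did not help; surface the original error
                }
                attempt++;
                refreshMetadata.run(); // drop or reload the cached file listing
            }
        }
    }
}
```

Bounding the number of retries matters here because, as noted in the discussion, an Iceberg refresh without a predicate has to reload all table files, so an unbounded loop could be expensive.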
Fixes #issue