[Bug]: GC causing "org.apache.iceberg.exceptions.NotFoundException: File does not exist" #9749
Comments
Maybe this is related as well: #8263
@dorsegal can you provide more details, or better yet, a reproducer?
My setup uses Kafka Iceberg Connect.
I can provide more logs if needed, I just don't know which ones. From the GC logs I can see that it deleted some files.
What I meant is a full reproducer that lists every step from scratch, so that someone can get the same behavior on a "clean"/empty environment.
failed:
After expiring snapshots in Spark SQL with CALL nessie.system.expire_snapshots('nessie.robot_dev.robot_data', TIMESTAMP '2024-10-15 00:00:00.000', 1), the snapshot count dropped and manifest files were deleted. But the Nessie metadata may not have picked up the snapshot changes.
@yunlou11 thanks for the information. But what are all the necessary steps to get to that error message? In other words, what was done before running GC?
Sorry, it may be an Iceberg error, not a Nessie one:
Iceberg issue:
When I fixed Iceberg
What happened
After I ran GC, I started getting a "File does not exist" error.
It looks like the file was deleted from storage but is still referenced in the metadata.
apache/iceberg#8338
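The symptom described above (metadata still pointing at a file that no longer exists) can be illustrated with a minimal sketch. It uses no Nessie or Iceberg API: the function name, the snapshot ids, and the paths are all hypothetical, and in a real check the inputs would have to come from the table metadata and an object-store listing, which is not shown here.

```python
from typing import Dict, Iterable, List, Set


def find_dangling_files(
    referenced_by_snapshot: Dict[int, Iterable[str]],
    existing_files: Set[str],
) -> Dict[int, List[str]]:
    """Return, per snapshot id, the referenced paths missing from storage.

    Hypothetical helper: it only compares two in-memory collections.
    """
    dangling: Dict[int, List[str]] = {}
    for snapshot_id, paths in referenced_by_snapshot.items():
        missing = sorted(p for p in paths if p not in existing_files)
        if missing:
            dangling[snapshot_id] = missing
    return dangling


# Hypothetical example data: snapshot 1 still references files that a
# cleanup run has already removed from storage.
referenced = {
    1: ["s3://bucket/tbl/metadata/snap-1.avro", "s3://bucket/tbl/data/a.parquet"],
    2: ["s3://bucket/tbl/metadata/snap-2.avro", "s3://bucket/tbl/data/b.parquet"],
}
storage = {
    "s3://bucket/tbl/metadata/snap-2.avro",
    "s3://bucket/tbl/data/b.parquet",
}

result = find_dangling_files(referenced, storage)
```

Any snapshot that shows up in the result is one a reader would fail on with a "file does not exist" style error, which matches what is reported here. It does not say which tool deleted the files.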
How to reproduce it
Nessie server type (docker/uber-jar/built from source) and version
kubernetes 0.99.0
Client type (Ex: UI/Spark/pynessie ...) and version
Spark
Additional information
No response