What is the correct sequence of maintenance steps to run on an Iceberg table? Our tables are write-once-read-many so I am not sure if I need to run rewrite_position_delete_files or not.
I am not very experienced with Iceberg, so I wanted to check whether I am running the steps in the correct order or whether I am missing something.
Right now my sequence looks like this:
rewrite_data_files
rewrite_manifests
expire_snapshots
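For readers who want to see the sequence concretely, here is a minimal sketch of the three steps expressed as Iceberg Spark SQL procedure calls. The catalog name `glue_catalog` and the table `db.events` are hypothetical placeholders, and only the `table` argument is passed; the sketch shows the ordering, not tuned options.

```python
from typing import List


def maintenance_calls(catalog: str, table: str) -> List[str]:
    """Build the CALL statements in the order listed above."""
    procedures = ["rewrite_data_files", "rewrite_manifests", "expire_snapshots"]
    return [
        f"CALL {catalog}.system.{proc}(table => '{table}')"
        for proc in procedures
    ]


# Hypothetical catalog and table names, used only for illustration.
calls = maintenance_calls("glue_catalog", "db.events")
for stmt in calls:
    print(stmt)
    # With a live SparkSession named `spark`, each statement could be run
    # via spark.sql(stmt).show() to see the procedure's result row.
```

Building the statements as plain strings keeps the order explicit and lets the same list be logged or executed one at a time.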
I read on a blog that I don't have to run rewrite_manifests before or after rewrite_data_files, because rewrite_data_files rewrites manifests itself, but the results I am seeing seem to contradict that.
With the sequence above, this is what the results of the rewrite_data_files and rewrite_manifests calls look like.
Result of rewrite_data_files

| rewritten_data_files_count | added_data_files_count | rewritten_bytes_count | failed_data_files_count |
| --- | --- | --- | --- |
| 13507 | 2371 | 4307611669 | 0 |
Result of rewrite_manifests

| rewritten_manifests_count | added_manifests_count |
| --- | --- |
| 29 | 25 |
P.S. My intention is to have only one snapshot at the end of the process so I am providing older_than in expire_snapshots.
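As an illustration of the `older_than` usage mentioned in the P.S., the sketch below formats a cutoff timestamp into an `expire_snapshots` call. The catalog and table names and the cutoff date are hypothetical. The `retain_last => 1` argument is spelled out for clarity; to my understanding it matches the procedure's default, but that is worth confirming against the Iceberg version in use.

```python
from datetime import datetime


def expire_snapshots_call(catalog: str, table: str,
                          older_than: datetime, retain_last: int = 1) -> str:
    """Format an expire_snapshots CALL with an explicit cutoff timestamp."""
    # Spark TIMESTAMP literal with millisecond precision.
    ts = older_than.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]
    return (
        f"CALL {catalog}.system.expire_snapshots("
        f"table => '{table}', "
        f"older_than => TIMESTAMP '{ts}', "
        f"retain_last => {retain_last})"
    )


# Hypothetical names and cutoff; in practice the cutoff would be a time
# after the rewrite steps have committed.
stmt = expire_snapshots_call("glue_catalog", "db.events",
                             datetime(2024, 1, 15, 12, 30, 0))
print(stmt)
```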
Query engine: Spark, AWS Glue