DFS: Update pilot config for the compaction feature #232
What does this Pull Request accomplish?
Adds a modest Iceberg engine for pilot deployments.
Why should this Pull Request be merged?
To isolate reads from writes at runtime, and to provide Dremio with enough resources to compact and add data to tables of modest size, we need to provision an Iceberg execution engine. We also need to adjust the number of concurrent compaction and ingestion jobs we ask Dremio to handle so that we don't overwhelm the modest Iceberg engine we're deploying for pilots.
We shouldn't merge this until we're ready to enable compaction by default.
What testing has been done?
I verified this Iceberg engine can handle a burst of 30 concurrent reference table writers. That is, 30 threads, each writing 200k rows in 4k-row batches to its own table with 200 FLOAT64 columns. All data was compacted and made available for query within 10 minutes of ingestion starting.
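For a rough sense of scale, the burst above can be sized with some back-of-the-envelope arithmetic. This is only a sketch: it assumes "4k batches" means 4,000 rows per batch and counts 8 bytes per FLOAT64 value before any encoding or compression, so the byte figure is raw payload rather than on-disk size. The constant and function names are illustrative and do not come from the pilot config.

```python
# Back-of-the-envelope sizing of the test burst described above.
WRITERS = 30                # concurrent reference table writers
ROWS_PER_WRITER = 200_000   # rows written by each writer
COLUMNS = 200               # FLOAT64 columns per table
BATCH_ROWS = 4_000          # assumption: "4k batches" = 4,000 rows per batch
BYTES_PER_FLOAT64 = 8       # uncompressed width of one value
WINDOW_SECONDS = 10 * 60    # all data was queryable within 10 minutes


def burst_summary():
    """Return aggregate figures for the whole burst."""
    total_rows = WRITERS * ROWS_PER_WRITER
    total_values = total_rows * COLUMNS
    batches_per_writer = ROWS_PER_WRITER // BATCH_ROWS
    return {
        "total_rows": total_rows,
        "total_values": total_values,
        "raw_bytes": total_values * BYTES_PER_FLOAT64,
        "batches_per_writer": batches_per_writer,
        "total_batches": batches_per_writer * WRITERS,
        # Lower bound on average end-to-end throughput, since the
        # 10-minute window is an upper bound on elapsed time.
        "min_rows_per_second": total_rows / WINDOW_SECONDS,
    }


if __name__ == "__main__":
    for name, value in burst_summary().items():
        print(f"{name}: {value:,}")
```

Under those assumptions the burst is 6 million rows (about 9.6 GB of raw FLOAT64 payload) arriving in 1,500 batches, which the engine ingested and compacted at an average of at least 10,000 rows per second.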