DFS: Update pilot config for the compaction feature #232

Open
wants to merge 2 commits into
base: main
from

Conversation

pvallone
Contributor

What does this Pull Request accomplish?

Adds a modest Iceberg engine for pilot deployments.

Why should this Pull Request be merged?

To isolate reads from writes at runtime, and to provide Dremio with enough resources to compact and add data to tables of modest size, we need to provision an Iceberg execution engine. We also need to adjust the number of concurrent compaction and ingestion jobs we ask Dremio to handle so that we don't overwhelm the modest Iceberg engine we're deploying for pilots.

We shouldn't merge this until we're ready to enable compaction by default.

What testing has been done?

I verified that this Iceberg engine can handle a burst of 30 concurrent reference table writers: 30 threads, each writing 200k rows in 4k-row batches to its own table with 200 FLOAT64 columns. All data was compacted and available for query within 10 minutes of ingestion starting.
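
For reviewers who want a feel for the size of that burst, here is a rough back-of-the-envelope sketch. It is not part of the change itself. It assumes "200k" means 200,000 rows, "4k" means 4,000 rows per batch, and that each FLOAT64 value takes 8 bytes before any encoding or compression. The constant and function names are made up for illustration.

```python
# Rough sizing of the burst test described in the testing notes.
# Assumptions (not stated in the PR): 200k = 200,000 rows, 4k = 4,000 rows
# per batch, and 8 bytes per FLOAT64 value before encoding/compression.

WRITERS = 30                 # concurrent writer threads, one table each
ROWS_PER_WRITER = 200_000    # rows written by each thread
COLUMNS = 200                # FLOAT64 columns per table
BATCH_ROWS = 4_000           # rows per write batch (assumed reading of "4k")
BYTES_PER_FLOAT64 = 8
WINDOW_SECONDS = 10 * 60     # everything was queryable within 10 minutes


def burst_summary() -> dict:
    """Return raw workload figures implied by the test parameters."""
    total_rows = WRITERS * ROWS_PER_WRITER
    raw_bytes = total_rows * COLUMNS * BYTES_PER_FLOAT64
    batches_per_writer = ROWS_PER_WRITER // BATCH_ROWS
    return {
        "batches_per_writer": batches_per_writer,
        "total_batches": batches_per_writer * WRITERS,
        "total_rows": total_rows,
        "raw_bytes": raw_bytes,
        "raw_bytes_per_batch": BATCH_ROWS * COLUMNS * BYTES_PER_FLOAT64,
        # 10 minutes is an upper bound on elapsed time, so these two
        # rates are lower bounds on the sustained end-to-end throughput.
        "min_rows_per_second": total_rows / WINDOW_SECONDS,
        "min_raw_bytes_per_second": raw_bytes / WINDOW_SECONDS,
    }


if __name__ == "__main__":
    for name, value in burst_summary().items():
        print(f"{name}: {value:,}")
```

Under those assumptions the burst comes to 6 million rows, or about 9.6 GB of raw values, arriving as 1,500 batches of roughly 6.4 MB each. The on-disk size after Parquet encoding and compression would likely be smaller.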

pvallone self-assigned this on Nov 20, 2024
@pvallone
Contributor Author

Reminder not to merge this until we're ready to enable compaction by default.

2 participants