Add Garbage Collection on the remote store. #63

Open
coffeegoddd opened this issue Jul 24, 2023 · 0 comments


A customer using DoltLab has requested garbage collection for remote data. The goal is to potentially improve DoltLab performance (lower latency) for certain use cases. Related to this issue.

Currently, DoltLab and DoltHub do not remove stale or unused table-files from remote storage, and this may negatively impact the performance of both applications.

A possible near-term solution would be to add a button to DoltHub/DoltLab that lets database admins manually trigger a "remote store garbage collection" Job. If admins notice performance issues on a DoltLab instance (or on a specific database), they could run the Job, which would conjoin the table-files of the remote store into a single, large table-file containing all the database data and then remove the superfluous table-files.
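To make the proposed Job concrete, here is a rough sketch of the conjoin-then-delete flow. This is not Dolt's actual storage code: `TableFile`, `RemoteStore`, and `conjoin_and_collect` are hypothetical names, chunks are modeled as a plain hash-to-bytes mapping, and I'm assuming the store keeps a manifest listing the table-files that are still referenced, so anything outside the manifest counts as stale.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TableFile:
    """Hypothetical stand-in for a table-file: chunk hash -> chunk bytes."""
    name: str
    chunks: dict[str, bytes]


@dataclass
class RemoteStore:
    """Hypothetical remote store: a manifest of live files plus all stored files."""
    manifest: list[str]
    files: dict[str, TableFile] = field(default_factory=dict)


def conjoin_and_collect(store: RemoteStore, new_name: str = "conjoined") -> list[str]:
    """Merge every manifest-referenced table-file into one, then drop the rest.

    Returns the names of the table-files that were removed.
    """
    merged: dict[str, bytes] = {}
    for name in store.manifest:
        # Chunks are keyed by hash, so duplicates across files collapse here.
        merged.update(store.files[name].chunks)

    # Write the new file and point the manifest at it before deleting anything,
    # so the store never references a file that no longer exists.
    store.files[new_name] = TableFile(new_name, merged)
    store.manifest = [new_name]

    removed = sorted(n for n in store.files if n != new_name)
    for name in removed:
        del store.files[name]
    return removed
```

The ordering (write the conjoined file, swap the manifest, delete last) is a design choice in this sketch so that a Job failing partway leaves old files behind rather than a broken manifest. A real Job would also have to handle concurrent pushes while it runs.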

Implementing such a button on DoltHub is more complicated, as it would require changes to the billing pipeline.
