Is your feature request related to a problem? Please describe.
Uploading a dataset to the Hub comprises multiple independent steps. The first step creates an entry in our database, whereas any subsequent steps upload the actual files to the Hub's storage backend. If any of these subsequent steps fails, the dataset will be visible on the Hub to its owner, but it will never be marked as ready and remains trapped in limbo.
When you delete such a dataset, you will never be able to recreate a dataset with the same name because Polaris uses soft-deletion and artifact names are unique.
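For illustration, a minimal sketch of that flow; `create_entry`, `upload_file`, and `mark_ready` are hypothetical names, not the actual Polaris client API:

```python
# Minimal sketch of the multi-step upload flow described above.
# All names here are hypothetical, not Polaris' actual API; the point
# is only to show where a failure strands a dataset in limbo.

def upload_dataset(client, dataset):
    entry = client.create_entry(dataset.name)  # step 1: database row exists
    for file in dataset.files:
        client.upload_file(entry, file)        # later steps: any may fail
    client.mark_ready(entry)                   # never reached after a failure
```

If `upload_file` raises partway through, the database entry exists but `mark_ready` is never called, which is exactly the limbo state described above.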
Describe the solution you'd like
For any dataset that has been created in the Hub's database, but for which some of the file uploads failed, it should be possible to retry the failed uploads. It should also be clearly communicated that this is the recommended next step once an upload fails.
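Reusing the hypothetical names from the sketch above, the retry could look roughly like this; `get_entry` and `missing_files` are likewise made-up names:

```python
# Hypothetical sketch of the proposed retry: the database entry already
# exists, so only the files that never arrived are re-uploaded.

def retry_failed_uploads(client, dataset):
    entry = client.get_entry(dataset.name)     # reuse the existing row
    for file in client.missing_files(entry):   # only what failed before
        client.upload_file(entry, file)
    client.mark_ready(entry)                   # finally flip to "ready"
```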
Describe alternatives you've considered
Alternatively, we could:
- Switch from soft-deletes to hard-deletes. That way, a user could simply delete a failed upload and try again. My worry is that this leads to a worse overall user experience, because data cannot be recovered if it is accidentally deleted.
- Update our uniqueness mechanism so that slugs only need to be unique across non-deleted artifacts. This could get technically complex: for example, what if a user wants to recover a deleted artifact but has created a new dataset with the same name in the meantime? A sketch of this approach follows below.
The main blocker for hard deletes is that we would need to clean up the bucket data, which we don't do today. Once we have a deletion workflow that cleans up everything, a hard delete is no longer a problem.
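As a sketch of the second alternative: in PostgreSQL, uniqueness across non-deleted rows only can be expressed as a partial unique index. The table and column names below are assumptions about the Hub's schema, not its actual layout:

```python
# Sketch of slug uniqueness restricted to non-deleted artifacts, using
# SQLAlchemy with a PostgreSQL partial unique index. Table and column
# names (artifacts, slug, deleted_at) are assumptions, not our schema.

from sqlalchemy import Column, DateTime, Index, Integer, String, text
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Artifact(Base):
    __tablename__ = "artifacts"

    id = Column(Integer, primary_key=True)
    slug = Column(String, nullable=False)
    deleted_at = Column(DateTime, nullable=True)  # NULL means "live"

    __table_args__ = (
        # Only live rows participate in the uniqueness constraint, so a
        # soft-deleted artifact no longer blocks re-use of its slug.
        Index(
            "uq_artifacts_slug_live",
            "slug",
            unique=True,
            postgresql_where=text("deleted_at IS NULL"),
        ),
    )
```

The recovery conflict mentioned above is then exactly the case where un-deleting a row would violate this index, so recovery would need its own conflict handling (e.g. renaming one of the two artifacts).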
Additional context
This issue came up in #147