Add studio flag for rm-dataset and edit-dataset #572
base: main
This adds support for the `--studio` flag to the `edit-dataset` and `rm-dataset` commands. If the `--studio` flag is passed, the Studio client is used to process the operation.

Some examples:
- `datachain rm-dataset "new_test_dataset" --studio --version 1`
- `datachain edit-dataset png_files --studio --new-name new_dataset_name`

TODO:
- Add test

Studio PR: iterative/studio#10890
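For readers who want to see the flag routing in isolation, here is a minimal stand-alone sketch. It is not the datachain CLI code: the parser, the `dispatch` helper, and the string it returns are hypothetical, and only the command names and the `--studio`, `--version`, and `--new-name` flags come from the description above.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the two commands from the PR description."""
    parser = argparse.ArgumentParser(prog="datachain-sketch")
    sub = parser.add_subparsers(dest="command", required=True)

    rm = sub.add_parser("rm-dataset")
    rm.add_argument("name")
    rm.add_argument("--version", type=int, default=None)
    rm.add_argument("--studio", action="store_true")

    edit = sub.add_parser("edit-dataset")
    edit.add_argument("name")
    edit.add_argument("--new-name", default=None)
    edit.add_argument("--studio", action="store_true")
    return parser


def dispatch(argv: List[str]) -> str:
    """Report which backend would handle the call (assumed: 'studio' or 'local')."""
    args = build_parser().parse_args(argv)
    target = "studio" if args.studio else "local"
    return f"{args.command}:{target}:{args.name}"
```

With this sketch, passing `--studio` selects the Studio path and omitting it falls back to the local path.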
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #572 +/- ##
==========================================
+ Coverage 87.83% 87.84% +0.01%
==========================================
Files 100 100
Lines 9993 10024 +31
Branches 1356 1363 +7
==========================================
+ Hits 8777 8806 +29
+ Misses 873 872 -1
- Partials 343 346 +3
This can be a follow-up, but let's rename all these commands (similar to `datachain ds ls`).
We need to figure out docs for this as well.
    studio: bool = False,
    team: Optional[str] = None,
):
    if studio:
[Q] Is there a reason for deviating from the behaviour implemented in #561? Why can't a user edit/remove a dataset from both local and remote in the same call?
For the list operation, listing datasets from both local and remote seems like a frequent need. But I am not sure how helpful it would be to edit or remove a dataset from both local and remote in the same call.
I was also looking at the implementation of `pull`, where the `ds://` prefix is used to identify whether a dataset is remote or local. I don't know whether it would be confusing to users to implement that in edit and remove as well, though. cc @shcheklein
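To make the `ds://` idea concrete, a prefix check along these lines could decide between remote and local. This is only a sketch: `split_dataset_ref` and `REMOTE_PREFIX` are hypothetical names, and the actual `pull` implementation may parse the reference differently.

```python
from typing import Tuple

# Assumption: a leading "ds://" marks a remote (Studio) dataset, as described for pull.
REMOTE_PREFIX = "ds://"


def split_dataset_ref(ref: str) -> Tuple[bool, str]:
    """Return (is_remote, name) for a dataset reference."""
    if ref.startswith(REMOTE_PREFIX):
        return True, ref[len(REMOTE_PREFIX):]
    return False, ref
```

Applied to edit and remove, `ds://png_files` would target the remote dataset and a bare `png_files` the local one, which is the part that might be confusing to users.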
> But I am not sure how helpful it would be to edit or remove a dataset from both local and remote in the same call.

If you think about this in terms of #578, users would definitely want to be able to rename both at the same time. We want to avoid, as much as possible, users ending up with datasets that share an identifier but are named differently between local and remote.
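As an illustration of renaming both sides in one call, here is a rough sketch with in-memory dictionaries standing in for the local and remote catalogs. Everything in it is hypothetical (`rename_everywhere`, the store layout, the identifiers); it only shows the "check first, then rename everywhere" shape, not how datachain or Studio store datasets.

```python
from typing import Dict, List


def rename_everywhere(
    stores: Dict[str, Dict[str, str]], name: str, new_name: str
) -> List[str]:
    """Rename `name` to `new_name` in every store that has it.

    `stores` maps a store label (e.g. "local", "studio") to a
    {dataset_name: dataset_id} mapping. Returns the labels that changed.
    """
    # Check for conflicts up front so either all stores change or none do.
    for label, store in stores.items():
        if new_name in store:
            raise ValueError(f"{new_name!r} already exists in {label}")

    changed = []
    for label, store in stores.items():
        if name in store:
            store[new_name] = store.pop(name)
            changed.append(label)
    return changed
```

After one call, the same identifier carries the same name in both stores, which is the consistency this comment is asking for.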
    body["labels"] = labels  # type: ignore[assignment]

return self._send_request(
    "datachain/edit-dataset",
Related to the grouping of commands, etc.: let's please make the API clean as well:
- `datachain/datasets/ls`
- `datachain/datasets/rename`
- etc.

`rm-dataset` and similar look very weird for a REST API.
It could even be a single endpoint that lists on `GET`, renames via `PUT` (assuming the dataset name is not part of the resource URI, as `PUT` has to be idempotent), and removes via `DELETE`.
Since a dataset is part of a team, ideally the URI should be `/datachain/<team>/datasets`.
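A rough sketch of what such a single endpoint could look like, using a plain function and an in-memory set per team instead of a real web framework. The `handle` function, the body keys `name` and `new_name`, and the status codes are assumptions for illustration, not the Studio API.

```python
from typing import Any, Dict, Optional, Set, Tuple


def handle(
    teams: Dict[str, Set[str]],
    method: str,
    path: str,
    body: Optional[Dict[str, str]] = None,
) -> Tuple[int, Any]:
    """Dispatch on HTTP method for the single resource /datachain/<team>/datasets."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "datachain" or parts[2] != "datasets":
        return 404, "unknown resource"
    names = teams.get(parts[1])
    if names is None:
        return 404, "unknown team"
    body = body or {}

    if method == "GET":
        return 200, sorted(names)
    if method == "PUT":
        # The dataset name travels in the body, not the URI, so repeating
        # the same request leaves the same end state.
        old, new = body["name"], body["new_name"]
        if old in names:
            if new in names and new != old:
                return 409, "name already taken"
            names.remove(old)
            names.add(new)
            return 200, new
        if new in names:
            return 200, new  # repeat of a rename that was already applied
        return 404, "unknown dataset"
    if method == "DELETE":
        names.discard(body["name"])  # deleting twice is a no-op
        return 204, None
    return 405, "method not allowed"
```

Keeping the name out of the URI is what lets a repeated `PUT` succeed with the same result instead of failing because the old name no longer resolves.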
Added this comment to #576 as well. We can take it up as part of the refactor.