-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Import tqdm to support jupyter better #812
Conversation
2dd0683
to
cd2d902
Compare
Deploying datachain-documentation with Cloudflare Pages
|
cd2d902
to
3c2e0cf
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #812 +/- ##
==========================================
+ Coverage 87.42% 87.65% +0.22%
==========================================
Files 128 128
Lines 11373 11333 -40
Branches 1537 1530 -7
==========================================
- Hits 9943 9934 -9
+ Misses 1049 1018 -31
Partials 381 381
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
We should not use progress bars inside |
we have a single progress bar there (per udf), we don't pass it to all the subprocesses - so it works fine in CLI. We run a wrapper process though to avoid main, that process controls progress (not processes that actually run UDFs) and that process doesn't have access to JUpyter. |
We do have some UDFs that create progress bars - such as ArrowGenerator and maybe Listing? |
yep, may be. But even in that case this PR is most likely an improvement. We can decide and review and remove some internal progress bars. I'm not sure why we use / need them. |
from tqdm import tqdm | ||
|
||
from datachain.utils import env2bool | ||
from tqdm.auto import tqdm | ||
|
||
logger = logging.getLogger(__name__) | ||
tqdm.set_lock(RLock()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this too when we can.
Maybe too much detail, but tqdm
does locking by default which is multiprocess-safe. But since we overwrite this, this is only thread-safe.
dvc-data
does reset it, so we'll be overriding it anyway as we are importing it someplace. But I hope we can remove it someday there. I have no idea why this was done in dvc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I have a suspicion that this is what causes progressbars to jump in dvc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dvc-data
fixed in iterative/dvc-data#592.
Partially addresses #780
Screen.Recording.2025-01-12.at.7.30.04.PM.mov
What I don't know how to fix yet is progress bars when we use
parallel
and we run the whole thing in the subprocess. In this case child process doesn't have access to Ipython widgets (or whatever the mechanics is behind these nice progress bars).