torch-loader(example): use prefetch and try to run example in linux #691

Merged
skshetry merged 1 commit into main from run-torch-loader
Dec 11, 2024

Conversation

skshetry
Member

No description provided.


cloudflare-workers-and-pages bot commented Dec 11, 2024

Deploying datachain-documentation with Cloudflare Pages

Latest commit: 59c7aa6
Status: ✅  Deploy successful!
Preview URL: https://4ce804fc.datachain-documentation.pages.dev
Branch Preview URL: https://run-torch-loader.datachain-documentation.pages.dev


codecov bot commented Dec 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.31%. Comparing base (6ca3c98) to head (59c7aa6).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #691      +/-   ##
==========================================
- Coverage   87.34%   87.31%   -0.03%     
==========================================
  Files         113      113              
  Lines       10791    10791              
  Branches     1479     1479              
==========================================
- Hits         9425     9422       -3     
- Misses        989      991       +2     
- Partials      377      378       +1     
Flag Coverage Δ
datachain 87.24% <ø> (-0.03%) ⬇️


num_workers=2,
batch_size=25,
num_workers=4,
multiprocessing_context=multiprocessing.get_context("spawn"),
Member Author


fsspec's loop is not fork-safe. Even though we create a new loop for each forked process, s3fs and other filesystems may not be fork-safe.

See https://s3fs.readthedocs.io/en/latest/#multiprocessing.

This causes the future returned by run_coroutine_threadsafe to hang forever during prefetch.
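
To illustrate the pattern involved, here is a minimal sketch (not fsspec's or datachain's actual code) of submitting a coroutine to an event loop running in a background thread. The names `start_background_loop`, `fetch`, and `run_sync` are hypothetical. If nothing is running the loop, which is the situation a forked child can end up in because only the forking thread survives a fork, the future never completes; passing a timeout to `result()` at least turns the hang into an error.

```python
import asyncio
import concurrent.futures
import threading


def start_background_loop():
    """Start a new event loop in a daemon thread and return the loop."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    return loop


async def fetch(value):
    # Hypothetical stand-in for an async filesystem call.
    await asyncio.sleep(0.01)
    return value * 2


def run_sync(loop, coro, timeout=5.0):
    """Submit a coroutine to the loop and wait for its result.

    Without a timeout, result() blocks forever if no thread is running the loop.
    """
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        future.cancel()
        raise
```

A timeout only makes the failure visible; it does not make the underlying filesystem fork-safe.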

I could contribute a fix for the first problem, but the second problem would still remain.
Also, Python 3.14 is changing the default start method on POSIX systems (except macOS, which already uses 'spawn') to 'forkserver'. See python/cpython#84559.

So, I think it's better to recommend using a different start method.
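
As a rough sketch of that recommendation, the helper below builds the keyword arguments from the diff above with an explicit 'spawn' context. The helper name `spawn_loader_kwargs` is hypothetical, and the defaults simply mirror the values in this change. The context is only added when workers are used, since, to my understanding, torch's DataLoader rejects a multiprocessing context when num_workers is 0.

```python
import multiprocessing


def spawn_loader_kwargs(num_workers=4, batch_size=25):
    """Build DataLoader keyword arguments that avoid the 'fork' start method."""
    kwargs = {"batch_size": batch_size, "num_workers": num_workers}
    if num_workers > 0:
        # Workers are started as fresh interpreters instead of forked copies,
        # so they do not inherit the parent's event loop or filesystem state.
        kwargs["multiprocessing_context"] = multiprocessing.get_context("spawn")
    return kwargs
```

It would be used as `DataLoader(dataset, **spawn_loader_kwargs())`. With 'spawn', the dataset and any collate function must be picklable, and the script's entry point should sit under an `if __name__ == "__main__":` guard.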

@skshetry skshetry requested review from mattseddon and a team December 11, 2024 08:00
@skshetry skshetry marked this pull request as ready for review December 11, 2024 08:01
@skshetry skshetry merged commit e8812dd into main Dec 11, 2024
34 checks passed
@skshetry skshetry deleted the run-torch-loader branch December 11, 2024 08:47