fix(python): Cancel and Start again within 1s caused module not found [v2] #5007
Merged
Specifically, the nmslib installation was hanging without any output, even though the library build had failed. This can be observed with strace or catp.
…/windmill into fix-py-non-exiting-install
👍 Looks good to me! Reviewed everything up to bc71804 in 2 minutes and 1 second
More details
- Looked at 115 lines of code in 1 file
- Skipped 0 files when reviewing.
- Skipped posting 3 drafted comments based on config settings.
1. backend/windmill-worker/src/python_executor.rs:1359
- Draft comment:
Consider adding error handling to ensure the lock is released in case of an error during the installation process. This will prevent potential deadlocks.
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment misunderstands Rust's RAII principles. The Mutex guard is released automatically when it goes out of scope, whether the code completes normally or returns an error. This is handled by Rust's drop system. The _lock binding keeps the lock held for the entire scope, so no additional error handling is needed to release it.
Could there be some edge cases where the lock might not be released properly that I'm not considering? What if the process is killed abruptly?
If the process is killed abruptly (for example with SIGKILL), destructors do not run, but the futures::lock::Mutex lives only in the process's memory and disappears with the process, so no stale lock can outlive it. On ordinary error returns, and when the future holding the guard is dropped, the guard is dropped too and the lock is released. The comment is suggesting unnecessary additional error handling.
The comment should be deleted as it suggests unnecessary error handling. Rust's RAII system already guarantees the lock will be released when the guard goes out of scope, even in error cases.
2. backend/windmill-worker/src/python_executor.rs:1528
- Draft comment:
Consider adding error handling to ensure the lock is released in case of an error during the installation process. This will prevent potential deadlocks.
- Reason this comment was not posted:
Marked as duplicate.
3. backend/windmill-worker/src/python_executor.rs:1778
- Draft comment:
Consider adding error handling to ensure the lock is released in case of an error during the installation process. This will prevent potential deadlocks.
- Reason this comment was not posted:
Marked as duplicate.
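The RAII argument used to dismiss the first draft comment can be checked with a small sketch. It uses std::sync::Mutex as a stand-in for the futures::lock::Mutex mentioned in the review, and the names INSTALL_LOCK, install and lock_is_free are hypothetical, not taken from python_executor.rs.

```rust
use std::sync::Mutex;

// Hypothetical global lock; std::sync::Mutex stands in for the async mutex.
static INSTALL_LOCK: Mutex<()> = Mutex::new(());

/// Simulated install step that may fail after taking the lock.
fn install(fail: bool) -> Result<(), String> {
    // `_lock` keeps the guard alive until the end of this scope.
    let _lock = INSTALL_LOCK.lock().map_err(|e| e.to_string())?;
    if fail {
        // Early return: the guard is dropped here, releasing the lock.
        return Err("build failed".to_string());
    }
    Ok(())
}

/// Reports whether the lock can be taken right now.
fn lock_is_free() -> bool {
    INSTALL_LOCK.try_lock().is_ok()
}
```

Calling install(true) and then lock_is_free() should show the lock available again, with no explicit unlock on the error path.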
When a job is cancelled, it can take up to 1 second before the cancellation actually takes effect, because we poll the database every 1s to ping the job and find out whether it has been cancelled. The directory holding the wheel in Windmill's cache is therefore cleaned only after that window. If a new job starts during that period, Windmill may see that the wanted wheel is already there (because it has not been cleaned yet) and record it among the installed wheels, while the previous job then cleans that wheel up. That is why some users could get a "module not found" error when restarting a job rapidly.
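The interleaving described above can be written out as a deterministic toy model. This is only a sketch: the cache is modelled as a HashSet, the function name restart_within_window and the wheel name are illustrative, and the serialized branch assumes the new job cannot inspect the cache until the cancelled job's cleanup has finished.

```rust
use std::collections::HashSet;

/// Models job B starting inside the cancellation window of job A.
/// Returns true if the wheel is still in the cache when job B runs.
fn restart_within_window(serialized: bool) -> bool {
    let wheel = "nmslib".to_string();
    let mut cache: HashSet<String> = HashSet::new();

    // Job A left a wheel directory in the cache before being cancelled.
    cache.insert(wheel.clone());
    // Job A's cleanup has not run yet (up to ~1s after the cancel request).
    let mut cleanup_pending = true;

    if serialized {
        // Assumption: job B waits until job A's cleanup is done.
        cache.remove(&wheel);
        cleanup_pending = false;
    }

    // Job B checks the cache and reuses the wheel if it looks installed.
    let reused = cache.contains(&wheel);
    if !reused {
        // Job B installs the wheel itself.
        cache.insert(wheel.clone());
    }

    // Without serialization, job A's late cleanup removes the reused wheel.
    if cleanup_pending {
        cache.remove(&wheel);
    }

    cache.contains(&wheel)
}
```

In this model the unserialized run ends with the wheel missing, which corresponds to the "module not found" error, while the serialized run ends with a freshly installed wheel.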
Important
Fix race condition in python_executor.rs by serializing UV installations with a mutex to prevent 'module not found' errors when canceling and restarting jobs quickly.
- Add BUSY_WITH_UV_INSTALL mutex in python_executor.rs to serialize UV installations, preventing race conditions when canceling and restarting jobs quickly.
- Update handle_python_reqs() to lock UV installations, ensuring that a new job does not interfere with the cleanup of a previous job's resources.
- Update handle_python_reqs() to properly await stderr before checking process exit status, preventing hang-ups.
- Remove -q flag from spawn_uv_install() command arguments.
This description was created by Ellipsis for bc71804. It will automatically update as commits are pushed.