@batch uses only the first 64 threads #83
Comments
Hmm. What do these return?
using Polyester
Polyester.num_threads()
length(Polyester.ThreadingUtilities.TASKS) |
The two most likely places this can be going wrong:
|
Sorry about repeatedly closing this issue. Apparently |
I get:
julia> using Polyester
julia> Polyester.num_threads()
static(128)
julia> length(Polyester.ThreadingUtilities.TASKS)
127
So it seems that it correctly detects the number of threads. |
I finally found some time to try to debug this issue. The source of the problem seems to be the function
I also spotted some weird behaviors caused by this line: since
Should I create pull requests for my fixes? |
It is called in a loop here: Lines 159 to 172 in 2926208
Yeah, that definitely looks like a bug/like I wrote it assuming |
I just tried to run the original code where I first encountered this issue, and my fixes didn't work. Turns out that somehow I was using the
The node has 2 sockets, each with a 64-core CPU (AMD EPYC 7H12), and because of hyperthreading there are 2 threads per core. Therefore I am on an x86_64 architecture, and by looking at
But the problem is in the function itself: It only considers the first element of the bit mask tuple since |
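To make the failure mode above concrete, here is a minimal sketch of a thread mask stored as a tuple of 64-bit words, one bit per worker thread. The function names and the mask layout are assumptions made for illustration only; this is not Polyester's actual code. It only shows why reading just the first tuple element caps the usable thread count at 64.

```julia
# Hypothetical illustration, not Polyester's implementation:
# a mask with one bit per worker thread, split across 64-bit words.

# Buggy variant: inspects only the first word, so it can never see more than 64 threads.
count_first_word(mask::NTuple{N,UInt64}) where {N} = count_ones(first(mask))

# Fixed variant: adds up the set bits of every word in the tuple.
count_all_words(mask::NTuple{N,UInt64}) where {N} = sum(count_ones, mask)

# 127 worker threads: 64 bits set in the first word, 63 in the second.
mask = (typemax(UInt64), typemax(UInt64) >> 1)

println(count_first_word(mask))  # 64
println(count_all_words(mask))   # 127
```

With 127 worker tasks, as reported earlier in the thread, the first variant sees 64 of them and the second sees all 127.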
At one point in time, it was important to support WINE, but I believe this is no longer the case. We can switch back. Feel free to make a PR reverting main to version 0.1.8 or so. |
Ah, you're right. That's definitely a bug. |
I am currently working on a 128-core machine (64 cores x 2 sockets), and I noticed that Polyester.jl only uses the first 64 available threads:
Notice how threads 65 to 128 were ignored by @batch but not by Threads.@threads. I get similar results with hyperthreading enabled, using 256 threads.
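One way to check which threads each macro actually uses is to record `Threads.threadid()` per iteration and compare the sets. The following is a rough sketch, not the snippet from the original report; the helper names `ids_with_threads` and `ids_with_batch` are made up for this example, and it assumes Polyester is installed and Julia was started with multiple threads (for example `julia -t 128`).

```julia
using Polyester  # third-party package; assumed to be installed

# Record which thread runs each iteration under Threads.@threads.
# (hypothetical helper, named for this example only)
function ids_with_threads(n)
    ids = zeros(Int, n)
    Threads.@threads :static for i in 1:n
        ids[i] = Threads.threadid()
    end
    return sort!(unique(ids))
end

# Same measurement under Polyester's @batch.
# (hypothetical helper, named for this example only)
function ids_with_batch(n)
    ids = zeros(Int, n)
    @batch for i in 1:n
        ids[i] = Threads.threadid()
    end
    return sort!(unique(ids))
end

n = Threads.nthreads()
println("Threads.@threads used: ", ids_with_threads(n))
println("@batch used:           ", ids_with_batch(n))
```

On an affected machine, the second list would be expected to stop at thread 64 while the first covers every thread.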
I am quite sure that this also affects LoopVectorization.jl, since I get the same timing for some very simple benchmarks:
This gives ~3.0 µs with either 64 or 128 threads.
I suppose that this is related to the behaviors mentioned in #22. In any case, I would be happy to help resolve this issue.