-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[sombrero] Handle systems where number of cores doesn't divide lattice volume #250
Conversation
benchmarks/apps/sombrero/sombrero.py
Outdated
@@ -122,4 +137,7 @@ def set_up_from_parameters(self): | |||
|
|||
@run_after('setup') | |||
def setup_num_tasks(self): | |||
self.num_tasks = self.current_partition.processor.num_cores * 64 | |||
self.num_tasks = max_num_tasks( | |||
self.current_partition.processor.num_cores * 64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure whether 64
should multiply self.current_partition.processor.num_cores
or the result of max_num_tasks
. In the latter case we'd greatly oversubscribe the node, right? But that was already the case before, so probably that was it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the latter case we'd greatly oversubscribe the node, right? But that was already the case before, so probably that was it?
I think this test was intended to run on 64 nodes so it shouldn't be oversubscribed. Maybe we should use the num_tasks_per_node
option instead and calculate the max_num_tasks
according to LATTICE_VOLUME / 64
to ensure even load balancing and that we actually use 64 nodes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, setting num_tasks_per_node
is cleaner in this case, I did that, thanks!
benchmarks/apps/sombrero/sombrero.py
Outdated
|
||
from benchmarks.apps.sombrero import case_filter | ||
from benchmarks.modules.reframe_extras import scaling_config | ||
from benchmarks.modules.utils import SpackTest | ||
|
||
# Fixed lattice volume in ITT benchmarks | ||
LATTICE_VOLUME = 32 * 24 * 24 * 24 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the Sombrero README, I think this is only the lattice volume in the case -s small
is passed to the executable. For the 64 node test, -s medium
is currently passed so the lattice volume is
Having said that, I think this will probably still work fine since the prime factor decomposition of these numbers are both of the form
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I defined LATTICE_VOLUME_SMALL
and LATTICE_VOLUME_MEDIUM
to distinguish the two cases.
I agree that this is extremely unlikely to be a problem so I have no issues with it. |
cbdf998
to
6497528
Compare
6497528
to
34a729e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with these changes other than my 1 small comment.
I have tested that the ITT-sn
one works on the CSD3 Icelakes but unfortunately can't test the ITT-64n
one as this exceeds the max job size (I did test it worked by replacing 64
with 48
).
benchmarks/apps/sombrero/sombrero.py
Outdated
LATTICE_VOLUME_MEDIUM // 64, | ||
) | ||
self.num_tasks = self.num_tasks_per_node * 64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One very tiny comment. Could you replace the two uses of the magic number 64
with a variable e.g. num_nodes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! Can you please check it works for you as expected?
LGTM (but I can't approve it until you re-request a review). I have tested this works on the CSD3 Icelakes as expected (by passing |
Thanks! |
Should fix #237 (comment). I followed the suggestion in #237 (comment). Probably this isn't the most efficient algorithm out there, but this takes a bunch of microseconds on my laptop, so hopefully it won't be a major performance bottleneck:
CC: @mirenradia.