-
Notifications
You must be signed in to change notification settings - Fork 234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update default behavior for max threads PA async mode #737
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pushing this PR! Great work! Could you add context to the PR description? I don't understand why 16 was the default before rather than the max concurrency. I don't understand why the logic is not consistent across concurrency values and why this change was necessary.
Also, it would be nice to clean up the prior code if possible. 16 is a magic number. We should have a default somewhere (e.g. DEFAULT_MIN_THREADS) that had an explanation of why that number was chosen. It'd also be good to then carry the logic of adding context/naming through to the unit tests to make it clear what they are testing.
@@ -1210,7 +1267,7 @@ TEST_CASE("Testing Command Line Parser") | |||
|
|||
CheckValidRange( | |||
args, option_name, parser, act, exp->is_using_periodic_concurrency_mode, | |||
exp->periodic_concurrency_range); | |||
exp->periodic_concurrency_range, exp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't these two arguments redundant? Why do you need to pass both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i will clean this up.
@@ -1173,6 +1176,60 @@ TEST_CASE("Testing Command Line Parser") | |||
|
|||
check_params = false; | |||
} | |||
|
|||
SUBCASE("concurrency-range.end < 16") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you name these in a way that makes it clear not only what they are testing but why. So something like "default thread count for max concurrency < 16" and "non-default thread count for max-concurrency > 16". There might be better naming since I am missing context here, but the naming should make it obvious what is being tested and why (i.e. the context) to someone without that context without going back to the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, i will change the name.
Based on a comment from @nv-hwoo, it sounds like this fixes async mode whereas sync mode already had similar behavior. If so, can you please specify that in the PR title and description? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the feedback! Added a few comments that could make the test cases even better.
@@ -1126,12 +1129,12 @@ TEST_CASE("Testing Command Line Parser") | |||
|
|||
exp->using_concurrency_range = true; | |||
exp->concurrency_range.start = 100; | |||
exp->max_threads = 16; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want to use the constant instead of the magic number here.
exp->using_concurrency_range = true; | ||
exp->concurrency_range.start = 10; | ||
exp->concurrency_range.end = 10; | ||
exp->max_threads = 16; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Constant instead of magic number here as well.
"16") | ||
{ | ||
args.push_back(option_name); | ||
args.push_back("10:40"); // start |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the start comment there for a reason in these test cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an already existing comment. As far as i remember, I didn't add it. Let me recheck.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
exp->using_concurrency_range = true; | ||
exp->concurrency_range.start = 10; | ||
exp->concurrency_range.end = 40; | ||
exp->max_threads = 40; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you set this to equal the concurrency range end? That avoids a magic number situation here as well.
Actually, can you set the concurrency_range_start and concurrency_range_end at the beginning and use it in place of every number for all of these tests? That way all the numbers are guaranteed to be in sync (can be changed from one place) and it's immediately obvious when reading it what each number is. Whenever possible, we want to avoid using magic numbers, especially in a situation like this where the number is used more than once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will modify according to comments in a bit.
Currently, max_threads in PA is by default set to 16 in async mode. While using genai-perf, concurrency value can be passed as a CLI argument. However, the max_threads used by PA is still 16 although concurrency is set to a higher value. This change sets max_threads to concurrency if concurrency > 16. If concurrency <= 16, max_threads is by default set to 16.