-
Notifications
You must be signed in to change notification settings - Fork 529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Orchagent] Add optional create_switch timeout parameter #3258
Conversation
@prsunny can you help review? @kcudnik this is the alternative way to allow some low performance switch run higher SONiC version |
Change adds optional create_switch timeout as command line parameter to orchagent. Signed-off-by: Pavan Naregundi <[email protected]>
dvs.runcmd("cp /usr/bin/orchagent.sh /usr/bin/orchagent.sh_zmq_ut_backup") | ||
dvs.runcmd("sed -i.bak 's/\/usr\/bin\/orchagent /\/usr\/bin\/orchagent -q tcp:\/\/127.0.0.1:8100 /g' /usr/bin/orchagent.sh") | ||
dvs.runcmd("sed -i.bak 's/\/usr\/bin\/orchagent /\/usr\/bin\/orchagent -q tcp:\/\/127.0.0.1:8100 -t 60 /g' /usr/bin/orchagent.sh") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this specific test is changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the change is in main() passing a new argument to orchagent, found we that we are doing a similar argument change in this test case. This change does not affect the behavior of the test as timeout value we are sending is still same as default(60 sec). But helps to cover the UT for new code change.
|
||
void usage() | ||
{ | ||
cout << "usage: orchagent [-h] [-r record_type] [-d record_location] [-f swss_rec_filename] [-j sairedis_rec_filename] [-b batch_size] [-m MAC] [-i INST_ID] [-s] [-z mode] [-k bulk_size] [-q zmq_server_address] [-c mode]" << endl; | ||
cout << "usage: orchagent [-h] [-r record_type] [-d record_location] [-f swss_rec_filename] [-j sairedis_rec_filename] [-b batch_size] [-m MAC] [-i INST_ID] [-s] [-z mode] [-k bulk_size] [-q zmq_server_address] [-c mode] [-t create_switch_timeout]" << endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we planning to change the existing timeout? How will this be controlled from user perspective?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This option provides the flexibility to override the existing timeout, it is optional because we most of platforms' performance are good enough to create switch within default timeout.
The timeout setting can be controlled by platform designer, e.g. https://github.com/sonic-net/sonic-buildimage/pull/19928/files
@pavannaregundi can you create PR for 202311 as there is conflict? |
ADO: 29208990 |
Backport of PR sonic-net#3258 Signed-off-by: Rajkumar P R <[email protected]>
switch_create must take less than 30 seconds on warm boot scenario to not cause dataplane interruption, higher values are not acceptable in production at all |
Cherry-pick PR to 202405: #3324 |
What I did Change adds optional create_switch timeout as command line parameter to orchagent. Why I did it Older platforms are seeing increase in time required in bookworm based branches.
Backport of PR #3258 Signed-off-by: Rajkumar P R <[email protected]>
Change adds optional create_switch timeout as command line parameter to orchagent.
What I did
Change adds optional create_switch timeout as command line parameter to orchagent.
Why I did it
Older platforms are seeing increase in time required in bookworm based branches.
How I verified it
Changes are verified to see passed timeout value is reflected only for create_switch api call from orchagent.
Details if related
Migrating some older armhf platform from SONiC 202311 to master and 202405 branch, resulted in syncd create_switch timeout.
It is analyzed to see that SAI SDK code remains the same and time consume by SAI SDK during init when some(snmp mgmt-framework lldp gnmi dhcp_relay radv teamd pmon bgp) of the docker in sonic are disabled is same as running on 202311(with same environment).
Some of the other observations are: