Added SmartSwitch support in chassisd and enabling chassisd #467

Open: 137 commits to merge into base: master (changes shown from 12 commits)

Commits
0785bb1
Added SmartSwitch support in chassisd and enabling chassisd for fixed
rameshraghupathy Apr 15, 2024
c4725c4
Made temp fix to avoid chassisd tests passing. Need to handle it using
rameshraghupathy Apr 28, 2024
d1a6b2a
The test_chassisd needs to updated for smartswitch. The change is a
rameshraghupathy Apr 29, 2024
af4ff55
Addressing review comments
rameshraghupathy May 31, 2024
356fa8b
Merge remote-tracking branch 'upstream/master'
rameshraghupathy Jun 4, 2024
97b6518
Fixed the merge conflict breakage in "asic_table"
rameshraghupathy Jun 5, 2024
60cb8ed
checking test_chassisd.py is ok
rameshraghupathy Jun 5, 2024
cd6db9b
Trying to resolve test failure
rameshraghupathy Jun 5, 2024
8e958ba
Using is_smartswitch in MockChassis
rameshraghupathy Jun 10, 2024
fb7b4ac
Disabling the test until the dependency is committed
rameshraghupathy Jun 10, 2024
312129f
Addressing review comments
rameshraghupathy Jun 11, 2024
ce3bc30
Addressed review comments: 1. removed SWITCH module 2. Restored
rameshraghupathy Jul 9, 2024
ca1fe7c
As per one of the review comments created derived class for SmartSwitch
rameshraghupathy Aug 5, 2024
b2ac82c
Uncommented the previously commented blocks in test_chassisd
rameshraghupathy Aug 5, 2024
3713261
Temp workaround until
rameshraghupathy Aug 5, 2024
29c82d2
Setting slot to 'N/A' for smartswitch
rameshraghupathy Sep 1, 2024
a34457f
Fixed is_smartswitch and a line at the EOF
rameshraghupathy Sep 2, 2024
2c68e62
Testing
rameshraghupathy Sep 2, 2024
4d335ba
Improving coverage
rameshraghupathy Sep 2, 2024
1f96225
Adding tests for DPU
rameshraghupathy Sep 2, 2024
68d8671
debugging
rameshraghupathy Sep 2, 2024
28d09af
Fixed test issues
rameshraghupathy Sep 2, 2024
67e24a9
Fixing test issues
rameshraghupathy Sep 2, 2024
36dbf92
fixing ut
rameshraghupathy Sep 2, 2024
7b6b094
fixed a typo
rameshraghupathy Sep 2, 2024
8d7f96f
Adding ut for smartswitch config change events
rameshraghupathy Sep 2, 2024
4e4f4de
Trying to improve coverage
rameshraghupathy Sep 2, 2024
8daf0bb
Tuning to improve coverage
rameshraghupathy Sep 2, 2024
6f02d44
workign on coverage
rameshraghupathy Sep 2, 2024
77d3901
working on coverage
rameshraghupathy Sep 2, 2024
e0a5570
Fixed syntax issues in ut
rameshraghupathy Sep 2, 2024
5833cfb
working on coverage
rameshraghupathy Sep 2, 2024
b26c238
Improving coverage
rameshraghupathy Sep 2, 2024
c08a656
task_worker can not be tested in this workflow
rameshraghupathy Sep 2, 2024
c0cc783
Fixed some minor issues
rameshraghupathy Sep 2, 2024
bbdc876
Adding more coverage
rameshraghupathy Sep 2, 2024
a54724c
Minor fix
rameshraghupathy Sep 2, 2024
e7df894
Minor fixes
rameshraghupathy Sep 2, 2024
a181d2f
Fixed minor errors
rameshraghupathy Sep 2, 2024
b6a8dde
Adding test for task_worker
rameshraghupathy Sep 2, 2024
3c35dd8
Fixing test failure
rameshraghupathy Sep 2, 2024
c1c0c9c
Debugging ut
rameshraghupathy Sep 2, 2024
c58459f
Testing
rameshraghupathy Sep 3, 2024
aecd959
Testing
rameshraghupathy Sep 3, 2024
65b37f4
Testing
rameshraghupathy Sep 3, 2024
df25c41
testing
rameshraghupathy Sep 3, 2024
7aa6c87
Testing
rameshraghupathy Sep 3, 2024
360cfd2
testing
rameshraghupathy Sep 3, 2024
7ee5c84
testing
rameshraghupathy Sep 3, 2024
ef931cd
testing
rameshraghupathy Sep 3, 2024
a74b278
Testing
rameshraghupathy Sep 3, 2024
523ea43
Testing
rameshraghupathy Sep 3, 2024
4f763c4
testing
rameshraghupathy Sep 3, 2024
332139c
Testing
rameshraghupathy Sep 3, 2024
53dbb67
Did some cosmetic cleanup
rameshraghupathy Sep 3, 2024
c295660
Addressed some review comments, added cleanup for smartswitch, added
rameshraghupathy Sep 26, 2024
655be48
Will add the set_initial_dpu_admin_state as a function in the next push
rameshraghupathy Sep 26, 2024
cd483bd
Fixed the test cases as per the modified code for
rameshraghupathy Sep 27, 2024
4b200c7
Added a function to set dpu initial admin status
rameshraghupathy Oct 1, 2024
49868a5
Improving coverage
rameshraghupathy Oct 1, 2024
b366b2d
Trying to improve coverage
rameshraghupathy Oct 1, 2024
82066d1
Fixed a typo
rameshraghupathy Oct 1, 2024
43f3661
Working on coverage
rameshraghupathy Oct 1, 2024
f62d54c
Fixed a typo
rameshraghupathy Oct 1, 2024
001a985
Fixing sytax issues
rameshraghupathy Oct 1, 2024
fa7fa71
Adding a test to improve coverage
rameshraghupathy Oct 1, 2024
c92d234
Assigned localy module_updater to daemon.module_updater
rameshraghupathy Oct 2, 2024
9554bb3
Fixed a syntax error
rameshraghupathy Oct 2, 2024
6ef579e
Resolving syntax errors
rameshraghupathy Oct 2, 2024
482d9c8
Fixing test issues
rameshraghupathy Oct 2, 2024
84c6812
Fixing test failure
rameshraghupathy Oct 2, 2024
b31be6f
Debugging
rameshraghupathy Oct 2, 2024
dd91702
Debugging
rameshraghupathy Oct 2, 2024
c7d02f6
Debugging
rameshraghupathy Oct 2, 2024
52a5007
Debugging
rameshraghupathy Oct 2, 2024
598220b
trying 2 tests
rameshraghupathy Oct 2, 2024
6e90f3b
Debugging
rameshraghupathy Oct 2, 2024
2fbcd37
Debugging
rameshraghupathy Oct 2, 2024
007d1a8
Trying to improve coverage
rameshraghupathy Oct 2, 2024
cb90f02
Debugging test
rameshraghupathy Oct 2, 2024
9a74992
Fixed a typo
rameshraghupathy Oct 2, 2024
1342d19
fixed a test failure
rameshraghupathy Oct 2, 2024
9eeda1d
debugging test
rameshraghupathy Oct 2, 2024
9e25701
mocking get_module
rameshraghupathy Oct 2, 2024
6e29e88
Removed CHASSIS_MODULE_INFO_ASICS for smartswitch
rameshraghupathy Oct 2, 2024
2ae658d
Added a docstring and updated a comment
rameshraghupathy Oct 11, 2024
a55ecac
Fixed a thread issue and removed all locks
rameshraghupathy Oct 19, 2024
54accfe
Added support to persist reboot-cause, user defined reboot timeout,
rameshraghupathy Oct 22, 2024
466f6d3
Added get_reboot_cause() to mock_platform.py
rameshraghupathy Oct 23, 2024
43e6b61
Added the necessary changes for dark-mode dpu initial admin status,
rameshraghupathy Oct 29, 2024
5a81fc7
Fixing test issues
rameshraghupathy Oct 29, 2024
53ce3a6
Added a mock for file open
rameshraghupathy Oct 30, 2024
094c0ad
working on test coverage
rameshraghupathy Oct 30, 2024
55fd678
Initialized previous reboot time
rameshraghupathy Oct 30, 2024
c424271
Fixed time format
rameshraghupathy Oct 30, 2024
90e9825
For some reason the test thinks that "reboot_cause" is a dict. So
rameshraghupathy Oct 30, 2024
ee11414
checking if the file_path exists before creating a new symlink
rameshraghupathy Oct 30, 2024
153962b
Adding some error handling
rameshraghupathy Oct 30, 2024
465bf31
Adding error handling
rameshraghupathy Oct 30, 2024
1de545e
Adding error handling
rameshraghupathy Oct 30, 2024
d2969bd
Adding error handling
rameshraghupathy Oct 30, 2024
89825b8
Improving coverage
rameshraghupathy Oct 30, 2024
4e91144
Adding test cases for coverage
rameshraghupathy Oct 30, 2024
ad3fc74
fixed dpu_reboot_timeout
rameshraghupathy Oct 30, 2024
529fdcd
working on coverage
rameshraghupathy Oct 30, 2024
d19cc75
working on coverage
rameshraghupathy Oct 30, 2024
8421109
Merge branch 'master' into master
rameshraghupathy Nov 1, 2024
760b73d
Resolved merge conflicts
rameshraghupathy Nov 3, 2024
196c674
Resolved indentation issue after merge
rameshraghupathy Nov 3, 2024
b1ad5b7
reboot-cause tested as per the modified design
rameshraghupathy Nov 4, 2024
c0cfea6
Resolving merge related test failures
rameshraghupathy Nov 4, 2024
bda9d11
Fixed merge related issues and added dpu state update on init as well
rameshraghupathy Nov 4, 2024
0e8e6f3
Added get_my_slot() to mock-platform
rameshraghupathy Nov 4, 2024
ad47e68
Fixing test failure
rameshraghupathy Nov 4, 2024
130bcb7
Resolving test failures due to merge
rameshraghupathy Nov 4, 2024
9c09a49
Working on coverage
rameshraghupathy Nov 5, 2024
d5462fd
Updated reboot-cause and dpu-state sections for smartswitch. Also,
rameshraghupathy Nov 11, 2024
9a2657c
Taking slot and supervisor-slot out of smartswitch
rameshraghupathy Nov 11, 2024
4f3236e
The show chassis modules status uses the slot field
rameshraghupathy Nov 11, 2024
3b98eb7
working on coverage
rameshraghupathy Nov 12, 2024
1134008
fixing sytax
rameshraghupathy Nov 12, 2024
a84f012
fixing test issues
rameshraghupathy Nov 12, 2024
6984ab6
fixed a test sytax issue
rameshraghupathy Nov 12, 2024
318960d
using the original mock_open instead of lambda
rameshraghupathy Nov 12, 2024
d2cfee1
Fixed some syntax issues in the tests
rameshraghupathy Nov 12, 2024
52b37f9
Trying to fix test errors
rameshraghupathy Nov 12, 2024
5ad1343
Experimenting mock_open
rameshraghupathy Nov 13, 2024
bad6e88
Trying to fix the mock_open issue
rameshraghupathy Nov 13, 2024
334a473
woking on mock_open
rameshraghupathy Nov 13, 2024
1850ff7
Trying a different approach
rameshraghupathy Nov 13, 2024
1f2a3f6
Adding more tests
rameshraghupathy Nov 13, 2024
26552de
Debugging test failure
rameshraghupathy Nov 13, 2024
a1ded43
Debugging test issue
rameshraghupathy Nov 13, 2024
b254b23
Fixing syntax
rameshraghupathy Nov 13, 2024
981bad6
Fixed a pathh issue
rameshraghupathy Nov 13, 2024
635f1c8
Working on coverage
rameshraghupathy Nov 13, 2024
ee5574c
Fixed indentation issue
rameshraghupathy Nov 13, 2024
244 changes: 132 additions & 112 deletions sonic-chassisd/scripts/chassisd
@@ -148,11 +148,13 @@ class ModuleConfigUpdater(logger.Logger):
def module_config_update(self, key, admin_state):
if not key.startswith(ModuleBase.MODULE_TYPE_SUPERVISOR) and \
not key.startswith(ModuleBase.MODULE_TYPE_LINE) and \
not key.startswith(ModuleBase.MODULE_TYPE_FABRIC):
not key.startswith(ModuleBase.MODULE_TYPE_FABRIC) and \
not key.startswith(ModuleBase.MODULE_TYPE_DPU):
self.log_error("Incorrect module-name {}. Should start with {} or {} or {} or {}".format(key,
ModuleBase.MODULE_TYPE_SUPERVISOR,
ModuleBase.MODULE_TYPE_LINE,
ModuleBase.MODULE_TYPE_FABRIC))
ModuleBase.MODULE_TYPE_FABRIC,
ModuleBase.MODULE_TYPE_DPU))
return

module_index = try_get(self.chassis.get_module_index, key, default=INVALID_MODULE_INDEX)
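A note on the prefix check in this hunk: the chain of `not key.startswith(...)` clauses and the error message have to be kept in sync by hand each time a module type is added. A minimal sketch of one way to drive both from a single tuple follows. The constants are stand-ins whose values follow `tests/mock_module_base.py`, and the helper names `is_valid_module_name` and `invalid_name_message` are hypothetical, not part of chassisd.

```python
# Stand-ins for the ModuleBase constants (values as in tests/mock_module_base.py).
MODULE_TYPE_SUPERVISOR = "SUPERVISOR"
MODULE_TYPE_LINE = "LINE-CARD"
MODULE_TYPE_FABRIC = "FABRIC-CARD"
MODULE_TYPE_DPU = "DPU"

# Single source of truth for the accepted module-name prefixes.
VALID_PREFIXES = (
    MODULE_TYPE_SUPERVISOR,
    MODULE_TYPE_LINE,
    MODULE_TYPE_FABRIC,
    MODULE_TYPE_DPU,
)


def is_valid_module_name(key):
    """Return True if key starts with one of the known module-type prefixes."""
    # str.startswith accepts a tuple and matches any of its members.
    return key.startswith(VALID_PREFIXES)


def invalid_name_message(key):
    """Build the error text from the same tuple, so it cannot drift."""
    return "Incorrect module-name {}. Should start with {}".format(
        key, " or ".join(VALID_PREFIXES))
```

With this shape, adding a new module type would only touch `VALID_PREFIXES`.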
@@ -166,6 +168,8 @@ class ModuleConfigUpdater(logger.Logger):
# Setting the module to administratively up/down state
self.log_info("Changing module {} to admin {} state".format(key, 'DOWN' if admin_state == MODULE_ADMIN_DOWN else 'UP'))
try_get(self.chassis.get_module(module_index).set_admin_state, admin_state, default=False)
else:
self.log_warning("Invalid admin_state value: {}".format(admin_state))

#
# Module Updater ==============================================================
@@ -184,7 +188,7 @@ class ModuleUpdater(logger.Logger):
self.chassis = chassis
Collaborator commented:

I think that instead of modifying the existing ModuleUpdater class, we should implement SmartSwitchModuleUpdater(ModuleUpdater). SmartSwitchModuleUpdater should be derived from ModuleUpdater and override the methods that should behave differently for the Smart Switch. This approach keeps the original implementation untouched and guarantees full backward compatibility for chassisd.

Author (rameshraghupathy) replied on Jul 19, 2024:

Will consider this.

Author replied:

Implemented this change.
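
The subclassing approach discussed in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: `ModuleUpdater` here is a trimmed stand-in rather than the real class, the string return values exist only to make the behavior observable, and `make_module_updater` and `FakeChassis` are hypothetical helpers.

```python
class ModuleUpdater:
    """Trimmed stand-in for the existing modular-chassis updater."""

    def __init__(self, chassis, my_slot, supervisor_slot):
        self.chassis = chassis
        self.my_slot = my_slot
        self.supervisor_slot = supervisor_slot

    def _is_supervisor(self):
        return self.my_slot == self.supervisor_slot

    def module_down_chassis_db_cleanup(self):
        # Only the supervisor cleans up the chassis DB.
        if not self._is_supervisor():
            return "skipped"
        return "cleaned"


class SmartSwitchModuleUpdater(ModuleUpdater):
    """Overrides only the methods that behave differently on a SmartSwitch."""

    def _is_supervisor(self):
        # A SmartSwitch has no supervisor role.
        return False

    def module_down_chassis_db_cleanup(self):
        # No chassis DB cleanup on a SmartSwitch.
        return "skipped"


def make_module_updater(chassis, my_slot, supervisor_slot):
    """Hypothetical factory: pick the updater class once, at start-up."""
    if chassis.is_smartswitch():
        return SmartSwitchModuleUpdater(chassis, my_slot, supervisor_slot)
    return ModuleUpdater(chassis, my_slot, supervisor_slot)


class FakeChassis:
    """Minimal test double exposing only is_smartswitch()."""

    def __init__(self, smartswitch):
        self._smartswitch = smartswitch

    def is_smartswitch(self):
        return self._smartswitch
```

The point of the design is that the platform check happens once in the factory, instead of `is_smartswitch()` branches being spread through the base class.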

self.my_slot = my_slot
self.supervisor_slot = supervisor_slot
self.num_modules = chassis.get_num_modules()
self.num_modules = self.chassis.get_num_modules()
# Connect to STATE_DB and create chassis info tables
state_db = daemon_base.db_connect("STATE_DB")
self.chassis_table = swsscommon.Table(state_db, CHASSIS_INFO_TABLE)
@@ -197,15 +201,17 @@ class ModuleUpdater(logger.Logger):
CHASSIS_MODULE_INFO_OPERSTATUS_FIELD]

self.chassis_state_db = daemon_base.db_connect("CHASSIS_STATE_DB")
if self._is_supervisor():
self.asic_table = swsscommon.Table(self.chassis_state_db,
CHASSIS_FABRIC_ASIC_INFO_TABLE)
else:
self.asic_table = swsscommon.Table(self.chassis_state_db,
CHASSIS_ASIC_INFO_TABLE)

if not self.chassis.is_smartswitch():
if self._is_supervisor():
self.asic_table = swsscommon.Table(self.chassis_state_db,
CHASSIS_FABRIC_ASIC_INFO_TABLE)
else:
self.asic_table = swsscommon.Table(self.chassis_state_db,
CHASSIS_ASIC_INFO_TABLE)

self.hostname_table = swsscommon.Table(self.chassis_state_db, CHASSIS_MODULE_HOSTNAME_TABLE)
self.module_reboot_table = swsscommon.Table(self.chassis_state_db, CHASSIS_MODULE_REBOOT_INFO_TABLE)
self.module_reboot_table = swsscommon.Table(self.chassis_state_db, CHASSIS_MODULE_REBOOT_INFO_TABLE)
self.down_modules = {}
self.chassis_app_db_clean_sha = None

@@ -237,7 +243,7 @@ class ModuleUpdater(logger.Logger):
self.chassis_table._del(CHASSIS_INFO_KEY_TEMPLATE.format(1))

if self.asic_table is not None:
if not self._is_supervisor():
if not self._is_supervisor() or self.chassis.is_smartswitch():
asics = list(self.asic_table.getKeys())
for asic in asics:
self.asic_table._del(asic)
@@ -280,10 +286,12 @@ class ModuleUpdater(logger.Logger):

if not key.startswith(ModuleBase.MODULE_TYPE_SUPERVISOR) and \
not key.startswith(ModuleBase.MODULE_TYPE_LINE) and \
not key.startswith(ModuleBase.MODULE_TYPE_DPU) and \
not key.startswith(ModuleBase.MODULE_TYPE_FABRIC):
self.log_error("Incorrect module-name {}. Should start with {} or {} or {}".format(key,
self.log_error("Incorrect module-name {}. Should start with {} or {} or {} or {}".format(key,
ModuleBase.MODULE_TYPE_SUPERVISOR,
ModuleBase.MODULE_TYPE_LINE,
ModuleBase.MODULE_TYPE_DPU,
ModuleBase.MODULE_TYPE_FABRIC))
continue

@@ -296,37 +304,37 @@ class ModuleUpdater(logger.Logger):
prev_status = self.get_module_current_status(key)
self.module_table.set(key, fvs)

# Construct key for down_modules dict. Example down_modules key format: LINE-CARD0|<hostname>
fvs = self.hostname_table.get(key)
if isinstance(fvs, list) and fvs[0] is True:
fvs = dict(fvs[-1])
hostname = fvs[CHASSIS_MODULE_INFO_HOSTNAME_FIELD]
down_module_key = key+'|'+hostname
else:
down_module_key = key+'|'

if module_info_dict[CHASSIS_MODULE_INFO_OPERSTATUS_FIELD] != str(ModuleBase.MODULE_STATUS_ONLINE):
if prev_status == ModuleBase.MODULE_STATUS_ONLINE:
notOnlineModules.append(key)
# Record the time when the module down was detected to track the
# module down time. Used for chassis db cleanup for all asics of the module if the module is down for a
# long time like 30 mins.
# All down modules including supervisor are added to the down modules dictionary. This is to help
# identifying module operational status change. But the clean up will not be attempted for supervisor

if down_module_key not in self.down_modules:
self.log_warning("Module {} went off-line!".format(key))
self.down_modules[down_module_key] = {}
self.down_modules[down_module_key]['down_time'] = time.time()
self.down_modules[down_module_key]['cleaned'] = False
continue
else:
# Module is operational. Remove it from down time tracking.
if down_module_key in self.down_modules:
self.log_notice("Module {} recovered on-line!".format(key))
del self.down_modules[down_module_key]
elif prev_status != ModuleBase.MODULE_STATUS_ONLINE:
self.log_notice("Module {} is on-line!".format(key))
if not self.chassis.is_smartswitch():
# Construct key for down_modules dict. Example down_modules key format: LINE-CARD0|<hostname>
fvs = self.hostname_table.get(key)
if isinstance(fvs, list) and fvs[0] is True:
fvs = dict(fvs[-1])
hostname = fvs[CHASSIS_MODULE_INFO_HOSTNAME_FIELD]
down_module_key = key+'|'+hostname
else:
down_module_key = key+'|'

if module_info_dict[CHASSIS_MODULE_INFO_OPERSTATUS_FIELD] != str(ModuleBase.MODULE_STATUS_ONLINE):
if prev_status == ModuleBase.MODULE_STATUS_ONLINE:
notOnlineModules.append(key)
# Record the time when the module down was detected to track the
# module down time. Used for chassis db cleanup for all asics of the module if the module is down for a
# long time like 30 mins.
# All down modules including supervisor are added to the down modules dictionary. This is to help
# identifying module operational status change. But the clean up will not be attempted for supervisor
if down_module_key not in self.down_modules:
self.log_warning("Module {} went off-line!".format(key))
self.down_modules[down_module_key] = {}
self.down_modules[down_module_key]['down_time'] = time.time()
self.down_modules[down_module_key]['cleaned'] = False
continue
else:
# Module is operational. Remove it from down time tracking.
if down_module_key in self.down_modules:
self.log_notice("Module {} recovered on-line!".format(key))
del self.down_modules[down_module_key]
elif prev_status != ModuleBase.MODULE_STATUS_ONLINE:
self.log_notice("Module {} is on-line!".format(key))

module_cfg_status = self.get_module_admin_status(key)
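
The down-module bookkeeping in this hunk (a `name|hostname` key, a `down_time` stamp, a `cleaned` flag) can be sketched in isolation as follows. This is a simplified illustration, not the chassisd logic: it leaves out the `prev_status` check, and the class name `DownModuleTracker`, the injectable `clock`, and `due_for_cleanup` are hypothetical. The 1800-second threshold in the usage example reflects the "like 30 mins" wording in the code comment.

```python
import time


class DownModuleTracker:
    """Tracks when modules went offline, keyed by 'NAME|hostname'."""

    def __init__(self, clock=time.time):
        self.down_modules = {}
        self._clock = clock  # injectable so the logic can be tested

    @staticmethod
    def make_key(name, hostname=""):
        # Example key format from the diff: LINE-CARD0|<hostname>
        return name + "|" + hostname

    def update(self, name, hostname, online):
        """Record one status sample; return 'went_down', 'recovered' or None."""
        key = self.make_key(name, hostname)
        if not online:
            if key not in self.down_modules:
                self.down_modules[key] = {"down_time": self._clock(),
                                          "cleaned": False}
                return "went_down"
            return None
        if key in self.down_modules:
            # Module is operational again: stop tracking its down time.
            del self.down_modules[key]
            return "recovered"
        return None

    def due_for_cleanup(self, threshold_secs):
        """Keys that have been down for at least threshold_secs, not yet cleaned."""
        now = self._clock()
        return [key for key, info in self.down_modules.items()
                if not info["cleaned"]
                and now - info["down_time"] >= threshold_secs]
```

A caller would sample each module periodically with `update(...)` and then act on `due_for_cleanup(1800)`.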

@@ -343,24 +351,25 @@ class ModuleUpdater(logger.Logger):
(CHASSIS_ASIC_ID_IN_MODULE_FIELD, str(asic_id))])
self.asic_table.set(asic_key, asic_fvs)

# In line card push the hostname of the module and num_asics to the chassis state db.
# The hostname is used as key to access chassis app db entries
if not self._is_supervisor():
hostname_key = "{}{}".format(ModuleBase.MODULE_TYPE_LINE, int(self.my_slot) - 1)
hostname = try_get(device_info.get_hostname, default="None")
hostname_fvs = swsscommon.FieldValuePairs([(CHASSIS_MODULE_INFO_SLOT_FIELD, str(self.my_slot)),
(CHASSIS_MODULE_INFO_HOSTNAME_FIELD, hostname),
(CHASSIS_MODULE_INFO_NUM_ASICS_FIELD, str(len(module_info_dict[CHASSIS_MODULE_INFO_ASICS])))])
self.hostname_table.set(hostname_key, hostname_fvs)

# Asics that are on the "not online" modules need to be cleaned up
asics = list(self.asic_table.getKeys())
for asic in asics:
fvs = self.asic_table.get(asic)
if isinstance(fvs, list):
fvs = dict(fvs[-1])
if fvs[CHASSIS_MODULE_INFO_NAME_FIELD] in notOnlineModules:
self.asic_table._del(asic)
if not self.chassis.is_smartswitch():
# In line card push the hostname of the module and num_asics to the chassis state db.
# The hostname is used as key to access chassis app db entries
if not self._is_supervisor():
hostname_key = "{}{}".format(ModuleBase.MODULE_TYPE_LINE, int(self.my_slot) - 1)
hostname = try_get(device_info.get_hostname, default="None")
hostname_fvs = swsscommon.FieldValuePairs([(CHASSIS_MODULE_INFO_SLOT_FIELD, str(self.my_slot)),
(CHASSIS_MODULE_INFO_HOSTNAME_FIELD, hostname),
(CHASSIS_MODULE_INFO_NUM_ASICS_FIELD, str(len(module_info_dict[CHASSIS_MODULE_INFO_ASICS])))])
self.hostname_table.set(hostname_key, hostname_fvs)

# Asics that are on the "not online" modules need to be cleaned up
asics = list(self.asic_table.getKeys())
for asic in asics:
fvs = self.asic_table.get(asic)
if isinstance(fvs, list):
fvs = dict(fvs[-1])
if fvs[CHASSIS_MODULE_INFO_NAME_FIELD] in notOnlineModules:
self.asic_table._del(asic)

def _get_module_info(self, module_index):
"""
@@ -387,6 +396,9 @@ class ModuleUpdater(logger.Logger):
return module_info_dict

def _is_supervisor(self):
if self.chassis.is_smartswitch():
return False

if self.my_slot == self.supervisor_slot:
return True
else:
@@ -424,19 +436,22 @@ class ModuleUpdater(logger.Logger):
index = -1
for module in self.chassis.get_all_modules():
index += 1
# Skip fabric cards
if module.get_type() == ModuleBase.MODULE_TYPE_FABRIC:
continue

if self._is_supervisor():
# On supervisor skip checking for supervisor
if module.get_slot() == self.supervisor_slot:
continue
else:
# On line-card check only supervisor
if module.get_slot() != self.supervisor_slot:
# Skip for SmartSwitch
if not self.chassis.is_smartswitch():
# Skip fabric cards
if module.get_type() == ModuleBase.MODULE_TYPE_FABRIC:
continue

if self._is_supervisor():
# On supervisor skip checking for supervisor
if module.get_slot() == self.supervisor_slot:
continue
else:
# On line-card check only supervisor
if module.get_slot() != self.supervisor_slot:
continue

module_key = try_get(module.get_name, default='MODULE {}'.format(index))
midplane_ip = try_get(module.get_midplane_ip, default=INVALID_IP)
midplane_access = try_get(module.is_midplane_reachable, default=False)
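As I read this hunk, the filter that decides which modules get a midplane reachability check can be written as one pure function. The sketch below is an interpretation of the diff, not the actual code: `should_check_midplane` and its parameters are hypothetical, and the fabric constant is a stand-in whose value follows the mock module base.

```python
MODULE_TYPE_FABRIC = "FABRIC-CARD"  # stand-in; value as in the mock module base


def should_check_midplane(module_type, module_slot, is_smartswitch,
                          my_slot, supervisor_slot):
    """Return True if this module's midplane reachability should be checked."""
    if is_smartswitch:
        # On a SmartSwitch none of the chassis filters apply.
        return True
    if module_type == MODULE_TYPE_FABRIC:
        # Fabric cards are skipped.
        return False
    if my_slot == supervisor_slot:
        # Running on the supervisor: check everything except the supervisor.
        return module_slot != supervisor_slot
    # Running on a line card: check only the supervisor.
    return module_slot == supervisor_slot
```

Keeping the decision in a function like this makes each platform case testable without a chassis object.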
@@ -543,7 +558,7 @@ class ModuleUpdater(logger.Logger):


def module_down_chassis_db_cleanup(self):
if self._is_supervisor() == False:
if self._is_supervisor() == False or self.chassis.is_smartswitch():
return
time_now = time.time()
for module in self.down_modules:
@@ -571,38 +586,42 @@ class ConfigManagerTask(ProcessTaskBase):
self.logger = logger.Logger(SYSLOG_IDENTIFIER)

def task_worker(self):
self.config_updater = ModuleConfigUpdater(SYSLOG_IDENTIFIER, platform_chassis)
config_db = daemon_base.db_connect("CONFIG_DB")
try:
self.config_updater = ModuleConfigUpdater(SYSLOG_IDENTIFIER, platform_chassis)
config_db = daemon_base.db_connect("CONFIG_DB")

# Subscribe to CHASSIS_MODULE table notifications in the Config DB
sel = swsscommon.Select()
sst = swsscommon.SubscriberStateTable(config_db, CHASSIS_CFG_TABLE)
sel.addSelectable(sst)

# Listen indefinitely for changes to the CFG_CHASSIS_MODULE_TABLE table in the Config DB
while True:
# Use timeout to prevent ignoring the signals we want to handle
# in signal_handler() (e.g. SIGTERM for graceful shutdown)
(state, c) = sel.select(SELECT_TIMEOUT)

if state == swsscommon.Select.TIMEOUT:
# Do not flood log when select times out
continue
if state != swsscommon.Select.OBJECT:
self.logger.log_warning("sel.select() did not return swsscommon.Select.OBJECT")
continue

# Subscribe to CHASSIS_MODULE table notifications in the Config DB
sel = swsscommon.Select()
sst = swsscommon.SubscriberStateTable(config_db, CHASSIS_CFG_TABLE)
sel.addSelectable(sst)

# Listen indefinitely for changes to the CFG_CHASSIS_MODULE_TABLE table in the Config DB
while True:
# Use timeout to prevent ignoring the signals we want to handle
# in signal_handler() (e.g. SIGTERM for graceful shutdown)
(state, c) = sel.select(SELECT_TIMEOUT)

if state == swsscommon.Select.TIMEOUT:
# Do not flood log when select times out
continue
if state != swsscommon.Select.OBJECT:
self.logger.log_warning("sel.select() did not return swsscommon.Select.OBJECT")
continue

(key, op, fvp) = sst.pop()

if op == 'SET':
admin_state = MODULE_ADMIN_DOWN
elif op == 'DEL':
admin_state = MODULE_ADMIN_UP
else:
continue
(key, op, fvp) = sst.pop()

if op == 'SET':
admin_state = MODULE_ADMIN_DOWN
elif op == 'DEL':
admin_state = MODULE_ADMIN_UP
else:
continue

self.config_updater.module_config_update(key, admin_state)
self.config_updater.module_config_update(key, admin_state)

except Exception as e:
# Log any exceptions that occur
self.logger.log_error("Exception in task_worker: {}".format(str(e)))
#
# Daemon =======================================================================
#
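
The shape of the reworked `task_worker` (map the table operation to an admin state, apply it, and log any unexpected exception with its text in the message) can be exercised without a database by injecting the event source and the logger. This is a sketch under assumptions: the numeric values of `MODULE_ADMIN_DOWN` and `MODULE_ADMIN_UP` are assumed, and `admin_state_for_op` and `run_worker` are hypothetical names.

```python
MODULE_ADMIN_DOWN = 0  # assumed value, for illustration only
MODULE_ADMIN_UP = 1    # assumed value, for illustration only


def admin_state_for_op(op):
    """A SET on the config table means admin down; a DEL restores admin up."""
    if op == "SET":
        return MODULE_ADMIN_DOWN
    if op == "DEL":
        return MODULE_ADMIN_UP
    return None  # any other operation is ignored


def run_worker(events, apply_update, log_error):
    """Process (key, op) events; log and stop on an unexpected exception."""
    try:
        for key, op in events:
            admin_state = admin_state_for_op(op)
            if admin_state is None:
                continue
            apply_update(key, admin_state)
    except Exception as e:
        # Put the exception text into the message itself rather than
        # passing it as a separate positional argument.
        log_error("Exception in task_worker: {}".format(e))
```

In a test, `events` can be a plain list and `log_error` can be `list.append`, which keeps the loop logic checkable in the unit-test workflow.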
@@ -645,23 +664,24 @@ class ChassisdDaemon(daemon_base.DaemonBase):
sys.exit(CHASSIS_LOAD_ERROR)

# Check for valid slot numbers
my_slot = try_get(platform_chassis.get_my_slot,
default=INVALID_SLOT)
supervisor_slot = try_get(platform_chassis.get_supervisor_slot,
default=INVALID_SLOT)

my_slot = try_get(platform_chassis.get_my_slot, default=INVALID_SLOT)
supervisor_slot = try_get(platform_chassis.get_supervisor_slot, default=INVALID_SLOT)

# Check if module list is populated
self.module_updater = ModuleUpdater(SYSLOG_IDENTIFIER, platform_chassis, my_slot, supervisor_slot)
self.module_updater.modules_num_update()

if not ModuleBase.MODULE_TYPE_DPU:
if ((self.module_updater.my_slot == INVALID_SLOT) or
(self.module_updater.supervisor_slot == INVALID_SLOT)):
self.log_error("Chassisd not supported for this platform")
sys.exit(CHASSIS_NOT_SUPPORTED)

if ((self.module_updater.my_slot == INVALID_SLOT) or
(self.module_updater.supervisor_slot == INVALID_SLOT)):
self.log_error("Chassisd not supported for this platform")
sys.exit(CHASSIS_NOT_SUPPORTED)
smartswitch = self.module_updater.chassis.is_smartswitch()
sup_slot = self.module_updater.supervisor_slot
my_slot = self.module_updater.my_slot

# Start configuration manager task on supervisor module
if self.module_updater.supervisor_slot == self.module_updater.my_slot:
if smartswitch or not smartswitch and sup_slot == my_slot:
config_manager = ConfigManagerTask()
config_manager.task_run()
else:
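
Two conditions in this hunk are worth checking mechanically. First, `smartswitch or not smartswitch and sup_slot == my_slot` parses as `smartswitch or ((not smartswitch) and sup_slot == my_slot)`, which appears to reduce to `smartswitch or sup_slot == my_slot`. Second, `if not ModuleBase.MODULE_TYPE_DPU:` tests a constant, and with the mock value `"DPU"` that guard can never be true. The sketch below verifies both claims by brute force; the function names are hypothetical.

```python
from itertools import product


def starts_config_manager_original(smartswitch, sup_slot, my_slot):
    # Condition as written in the diff.
    return smartswitch or not smartswitch and sup_slot == my_slot


def starts_config_manager_simplified(smartswitch, sup_slot, my_slot):
    # Candidate simplification.
    return smartswitch or sup_slot == my_slot


def conditions_equivalent():
    """Compare both forms over every combination of a small input space."""
    return all(
        starts_config_manager_original(s, a, b)
        == starts_config_manager_simplified(s, a, b)
        for s, a, b in product([True, False], [0, 1], [0, 1])
    )


def dpu_guard_can_fire(module_type_dpu="DPU"):
    """`not <non-empty string>` is always False, so the guarded block is dead."""
    return not module_type_dpu
```

If the intent of the guard is "skip the slot validation on a SmartSwitch", a check on `is_smartswitch()` would express that more directly, though that is a judgment for the author.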
1 change: 1 addition & 0 deletions sonic-chassisd/tests/mock_module_base.py
@@ -6,6 +6,7 @@ class ModuleBase():
MODULE_TYPE_SUPERVISOR = "SUPERVISOR"
MODULE_TYPE_LINE = "LINE-CARD"
MODULE_TYPE_FABRIC = "FABRIC-CARD"
MODULE_TYPE_DPU = "DPU"

# Possible card status for modular chassis
# Module state is Empty if no module is inserted in the slot