Releases: Mellanox/hw-mgmt
V.7.0040.2000
================================================================================
- V.7.0040.2000
- Tue, 1 Oct 2024
-
New features
o Add support for SN2201_M – ES level quality
o Add support for Q3400 – GA level quality
o Add support for SN4280 – GA level quality
o Add support for N5110_LD - GA level quality + TTM/NSO/Ariel-JSO - ES level quality
o Add support for fan_speed_tolerance sysfs attribute (in addition to min/max)
o Add support for SPC-4 SN5400 P2C Forward airflow
o Add support for hw-mgmt package version to debug dump
o Add support for thermal threshold attributes per ASIC (trip_crit / emergency / crit / norm)
-
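As an illustration of how the per-ASIC threshold attributes listed above might be consumed, here is a minimal sketch. The four threshold names come from the note. The file naming pattern "<prefix>_temp_<name>" and the function name are assumptions made for this example, and the directory is passed in by the caller because the note does not state where the attributes live.

```python
from pathlib import Path
from typing import Dict, Optional

# Threshold names as listed in the release note.
THRESHOLD_NAMES = ("trip_crit", "emergency", "crit", "norm")


def read_asic_thresholds(thermal_dir: str, asic_prefix: str = "asic") -> Dict[str, Optional[int]]:
    """Return {threshold_name: integer value or None} for one ASIC.

    ASSUMPTION: each attribute is a text file named
    "<asic_prefix>_temp_<threshold_name>" holding one integer.
    The real hw-mgmt layout may differ.
    """
    values: Dict[str, Optional[int]] = {}
    for name in THRESHOLD_NAMES:
        path = Path(thermal_dir) / f"{asic_prefix}_temp_{name}"
        try:
            values[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Attribute is missing, unreadable, or not an integer.
            values[name] = None
    return values
```

A caller could run read_asic_thresholds(some_dir, "asic1") and treat None entries as "attribute not provided on this system".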
Bug fixes
#4020749: LK 6.1.94: HW Semaphore is not released errors in ISSU Tests
#3726402: (3700V-DC) When fans are pulled out and reinserted at random, multiple air directions (F2B / B2F) are reported and then all fans start to run at full (MAX) speed
#4047305: SN2201 The reboot-cause is incorrect for Power off
#4048801: QM3400 SLT::920-9B31-RX-9M0-NS:wrong limit of MPS voltage
#4082431: Error on 10.7.147.44 (Q3400_RA) during running test: test_thermal_sweep Thermal sweep: asic failed: Expected: 122 Actual: 255
#4093394: 7.0040.1037: Module 'coretemp' loading moved to user-space, and not skipped over SimX
o For detailed patch list: Please view: https://github.com/Mellanox/hw-mgmt/blob/V.7.0040.2000_BR/recipes-kernel/linux/Patch_Status_Table.txt
-
Known issues and limitations:
o Systems such as SN2700 that contain a Delta 460 PSU may report "Error getting sensor data: dps460/#25: Can't read",
which is a temporary inaccessibility of certain alarm attributes read from the PSU.
o Systems may show the message "WARNING kernel: … supply vcc not found, using dummy regulator".
o Systems SN2010, SN2100, SN2410, SN2700 and SN2740 (and their "-B" variants) require the following flags in the kernel cmdline:
"acpi_enforce_resources=lax acpi=noirq".
================================================================================
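The kernel cmdline requirement in the known issues above can be checked on a running system. The sketch below is one way to do it: the two flags are taken from the note, /proc/cmdline is the standard Linux location of the boot command line, and the function names are made up for this example.

```python
from typing import Iterable, List

# Flags quoted in the known-issues note.
REQUIRED_FLAGS = ("acpi_enforce_resources=lax", "acpi=noirq")


def missing_cmdline_flags(cmdline: str, required: Iterable[str] = REQUIRED_FLAGS) -> List[str]:
    """Return the required flags that do not appear as whole tokens in cmdline."""
    present = set(cmdline.split())
    return [flag for flag in required if flag not in present]


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the boot command line of the running kernel (Linux only)."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()
```

On an affected platform, missing_cmdline_flags(read_cmdline()) returning an empty list suggests both flags are set. Matching on whole tokens avoids accepting a different value such as acpi=off.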
V.7.0040.1036
Motivation:
Bug fixes:
#4042338 [LogAnalyzer] - [hw-management (opt_sonic)] '' juliet-140-mgmt2 systemd-modules-load[376]: Failed to insert module 'coretemp': No such device
#3896662 [log_analyzer]| WARNING kernel: [ 10.329953] cdc_subset: probe of 1-2.1:1.0 failed with error -22
#3995269 [log_analyzer]| ERR pmon#sensord: Error getting sensor data: mp2975/#16: Kernel interface error
#3851470 [simx|emulation] | The files /var/run/hw-management/thermal/psu*_fan_dir do not exist
#4004253 [Juliet PO/TTM/NSO] missing comex voltmons
#3979351 [Juliet] system stuck on boot
#3988849 [Juliet] - thermal/fan_amb symlink typo mismatch
Known Issues
o Systems such as SN2700 that contain a Delta 460 PSU may report "Error getting sensor data: dps460/#25: Can't read",
which is a temporary inaccessibility of certain alarm attributes read from the PSU.
V.7.0040.1013
Motivation:
Bug fixes:
#4042294 {5400 / 5600 SPC4 new platform} After installing the latest GA 5.10.0.0037 on the new 5400 and 5600 platforms, all fan units are stuck at MAX RPM
#4050264 SLT:Black mamba:920-9B31-RX-9M0-NS:Thermal warning presented as INFO
Features:
Add support for new SN2201_M - new DNI’s alligator platform
Update SN2201 patch for busbar system
Add PDB board support for new SN2201, SKU:HI168 in hw-mgmt package (By DNI)
Known Issues
Systems such as SN2700 that contain a Delta 460 PSU may report "Error getting sensor data: dps460/#25: Can't read",
which is a temporary inaccessibility of certain alarm attributes read from the PSU.
V.7.0030.4100
================================================================================
- V.7.0030.4100
- Wed, 31 Jul 2024
-
New features
o Update kernel 6.1.38 to 6.1.94
-
Bug fixes
Issue Title
#3649551 SN4700 : [Independent Module] | on r-leopard-41 with IM enabled, there was a thermal overload.
#3878328 SN4700 : Switch rebooted with "Thermal Overload" because ASIC thermal is not available
#3885405 TC: [Thermal Algorithm] | Blacklist malfunctions
#3883147 TC: [Thermal Algorithm] | Errors are counted even when paused by the blacklist
#3879220 SN3420 : Thermal control: increase PWM minimum speed (20%->25%) to work around fan state issue reported by smond
#3895891 SPC1: [systemctl is-system-running] | SPC1 stuck in starting state after config reload - System was not started – lmsensor dependency issue
#3948113 Switch freezes after generating the hw-mgmt dump a few times in a row
#3852236 ARM: Kernel oops symptoms after boot: Unable to handle kernel paging address xxx when BSP Drivers are used
NA msn5400 | msn5600 | sn4280 :TC: fix asic sensor mask in sensor_parameters
NA Fix vpd parser sanity check is done only for 'MLNX' fru types
NA Multi ASIC system: kernel config CONFIG_HOTPLUG_PCI_PCIE (kernel 6.1) is required to be disabled for the sw_reset on multi asic systems
NA TC: missing support of correct PWM calculation for systems with amb_{X} sensor count != 2
NA MSN4700 | MQM9520:Some PSU1 labels are incorrectly marked as PSU2.
NA Missing support for Kconfig per Kernel major version
NA vpd-parser: In case the ONIE "Base MAC Address" field ends with a zero byte, vpd-parser cuts the last byte in the output.
o For detailed patch list: Please view: https://github.com/Mellanox/hw-mgmt/blob/V.7.0030.4000_BR/recipes-kernel/linux/Patch_Status_Table.txt
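The vpd-parser "Base MAC Address" item above does not say how the truncation happened. One common way such a bug arises is treating a fixed-length binary field as a NUL-terminated string. The sketch below contrasts that hypothetical buggy handling with fixed-length formatting. Both function names are invented for this example and neither is taken from vpd-parser.

```python
MAC_LENGTH = 6  # a MAC address is always 6 bytes


def format_mac_truncating(raw: bytes) -> str:
    """Hypothetical buggy variant: strips trailing zero bytes as if raw were a C string."""
    return ":".join(f"{byte:02X}" for byte in raw.rstrip(b"\x00"))


def format_mac(raw: bytes) -> str:
    """Format exactly MAC_LENGTH bytes, keeping any trailing 0x00."""
    if len(raw) != MAC_LENGTH:
        raise ValueError(f"expected {MAC_LENGTH} bytes, got {len(raw)}")
    return ":".join(f"{byte:02X}" for byte in raw)
```

For the bytes 00 02 C9 AA BB 00, the truncating variant yields five octets while format_mac keeps all six.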
-
Known issues and limitations:
o Systems such as SN2700 that contain a Delta 460 PSU may report "Error getting sensor data: dps460/#25: Can't read",
which is a temporary inaccessibility of certain alarm attributes read from the PSU.
o Systems may show the message "WARNING kernel: … supply vcc not found, using dummy regulator".
o Systems SN2010, SN2100, SN2410, SN2700 and SN2740 (and their "-B" variants) require the following flags in the kernel cmdline:
"acpi_enforce_resources=lax acpi=noirq".
================================================================================
V.7.0040.1102
This patch fixes the bit shifting applied to
PMBus register 0xBD. A right shift of 14 bits
makes the value 0, so the check for BIT(13)
always fails. This caused the wrong scale
value of 5 mV to be used (instead of 2.5 mV) for vout2.
Bug #3962952
V.7.0040.1007
hw-mgmt: patches: Correct mp2891 voltage scaling
This patch fixes the bit shifting applied to
PMBus register 0xBD. A right shift of 14 bits
makes the value 0, so the check for BIT(13)
always fails. This caused the wrong scale
value of 5 mV to be used (instead of 2.5 mV) for vout2.
Bug #3962952
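The shift problem described above can be reproduced in a few lines. This is a sketch only: the notes do not show the driver expression, so the buggy form below (mask bit 13, then shift right by 14) is one plausible reading. It is also an assumption, inferred from the description, that bit 13 set selects 2.5 mV and bit 13 clear selects 5 mV. The function names are hypothetical.

```python
BIT13 = 1 << 13


def vout_scale_buggy_mv(reg_word: int) -> float:
    """Buggy check: the masked bit is shifted right by 14, so the result is always 0."""
    if (reg_word & BIT13) >> 14:
        return 2.5  # unreachable for any input
    return 5.0


def vout_scale_fixed_mv(reg_word: int) -> float:
    """Fixed check: test bit 13 directly, without shifting it away."""
    if reg_word & BIT13:
        return 2.5
    return 5.0
```

With a register word that has bit 13 set (for example 0x2000), the buggy version still reports 5.0 while the fixed version reports 2.5.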
V.7.0040.1101
N5110_LD ES V.7.0040.1101
V.7.0040.1100
Juliet ES release tag : V.7.0040.1100
V.7.0040.1006
moved
CONFIG_USB_NET_CDCETHER=m
CONFIG_HOTPLUG_PCI_PCIE=n
for 6.1 amd64 to be under downstream
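To confirm that a kernel configuration carries the two option values listed above, a small checker can help. The sketch below parses config text and reports mismatches. The option names and values come from the note, while the function names are hypothetical and treating an absent option as "n" is an assumption of this example. Generated kernel .config files record disabled options as "# CONFIG_X is not set", so both that form and "CONFIG_X=n" are accepted.

```python
from typing import Dict, Tuple

# Option values quoted in the release note.
EXPECTED_OPTIONS = {
    "CONFIG_USB_NET_CDCETHER": "m",
    "CONFIG_HOTPLUG_PCI_PCIE": "n",
}

NOT_SET_SUFFIX = " is not set"


def parse_kconfig(text: str) -> Dict[str, str]:
    """Parse kernel config text into {option: value}; 'is not set' maps to 'n'."""
    options: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET_SUFFIX):
            options[line[2:-len(NOT_SET_SUFFIX)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def config_mismatches(text: str, expected: Dict[str, str] = EXPECTED_OPTIONS) -> Dict[str, Tuple[str, str]]:
    """Return {option: (expected, actual)} for options whose value differs.

    ASSUMPTION: an option absent from the text is treated as "n".
    """
    options = parse_kconfig(text)
    result: Dict[str, Tuple[str, str]] = {}
    for name, want in expected.items():
        actual = options.get(name, "n")
        if actual != want:
            result[name] = (want, actual)
    return result
```

An empty result from config_mismatches(config_text) means both options match the values in the note.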
V.7.0040.1005
Motivation:
Align kernel support up to 6.1.90
Bug fixes:
RM bug Description
#3981606 SN4280 | Remove reset_from_carrier reset reason
NA SN4280 | Wrong dpu prefix for voltmon temperature sensor attribute dpu_voltmon[1-2]_temp1_input
NA SN4280 | following #3981606 - total number of reset reasons