-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PCI/MSI: Prevent MSI hardware interrupt number truncation #14
Open
jamieNguyenNVIDIA
wants to merge
79
commits into
NVIDIA-BaseOS-6:main
Choose a base branch
from
jamieNguyenNVIDIA:msi-truncation
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
PCI/MSI: Prevent MSI hardware interrupt number truncation #14
jamieNguyenNVIDIA
wants to merge
79
commits into
NVIDIA-BaseOS-6:main
from
jamieNguyenNVIDIA:msi-truncation
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: Ian May <[email protected]>
…dversion" This reverts commit 47d27f2. We need to revert this to avoid regressing any modules used in Jammy. Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
Ignore: yes Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
Ignore: yes Signed-off-by: Ian May <[email protected]>
Ignore: yes Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2038972 Properties: no-test-build Signed-off-by: Ian May <[email protected]>
Ignore: yes Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
… a pasid support BugLink: https://bugs.launchpad.net/bugs/2031320 When an iommu_domain is set to IOMMU_DOMAIN_IDENTITY, the driver would skip the allocation of a CD table and set the CONFIG field of the STE to STRTAB_STE_0_CFG_BYPASS. This works well for devices that only have one substream, i.e. PASID disabled. However, there could be a use case, for a pasid capable device, that allows bypassing the translation at the default substream while still enabling the pasid feature, which means the driver should not skip the allocation of a CD table nor simply bypass the CONFIG field. Instead, the S1DSS field should be set to STRTAB_STE_1_S1DSS_BYPASS and the SHCFG field should be set to STRTAB_STE_1_SHCFG_INCOMING. Add s1dss in struct arm_smmu_s1_cfg, to allow a configuration in the finalise() to support this use case. Also, according to "13.5 Summary of attribute/permission configuration fields" in the reference manual, the SHCFG field value is irrelevant. So, set the SHCFG field of the STE always to STRTAB_STE_1_SHCFG_INCOMING for simplification. Signed-off-by: Nicolin Chen <[email protected]> Reviewed-by: Pritesh Raithatha <[email protected]> Acked-by: Jamie Nguyen <[email protected]> Acked-by: Nicolin Chen <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
Ignore: yes Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2045591 Properties: no-test-build Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
…rnel BugLink: https://bugs.launchpad.net/bugs/2043132 With this change, the NVMe and NVMeOF driver would be enabled to support GPUDirectStorage(GDS). The change is around nvme/nvme rdma map_data() and unmap_data(), where the IO request is first intercepted to check for GDS pages and if it is a GDS page then the request is served by GDS driver component called nvidia-fs, else the request would be served by the standard NVMe driver code Signed-off-by: Sourab Gupta <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Jacob Martin <[email protected]> Acked-by: Ian May <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2043132 With this change, the NFS driver would be enabled to support GPUDirectStorage(GDS). The change is around frwr_map and frwr_unmap in the NFS driver, where the IO request is first intercepted to check for GDS pages and if it is a GDS page then the request is served by GDS driver component called nvidia-fs, else the request would be served by the standard NFS driver code. Signed-off-by: Sourab Gupta <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Jacob Martin <[email protected]> Acked-by: Ian May <[email protected]> Signed-off-by: Brad Figg <[email protected]>
This reverts commit b2638e9. This feature is not targeted for Jammy. Signed-off-by: Brad Figg <[email protected]> Signed-off-by: Ian May <[email protected]>
Signed-off-by: Brad Figg <[email protected]> Signed-off-by: Ian May <[email protected]>
Add support for exposing rprovides data for standalone modules too. Switch to exposing provides as a shared debian/substvar file and use that in the templates. Signed-off-by: Brad Figg <[email protected]> Signed-off-by: Ian May <[email protected]>
Signed-off-by: Brad Figg <[email protected]> Signed-off-by: Ian May <[email protected]>
Signed-off-by: Brad Figg <[email protected]> Signed-off-by: Ian May <[email protected]>
…e_range BugLink: https://bugs.launchpad.net/bugs/2048966 When running an SVA case, the following soft lockup is triggered: -------------------------------------------------------------------- watchdog: BUG: soft lockup - CPU#244 stuck for 26s! pstate: 83400009 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) pc : arm_smmu_cmdq_issue_cmdlist+0x178/0xa50 lr : arm_smmu_cmdq_issue_cmdlist+0x150/0xa50 sp : ffff8000d83ef290 x29: ffff8000d83ef290 x28: 000000003b9aca00 x27: 0000000000000000 x26: ffff8000d83ef3c0 x25: da86c0812194a0e8 x24: 0000000000000000 x23: 0000000000000040 x22: ffff8000d83ef340 x21: ffff0000c63980c0 x20: 0000000000000001 x19: ffff0000c6398080 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: ffff3000b4a3bbb0 x14: ffff3000b4a30888 x13: ffff3000b4a3cf60 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : ffffc08120e4d6bc x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000048cfa x5 : 0000000000000000 x4 : 0000000000000001 x3 : 000000000000000a x2 : 0000000080000000 x1 : 0000000000000000 x0 : 0000000000000001 Call trace: arm_smmu_cmdq_issue_cmdlist+0x178/0xa50 __arm_smmu_tlb_inv_range+0x118/0x254 arm_smmu_tlb_inv_range_asid+0x6c/0x130 arm_smmu_mm_invalidate_range+0xa0/0xa4 __mmu_notifier_invalidate_range_end+0x88/0x120 unmap_vmas+0x194/0x1e0 unmap_region+0xb4/0x144 do_mas_align_munmap+0x290/0x490 do_mas_munmap+0xbc/0x124 __vm_munmap+0xa8/0x19c __arm64_sys_munmap+0x28/0x50 invoke_syscall+0x78/0x11c el0_svc_common.constprop.0+0x58/0x1c0 do_el0_svc+0x34/0x60 el0_svc+0x2c/0xd4 el0t_64_sync_handler+0x114/0x140 el0t_64_sync+0x1a4/0x1a8 -------------------------------------------------------------------- Note that since 6.6-rc1 the arm_smmu_mm_invalidate_range above is renamed to "arm_smmu_mm_arch_invalidate_secondary_tlbs", yet the problem remains. The commit 06ff87b ("arm64: mm: remove unused functions and variable protoypes") fixed a similar lockup on the CPU MMU side. Yet, it can occur to SMMU too, since arm_smmu_mm_arch_invalidate_secondary_tlbs() is called typically next to MMU tlb flush function, e.g. tlb_flush_mmu_tlbonly { tlb_flush { __flush_tlb_range { // check MAX_TLBI_OPS } } mmu_notifier_arch_invalidate_secondary_tlbs { arm_smmu_mm_arch_invalidate_secondary_tlbs { // does not check MAX_TLBI_OPS } } } Clone a CMDQ_MAX_TLBI_OPS from the MAX_TLBI_OPS in tlbflush.h, since in an SVA case SMMU uses the CPU page table, so it makes sense to align with the tlbflush code. Then, replace per-page TLBI commands with a single per-asid TLBI command, if the request size hits this threshold. Signed-off-by: Nicolin Chen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> [backported from commit d5afb4b] [ian-may: change made to function arm_smmu_mm_invalidate_range] Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2048815 TPM devices may insert wait state on last clock cycle of ADDR phase. For SPI controllers that support full-duplex transfers, this can be detected using software by reading the MISO line. For SPI controllers that only support half-duplex transfers, such as the Tegra QSPI, it is not possible to detect the wait signal from software. The QSPI controller in Tegra234 and Tegra241 implement hardware detection of the wait signal which can be enabled in the controller for TPM devices. The current TPM TIS driver only supports software detection of the wait signal. To support SPI controllers that use hardware to detect the wait signal, add the function tpm_tis_spi_transfer_half() and move the existing code for software based detection into a function called tpm_tis_spi_transfer_full(). SPI controllers that only support half-duplex transfers will always call tpm_tis_spi_transfer_half() because they cannot support software based detection. The bit SPI_TPM_HW_FLOW is set to indicate to the SPI controller that hardware detection is required and it is the responsibility of the SPI controller driver to determine if this is supported or not. For hardware flow control, CMD-ADDR-DATA messages are combined into a single message where as for software flow control exiting method of CMD-ADDR in a message and DATA in another is followed. [jarkko: Fixed the function names to match the code change, and the tag in the short summary.] Signed-off-by: Krishna Yarlagadda <[email protected]> Reviewed-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Jarkko Sakkinen <[email protected]> (cherry picked from commit a86a42a) Acked-by: Jamie Nguyen <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Jacob Martin <[email protected]> Acked-by: Ian May <[email protected]> Signed-off-by: Ian May <[email protected]>
Ignore: yes Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2047983 Properties: no-test-build Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2049537 Add support of "Thermal fast Sampling Period (_TFP)" for passive cooling. As per the ACPI specification (ACPI 6.5, Section 11.4.17 "_TFP (Thermal fast Sampling Period)", _TFP overrides _TSP ("Thermal Sampling Period" if both are present in a Thermal zone. Signed-off-by: Jeff Brasen <[email protected]> Co-developed-by: Sumit Gupta <[email protected]> Signed-off-by: Sumit Gupta <[email protected]> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Jamie Nguyen <[email protected]> Acked-by: Sumit Gupta <[email protected]> (backported from commit a2ee758 linux-next) Acked-by: Brad Figg <[email protected]> Acked-by: Jacob Martin <[email protected]> Acked-by: Ian May <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Ignore: yes Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2055581 Properties: no-test-build Signed-off-by: Ian May <[email protected]>
…s (main/2024.03.04) BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
Ignore: yes Signed-off-by: Ian May <[email protected]>
…ovides When nvidia-fs-dkms is available as a dkms package, we want to default to using the signed modules if possible. Adding a version number for the nvidia-fs modules package enables the inbox modules to be selected over an equivalent dkms version. Ignore: yes Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
Ignore: yes Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2059703 Properties: no-test-build Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2061930 There are systems in production that don't have firmware that supports coresight_etm4x. Instead of removing completely, blacklist coresight_etm4x so systems with the correct firmware can use the module. Signed-off-by: Ian May <[email protected]>
Ignore: yes Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2063373 Properties: no-test-build Signed-off-by: Ian May <[email protected]>
…rnel-versions (main/s2024.03.04) BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
Ignore: yes Signed-off-by: Jacob Martin <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2063578 Properties: no-test-build Signed-off-by: Jacob Martin <[email protected]>
Signed-off-by: Jacob Martin <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2063461 N2 r0p3 doesn't require the workaround [1], so gating on (#slots - 5) no longer works because all N2s have 5 slots. Use the new expression builtin that allows calling strcmp_cpuid_str() and comparing CPUIDs in metric formulas. In this case, the commented formula looks like this: strcmp_cpuid_str(0x410fd493) # greater than or equal to N2 r0p3 | strcmp_cpuid_str(0x410fd490) ^ 1 # OR NOT any version of N2 [1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/blob/main/data/pmu/cpu/neoverse/neoverse-n2-r0p3.json Signed-off-by: James Clark <[email protected]> Reviewed-by: John Garry <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Forrington <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Sohom Datta <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> (cherry picked from commit d43f549) Signed-off-by: Brad Figg <[email protected]> Acked-by: Jacob Martin <[email protected]> Acked-by: Noah Wager <[email protected]>
…rm telemetry repo BugLink: https://bugs.launchpad.net/bugs/2063461 Apart from some slight naming and grouping differences, the new metrics are functionally the same as the existing ones. Any missing metrics were manually appended to the end of the auto generated file. For the events, the new data includes descriptions that may have product specific details and new groupings that will be consistent with other products. After generating the metrics from the telemetry repo [1], the following manual steps were performed: * Change the topdown expressions to compare on CPUID and use #slots so that the same data can be shared between N2 and V2. Apart from these modifications, the expressions now match more closely with the Arm telemetry data which will hopefully make future updates easier. * Append some metrics from the old N2/V2 data that aren't present in the telemetry data. These will possibly be added to the telemetry-solution repo at a later time: l3d_cache_mpki, l3d_cache_miss_rate, branch_pki, ipc_rate, spec_ipc, retired_rate, wasted_rate, branch_immed_spec_rate, branch_return_spec_rate, branch_indirect_spec_rate [1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/blob/main/data/pmu/cpu/neoverse/neoverse-n2.json Signed-off-by: James Clark <[email protected]> Reviewed-by: John Garry <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Forrington <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Sohom Datta <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> (cherry picked from commit 4473949) Signed-off-by: Brad Figg <[email protected]> Acked-by: Jacob Martin <[email protected]> Acked-by: Noah Wager <[email protected]>
…ion before access."" BugLink: https://bugs.launchpad.net/bugs/2064549 This reverts commit 9cfa409. This effectively restores the functionality added by: b2b56a1 ("gpio: tegra186: Check GPIO pin permission before access." We restore this and, now that the patch has been upstreamed, will apply the proper fix atop. Signed-off-by: Jamie Nguyen <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2064549 The controller has several register bits describing access control information for a given GPIO pin. When SCR_SEC_[R|W]EN is unset, it means we have full read/write access to all the registers for given GPIO pin. When SCR_SEC[R|W]EN is set, it means we need to further check the accompanying SCR_SEC_G1[R|W] bit to determine read/write access to all the registers for given GPIO pin. This check was previously declaring that a GPIO pin was accessible only if either of the following conditions were met: - SCR_SEC_REN + SCR_SEC_WEN both set or - SCR_SEC_REN + SCR_SEC_WEN both set and SCR_SEC_G1R + SCR_SEC_G1W both set Update the check to properly handle cases where only one of SCR_SEC_REN or SCR_SEC_WEN is set. Fixes: b2b56a1 ("gpio: tegra186: Check GPIO pin permission before access.") Signed-off-by: Prathamesh Shete <[email protected]> Acked-by: Thierry Reding <[email protected]> (cherry-picked from commit d806f474a9a7993648a2c70642ee129316d8deff linux-next) Signed-off-by: Jamie Nguyen <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
While calculating the hardware interrupt number for a MSI interrupt, the higher bits (i.e. from bit-5 onwards a.k.a domain_nr >= 32) of the PCI domain number gets truncated because of the shifted value casting to return type of pci_domain_nr() which is 'int'. This for example is resulting in same hardware interrupt number for devices 0019:00:00.0 and 0039:00:00.0. To address this cast the PCI domain number to 'irq_hw_number_t' before left shifting it to calculate the hardware interrupt number. Please note that this fixes the issue only on 64-bit systems and doesn't change the behavior for 32-bit systems i.e. the 32-bit systems continue to have the issue. Since the issue surfaces only if there are too many PCIe controllers in the system which usually is the case in modern server systems and they don't tend to run 32-bit kernels. Fixes: 3878eae ("PCI/MSI: Enhance core to support hierarchy irqdomain") Signed-off-by: Vidya Sagar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Shanker Donthineni <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] (cherry-picked from commit db744dd) Signed-off-by: Jamie Nguyen <[email protected]>
Testing done: 6.5.0-1019:
6.5.0-1019 with patch applied:
|
nvidia-bfigg
force-pushed
the
main
branch
2 times, most recently
from
May 31, 2024 18:56
11b7036
to
0e932dd
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
While calculating the hardware interrupt number for a MSI interrupt, the higher bits (i.e. from bit-5 onwards a.k.a domain_nr >= 32) of the PCI domain number gets truncated because of the shifted value casting to return type of pci_domain_nr() which is 'int'. This for example is resulting in same hardware interrupt number for devices 0019:00:00.0 and 0039:00:00.0.
To address this cast the PCI domain number to 'irq_hw_number_t' before left shifting it to calculate the hardware interrupt number.
Please note that this fixes the issue only on 64-bit systems and doesn't change the behavior for 32-bit systems i.e. the 32-bit systems continue to have the issue. Since the issue surfaces only if there are too many PCIe controllers in the system which usually is the case in modern server systems and they don't tend to run 32-bit kernels.
Fixes: 3878eae ("PCI/MSI: Enhance core to support hierarchy irqdomain")
Tested-by: Shanker Donthineni [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected] (cherry-picked from commit db744dd)