[PATCH v3 0/6] Fix mlx5 write combining support on new ARM64 cores #17

jamieNguyenNVIDIA · 2024-07-01T19:37:10Z

This cherry-picks the following series: https://lore.kernel.org/all/[email protected]/

It contains the following six commits:

IB/mlx5: Use __iowrite64_copy() for write combining stores
net: hns3: Remove io_stop_wc() calls after __iowrite64_copy()
arm64/io: Provide a WC friendly __iowriteXX_copy()
s390: Stop using weak symbols for __iowrite64_copy()
s390: Implement __iowrite32_copy()
x86: Stop using weak symbols for __iowrite32_copy()

Signed-off-by: Ian May <[email protected]>

…dversion" This reverts commit 47d27f2. We need to revert this to avoid regressing any modules used in Jammy. Signed-off-by: Ian May <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>

Ignore: yes Signed-off-by: Ian May <[email protected]>

Signed-off-by: Ian May <[email protected]>

Ignore: yes Signed-off-by: Ian May <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2038972 Properties: no-test-build Signed-off-by: Ian May <[email protected]>

Ignore: yes Signed-off-by: Ian May <[email protected]>

Signed-off-by: Ian May <[email protected]>

Ignore: yes Signed-off-by: Paolo Pisati <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2046137 Properties: no-test-build Signed-off-by: Paolo Pisati <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Paolo Pisati <[email protected]>

Ignore: yes Signed-off-by: Andrea Righi <[email protected]>

Signed-off-by: Paolo Pisati <[email protected]>

Ignore: yes Signed-off-by: Andrea Righi <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2055128 Properties: no-test-build Signed-off-by: Andrea Righi <[email protected]>

…ain/d2024.02.07) BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Andrea Righi <[email protected]>

Signed-off-by: Andrea Righi <[email protected]>

Ignore: yes Signed-off-by: Andrea Righi <[email protected]>

Signed-off-by: Andrea Righi <[email protected]>

Signed-off-by: Jacob Martin <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2069777 Use u8, u32 and u64 instead of respective uintN_t types. Remove unnecessary newlines for function argument lists. Signed-off-by: Shravan Kumar Ramani <[email protected]> Link: https://lore.kernel.org/r/39be055af3506ce6f843d11e45d71620f2a96e26.1707808180.git.shravankr@nvidia.com Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> (cherry picked from commit fd23023) Signed-off-by: David Thompson <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2069777 Use unsigned integer types for register values and array indices. Use %u instead of %d accordingly. Signed-off-by: Shravan Kumar Ramani <[email protected]> Link: https://lore.kernel.org/r/d8548c70339a29258a906b2b518e5c48f669795c.1707808180.git.shravankr@nvidia.com Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> (cherry picked from commit 1ae9ffd) Signed-off-by: David Thompson <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>

…ptional BugLink: https://bugs.launchpad.net/bugs/2069777 The mlxbf_pmc_event_list() function returns a pointer to an array of supported events and the array size. The array size is returned via a pointer passed as an argument, which is mandatory. However, we want to be able to use mlxbf_pmc_event_list() just to check if a block name is implemented/supported. For this usage passing the size argument is not necessary so let's make it optional. Signed-off-by: Luiz Capitulino <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/182de8ec6b9c33152f2ba6b248c35b0311abf5e4.1708635408.git.luizcap@redhat.com Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> (cherry picked from commit 0d46439) Signed-off-by: David Thompson <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2069777 Currently, the driver has two behaviors to deal with new & unsupported performance blocks reported by the firmware: 1. For register and unknown block types, the driver will fail to load with the following error message: [ 4510.956369] mlxbf-pmc: probe of MLNXBFD2:00 failed with error -22 2. For counter and crspace blocks, the driver will load and sysfs files will be created but getting the contents of event_list or trying to setup the counter will fail Instead, let's ignore and log unsupported blocks. This means the driver will always load and unsupported blocks will never show up in sysfs. Signed-off-by: Luiz Capitulino <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/f8e2e6210b43e825b69824b420c801cd513d401d.1708635408.git.luizcap@redhat.com Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> (cherry picked from commit c0459ee) Signed-off-by: David Thompson <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2069777 These need to be signed for the error handling to work. The mlxbf_pmc_get_event_num() function returns int so int type is correct. Fixes: 1ae9ffd ("platform/mellanox: mlxbf-pmc: Cleanup signed/unsigned mix-up") Signed-off-by: Dan Carpenter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> (cherry picked from commit 7c8772f) Signed-off-by: David Thompson <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>

Start switching iomap_copy routines over to use #define and arch provided inline/macro functions instead of weak symbols. Inline functions allow more compiler optimization and this is often a driver hot path. x86 has the only weak implementation for __iowrite32_copy(), so replace it with a static inline containing the same single instruction inline assembly. The compiler will generate the "mov edx,ecx" in a more optimal way. Remove iomap_copy_64.S Link: https://lore.kernel.org/r/[email protected] Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> (cherry picked from commit 20516d6) Signed-off-by: Jamie Nguyen <[email protected]>

It is trivial to implement an inline to do this, so provide it in the s390 headers. Like the 64 bit version it should just invoke zpci_memcpy_toio() with the correct size. Link: https://lore.kernel.org/r/[email protected] Acked-by: Niklas Schnelle <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> (cherry picked from commit 6ae798c) Signed-off-by: Jamie Nguyen <[email protected]>

Complete switching the __iowriteXX_copy() routines over to use #define and arch provided inline/macro functions instead of weak symbols. S390 has an implementation that simply calls another memcpy function. Inline this so the callers don't have to do two jumps. Link: https://lore.kernel.org/r/[email protected] Acked-by: Niklas Schnelle <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> (cherry picked from commit e7bc47b) Signed-off-by: Jamie Nguyen <[email protected]>

The kernel provides driver support for using write combining IO memory through the __iowriteXX_copy() API which is commonly used as an optional optimization to generate 16/32/64 byte MemWr TLPs in a PCIe environment. iomap_copy.c provides a generic implementation as a simple 4/8 byte at a time copy loop that has worked well with past ARM64 CPUs, giving a high frequency of large TLPs being successfully formed. However modern ARM64 CPUs are quite sensitive to how the write combining CPU HW is operated and a compiler generated loop with intermixed load/store is not sufficient to frequently generate a large TLP. The CPUs would like to see the entire TLP generated by consecutive store instructions from registers. Compilers like gcc tend to intermix loads and stores and have poor code generation, in part, due to the ARM64 situation that writeq() does not codegen anything other than "[xN]". However even with that resolved compilers like clang still do not have good code generation. This means on modern ARM64 CPUs the rate at which __iowriteXX_copy() successfully generates large TLPs is very small (less than 1 in 10,000) tries), to the point that the use of WC is pointless. Implement __iowrite32/64_copy() specifically for ARM64 and use inline assembly to build consecutive blocks of STR instructions. Provide direct support for 64/32/16 large TLP generation in this manner. Optimize for common constant lengths so that the compiler can directly inline the store blocks. This brings the frequency of large TLP generation up to a high level that is comparable with older CPU generations. As the __iowriteXX_copy() family of APIs is intended for use with WC incorporate the DGH hint directly into the function. Link: https://lore.kernel.org/r/[email protected] Cc: Arnd Bergmann <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mark Rutland <[email protected]> Cc: [email protected] Cc: [email protected] Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> (cherry picked from commit ead7911) Signed-off-by: Jamie Nguyen <[email protected]>

Now that the ARM64 arch implementation does the DGH as part of __iowrite64_copy() there is no reason to open code this in drivers. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jijie Shao<[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> (cherry picked from commit 2b7a5e1) Signed-off-by: Jamie Nguyen <[email protected]>

mlx5 has a built in self-test at driver startup to evaluate if the platform supports write combining to generate a 64 byte PCIe TLP or not. This has proven necessary because a lot of common scenarios end up with broken write combining (especially inside virtual machines) and there is other way to learn this information. This self test has been consistently failing on new ARM64 CPU designs (specifically with NVIDIA Grace's implementation of Neoverse V2). The C loop around writeq() generates some pretty terrible ARM64 assembly, but historically this has worked on a lot of existing ARM64 CPUs till now. We see it succeed about 1 time in 10,000 on the worst effected systems. The CPU architects speculate that the load instructions interspersed with the stores makes the WC buffers statistically flush too often and thus the generation of large TLPs becomes infrequent. This makes the boot up test unreliable in that it indicates no write-combining, however userspace would be fine since it uses a ST4 instruction. Further, S390 has similar issues where only the special zpci_memcpy_toio() will actually generate large TLPs, and the open coded loop does not trigger it at all. Fix both ARM64 and S390 by switching to __iowrite64_copy() which now provides architecture specific variants that have a high change of generating a large TLP with write combining. x86 continues to use a similar writeq loop in the generate __iowrite64_copy(). Fixes: 11f552e ("IB/mlx5: Test write combining support") Link: https://lore.kernel.org/r/[email protected] Tested-by: Niklas Schnelle <[email protected]> Acked-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> (cherry picked from commit ef30228) Signed-off-by: Jamie Nguyen <[email protected]>

jamieNguyenNVIDIA · 2024-07-01T19:40:28Z

Testing done:

Enable dynamic debug on the mlx5_ib module

cat <<EOF | sudo tee /etc/modprobe.d/mlx5_ib.conf
options mlx5_ib dyndbg
EOF

Repeatedly unload + load mlx5_ib and observe that Write-Combining is always supported

sudo rmmod mlx5_ib; sudo modprobe mlx5_ib; sudo dmesg | grep Write-Combining | tail -4
[  800.218550] infiniband mlx5_5: mlx5_ib_enable_driver:3844:(pid 32873): Write-Combining supported
[  801.762868] infiniband mlx5_6: mlx5_ib_enable_driver:3844:(pid 32873): Write-Combining supported
[  803.310380] infiniband mlx5_7: mlx5_ib_enable_driver:3844:(pid 32873): Write-Combining supported
[  804.853160] infiniband mlx5_8: mlx5_ib_enable_driver:3844:(pid 32873): Write-Combining supported
[  806.393813] infiniband mlx5_9: mlx5_ib_enable_driver:3844:(pid 32873): Write-Combining supported

Without this series applied, this same tested would oftentimes report that Write-Combining was not supported:

sudo rmmod mlx5_ib; sudo modprobe mlx5_ib; sudo dmesg | grep Write-Combining | tail -4
[  205.892380] infiniband mlx5_0: mlx5_ib_enable_driver:3820:(pid 2991): Write-Combining supported
[  205.990040] infiniband mlx5_1: mlx5_ib_enable_driver:3820:(pid 2991): Write-Combining not supported
[  206.028820] infiniband mlx5_2: mlx5_ib_enable_driver:3820:(pid 2991): Write-Combining not supported
[  206.127483] infiniband mlx5_3: mlx5_ib_enable_driver:3820:(pid 2991): Write-Combining not supported

BugLink: https://bugs.launchpad.net/bugs/2073603 [ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 #6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 #7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 #8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 #9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 #10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 #11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 #12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 #13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 #14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 #15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 #16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 #17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 #18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 #19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 #20 0x55c9400e53ad in main tools/perf/perf.c:561 #21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 #23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Portia Stephens <[email protected]> Signed-off-by: Roxana Nicolescu <[email protected]>

A sysfs reader can race with a device reset or removal, attempting to read device state when the device is not actually present. eg: [exception RIP: qed_get_current_link+17] #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede] #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3 #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4 #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300 #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3 #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1 #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb crash> struct net_device.state ffff9a9d21336000 state = 5, state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100). The device is not present, note lack of __LINK_STATE_PRESENT (0b10). This is the same sort of panic as observed in commit 4224cfd ("net-sysfs: add check for netdevice being present to speed_show"). There are many other callers of __ethtool_get_link_ksettings() which don't have a device presence check. Move this check into ethtool to protect all callers. Fixes: d519e17 ("net: export device speed and duplex via sysfs") Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show") Signed-off-by: Jamie Bainbridge <[email protected]> Link: https://patch.msgid.link/8bae218864beaa44ed01628140475b9bf641c5b0.1724393671.git.jamie.bainbridge@gmail.com Signed-off-by: Jakub Kicinski <[email protected]>

iter_finish_branch_entry() doesn't put the branch_info from/to map elements creating memory leaks. This can be seen with: ``` $ perf record -e cycles -b perf test -w noploop $ perf report -D ... Direct leak of 984344 byte(s) in 123043 object(s) allocated from: #0 0x7fb2654f3bd7 in malloc libsanitizer/asan/asan_malloc_linux.cpp:69 #1 0x564d3400d10b in map__get util/map.h:186 #2 0x564d3400d10b in ip__resolve_ams util/machine.c:1981 #3 0x564d34014d81 in sample__resolve_bstack util/machine.c:2151 #4 0x564d34094790 in iter_prepare_branch_entry util/hist.c:898 #5 0x564d34098fa4 in hist_entry_iter__add util/hist.c:1238 #6 0x564d33d1f0c7 in process_sample_event tools/perf/builtin-report.c:334 #7 0x564d34031eb7 in perf_session__deliver_event util/session.c:1655 #8 0x564d3403ba52 in do_flush util/ordered-events.c:245 #9 0x564d3403ba52 in __ordered_events__flush util/ordered-events.c:324 #10 0x564d3402d32e in perf_session__process_user_event util/session.c:1708 #11 0x564d34032480 in perf_session__process_event util/session.c:1877 #12 0x564d340336ad in reader__read_event util/session.c:2399 #13 0x564d34033fdc in reader__process_events util/session.c:2448 #14 0x564d34033fdc in __perf_session__process_events util/session.c:2495 #15 0x564d34033fdc in perf_session__process_events util/session.c:2661 #16 0x564d33d27113 in __cmd_report tools/perf/builtin-report.c:1065 #17 0x564d33d27113 in cmd_report tools/perf/builtin-report.c:1805 #18 0x564d33e0ccb7 in run_builtin tools/perf/perf.c:350 #19 0x564d33e0d45e in handle_internal_command tools/perf/perf.c:403 #20 0x564d33cdd827 in run_argv tools/perf/perf.c:447 #21 0x564d33cdd827 in main tools/perf/perf.c:561 ... ``` Clearing up the map_symbols properly creates maps reference count issues so resolve those. Resolving this issue doesn't improve peak heap consumption for the test above. Committer testing: $ sudo dnf install libasan $ make -k CORESIGHT=1 EXTRA_CFLAGS="-fsanitize=address" CC=clang O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Yanteng Si <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

The fields in the hist_entry are filled on-demand which means they only have meaningful values when relevant sort keys are used. So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in the hist entry can be garbage. So it shouldn't access it unconditionally. I got a segfault, when I wanted to see cgroup profiles. $ sudo perf record -a --all-cgroups --synth=cgroup true $ sudo perf report -s cgroup Program received signal SIGSEGV, Segmentation fault. 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 48 return RC_CHK_ACCESS(map)->dso; (gdb) bt #0 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 #1 0x00005555557aa39b in map__load (map=0x0) at util/map.c:344 #2 0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385 #3 0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true) at util/hist.c:644 #4 0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761 #5 0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779 #6 0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015 #7 0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0) at util/hist.c:1260 #8 0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at builtin-report.c:334 #9 0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232 #10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128, sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271 #11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354 #12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132 #13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245 #14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324 #15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342 #16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60) at util/session.c:780 #17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688, file_path=0x555556038ff0 "perf.data") at util/session.c:1406 As you can see the entry->ms.map was NULL even if he->ms.map has a value. This is because 'sym' sort key is not given, so it cannot assume whether he->ms.sym and entry->ms.sym is the same. I only checked the 'sym' sort key here as it implies 'dso' behavior (so maps are the same). Fixes: ac01c8c ("perf hist: Update hist symbol when updating maps") Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

During noirq suspend phase the Raspberry Pi power driver suffer of firmware property timeouts. The reason is that the IRQ of the underlying BCM2835 mailbox is disabled and rpi_firmware_property_list() will always run into a timeout [1]. Since the VideoCore side isn't consider as a wakeup source, set the IRQF_NO_SUSPEND flag for the mailbox IRQ in order to keep it enabled during suspend-resume cycle. [1] PM: late suspend of devices complete after 1.754 msecs WARNING: CPU: 0 PID: 438 at drivers/firmware/raspberrypi.c:128 rpi_firmware_property_list+0x204/0x22c Firmware transaction 0x00028001 timeout Modules linked in: CPU: 0 PID: 438 Comm: bash Tainted: G C 6.9.3-dirty #17 Hardware name: BCM2835 Call trace: unwind_backtrace from show_stack+0x18/0x1c show_stack from dump_stack_lvl+0x34/0x44 dump_stack_lvl from __warn+0x88/0xec __warn from warn_slowpath_fmt+0x7c/0xb0 warn_slowpath_fmt from rpi_firmware_property_list+0x204/0x22c rpi_firmware_property_list from rpi_firmware_property+0x68/0x8c rpi_firmware_property from rpi_firmware_set_power+0x54/0xc0 rpi_firmware_set_power from _genpd_power_off+0xe4/0x148 _genpd_power_off from genpd_sync_power_off+0x7c/0x11c genpd_sync_power_off from genpd_finish_suspend+0xcc/0xe0 genpd_finish_suspend from dpm_run_callback+0x78/0xd0 dpm_run_callback from device_suspend_noirq+0xc0/0x238 device_suspend_noirq from dpm_suspend_noirq+0xb0/0x168 dpm_suspend_noirq from suspend_devices_and_enter+0x1b8/0x5ac suspend_devices_and_enter from pm_suspend+0x254/0x2e4 pm_suspend from state_store+0xa8/0xd4 state_store from kernfs_fop_write_iter+0x154/0x1a0 kernfs_fop_write_iter from vfs_write+0x12c/0x184 vfs_write from ksys_write+0x78/0xc0 ksys_write from ret_fast_syscall+0x0/0x54 Exception stack(0xcc93dfa8 to 0xcc93dff0) [...] PM: noirq suspend of devices complete after 3095.584 msecs Link: raspberrypi/firmware#1894 Fixes: 0bae6af ("mailbox: Enable BCM2835 mailbox support") Signed-off-by: Stefan Wahren <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jassi Brar <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2076435 commit be346c1 upstream. The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents. Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem. To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written(). Heming Zhao said: ------ PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error" PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA" #0 machine_kexec at ffffffff8c069932 #1 __crash_kexec at ffffffff8c1338fa #2 panic at ffffffff8c1d69b9 #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2] #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2] #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2] #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2] #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2] #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2] #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2] #10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2] #11 dio_complete at ffffffff8c2b9fa7 #12 do_blockdev_direct_IO at ffffffff8c2bc09f #13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2] #14 generic_file_direct_write at ffffffff8c1dcf14 #15 __generic_file_write_iter at ffffffff8c1dd07b #16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2] #17 aio_write at ffffffff8c2cc72e #18 kmem_cache_alloc at ffffffff8c248dde #19 do_io_submit at ffffffff8c2ccada #20 do_syscall_64 at ffffffff8c004984 #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io") Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Reviewed-by: Heming Zhao <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Portia Stephens <[email protected]> Signed-off-by: Roxana Nicolescu <[email protected]>

ianmay81 and others added 30 commits June 21, 2024 14:00

UBUNTU: [Packaging] Initialize linux-nvidia-6.5

ba7bdd4

Signed-off-by: Ian May <[email protected]>

Revert "UBUNTU: SAUCE: modpost: support arbitrary symbol length in mo…

5a29106

…dversion" This reverts commit 47d27f2. We need to revert this to avoid regressing any modules used in Jammy. Signed-off-by: Ian May <[email protected]>

UBUNTU: [Packaging] update variants

ebb2a41

BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>

UBUNTU: [Packaging] update Ubuntu.md

538d6dc

BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>

UBUNTU: Start new release

fc0d870

Ignore: yes Signed-off-by: Ian May <[email protected]>

UBUNTU: [Config] nvidia-6.5: update annotations

0f07fff

Signed-off-by: Ian May <[email protected]>

UBUNTU: Ubuntu-nvidia-6.5-6.5.0-1001.1

a534454

Signed-off-by: Ian May <[email protected]>

UBUNTU: [Packaging] nvidia-6.5: disable rust support

ed5488d

Ignore: yes Signed-off-by: Ian May <[email protected]>

UBUNTU: Start new release

190a26f

Ignore: yes Signed-off-by: Ian May <[email protected]>

UBUNTU: link-to-tracker: update tracking bug

a3896c6

BugLink: https://bugs.launchpad.net/bugs/2038972 Properties: no-test-build Signed-off-by: Ian May <[email protected]>

UBUNTU: [Config] nvidia-6.5: update annotations

c874b8a

Ignore: yes Signed-off-by: Ian May <[email protected]>

UBUNTU: Ubuntu-nvidia-6.5-6.5.0-1004.4

5bed993

Signed-off-by: Ian May <[email protected]>

UBUNTU: Start new release

7b797c9

Ignore: yes Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: rename debian.nvidia-6.6 to debian.nvidia

48fee65

UBUNTU: link-to-tracker: update tracking bug

1d05f82

BugLink: https://bugs.launchpad.net/bugs/2046137 Properties: no-test-build Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: [Packaging] update variants

7af2d65

BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: [Packaging] update update.conf

9ffb14d

BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: [Packaging] move to gcc-13 by default

9f438f8

Ignore: yes Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: rebase on Ubuntu-6.6.0-14.14

d094896

Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: [Config] updateconfigs following Ubuntu-6.6.0-14.14 rebase

c6a8556

Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: Ubuntu-nvidia-6.6.0-1001.1

74793c4

Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: [Packaging] move to linux 6.8

a5b9bc1

Ignore: yes Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: update dropped.txt

390cfa0

Ignore: yes Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: Start new release

7511d08

Ignore: yes Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: link-to-tracker: update tracking bug

9a6f896

BugLink: https://bugs.launchpad.net/bugs/2055128 Properties: no-test-build Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: debian.nvidia/dkms-versions -- update from kernel-versions (m…

6b47b2e

…ain/d2024.02.07) BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: [Packaging] add Rust build dependencies

4700a2d

Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: [Config] update annotations after rebase to v6.8

b2c3dd0

Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: [Packaging] clean ABI check files

cc5c1aa

Ignore: yes Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: Ubuntu-nvidia-6.8.0-1001.1

d983385

Signed-off-by: Andrea Righi <[email protected]>

jacobmartin0 and others added 12 commits June 21, 2024 14:23

UBUNTU: Ubuntu-nvidia-6.8.0-1009.9

1a43e9c

Signed-off-by: Jacob Martin <[email protected]>

nvidia-bfigg force-pushed the 24.04_linux-nvidia branch 2 times, most recently from 085a4b3 to 47898a1 Compare July 18, 2024 15:01

nvidia-bfigg force-pushed the 24.04_linux-nvidia branch from 29686e1 to 3200be9 Compare August 10, 2024 15:00

nvidia-bfigg force-pushed the 24.04_linux-nvidia branch from 3200be9 to 6ad2527 Compare August 17, 2024 15:00

nvidia-bfigg force-pushed the 24.04_linux-nvidia branch from 6ad2527 to 9b809a2 Compare August 21, 2024 15:00

nvidia-bfigg force-pushed the 24.04_linux-nvidia branch from 9b809a2 to 7d88485 Compare September 3, 2024 15:01

nvidia-bfigg force-pushed the 24.04_linux-nvidia branch 2 times, most recently from 5325aaa to d1895f3 Compare October 7, 2024 15:00

nvidia-bfigg force-pushed the 24.04_linux-nvidia branch from d1895f3 to b0724a3 Compare October 8, 2024 15:00

nvidia-bfigg force-pushed the 24.04_linux-nvidia branch from 296a1e0 to e688282 Compare October 18, 2024 15:00

jamieNguyenNVIDIA closed this by deleting the head repository Nov 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[PATCH v3 0/6] Fix mlx5 write combining support on new ARM64 cores #17

[PATCH v3 0/6] Fix mlx5 write combining support on new ARM64 cores #17

jamieNguyenNVIDIA commented Jul 1, 2024

jamieNguyenNVIDIA commented Jul 1, 2024

[PATCH v3 0/6] Fix mlx5 write combining support on new ARM64 cores #17

[PATCH v3 0/6] Fix mlx5 write combining support on new ARM64 cores #17

Conversation

jamieNguyenNVIDIA commented Jul 1, 2024

jamieNguyenNVIDIA commented Jul 1, 2024