
VMA_INTERNAL_THREAD_AFFINITY doesn't run the vma internal thread on the specified core #1080

Open
1 of 2 tasks
boranby opened this issue Jul 10, 2024 · 5 comments

Comments

@boranby

boranby commented Jul 10, 2024

Subject

Running with: LD_PRELOAD=libvma.so VMA_SPEC=latency VMA_INTERNAL_THREAD_AFFINITY=2 ./app
However, the VMA internal thread runs on the same core as the application. I also tried setting the affinity with a bitmask, and that didn't work either.
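
To make the two notations mentioned above concrete, here is a minimal sketch of how a core list maps to a bitmask. It assumes the taskset-style convention in which bit N selects core N. The helper names `cores_to_mask` and `mask_to_cores` are hypothetical and are not part of VMA.

```python
def cores_to_mask(cores):
    """Return the taskset-style hex bitmask that selects the given core ids."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError(f"invalid core id: {core}")
        mask |= 1 << core  # assumption: bit N selects core N
    return hex(mask)


def mask_to_cores(mask):
    """Return the sorted core ids selected by a hex bitmask string such as '0x4'."""
    value = int(mask, 16)
    return [bit for bit in range(value.bit_length()) if value >> bit & 1]
```

Under this convention, core 2 corresponds to the mask 0x4, while a mask of 0x2 would select core 1.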

Issue type

  • Bug report
  • Feature request

Configuration:

  • Product version VMA-9.8.51
  • OS RHEL 9.4
  • OFED MLNX_OFED_LINUX-24.04-0.6.6.0
  • Hardware Mellanox MT2894 Family [ConnectX-6 Lx]

Actual behavior:

VMA_INTERNAL_THREAD_AFFINITY=2 has no effect on the core affinity of the VMA internal thread. It runs on the same core as the application thread, causing context switches and hangs that hurt latency.

Expected behavior:

The recommended configuration is to run the VMA internal thread on a different core than the application but on the same NUMA node. To achieve this, VMA_INTERNAL_THREAD_AFFINITY should pin the VMA internal thread to the requested core.

Steps to reproduce:

@igor-ivanov
Collaborator

Could you please provide the top of the output when running with VMA_TRACELEVEL=4 VMA_SPEC=latency and with VMA_TRACELEVEL=4 VMA_SPEC=latency VMA_INTERNAL_THREAD_AFFINITY=2? It should include the list of VMA parameters used during the launch. See the example at https://github.com/Mellanox/libvma/blob/master/README#L86.
The line related to VMA_INTERNAL_THREAD_AFFINITY should be enough.
In addition, please provide top or htop output for both cases.

@boranby
Author

boranby commented Aug 2, 2024

Hi Igor, thanks for your response. You can find the details below. If you need anything else, I can provide you with other information.

  • Running VMA_TRACELEVEL=4 VMA_SPEC=latency VMA_INTERNAL_THREAD_AFFINITY=2

VMA INFO: Internal Thread Affinity 2 [VMA_INTERNAL_THREAD_AFFINITY]

Running top

top - 21:42:47 up 12 min,  3 users,  load average: 2.51, 1.72, 1.03
Tasks: 497 total,   2 running, 494 sleeping,   0 stopped,   1 zombie
%Cpu0  :  0.0 us,  0.0 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.3 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  : 95.7 us,  4.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 192484.2 total,  57171.2 free, 133581.9 used,   2336.4 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  58902.2 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 22799 sonictr+  20   0 2583740  71152   6528 R  68.1   0.0   1:49.07 sonic
 22917 sonictr+  20   0  226420   4736   3456 R   0.3   0.0   0:00.10 top
      1 root      20   0  174948  18680  11040 S   0.0   0.0   0:01.20 systemd
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kthreadd
      3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp
      4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp
      5 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 slub_flushwq
      6 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 netns
      8 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0H-events_highpri
     10 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 mm_percpu_wq
  • Running VMA_TRACELEVEL=4 VMA_SPEC=latency

VMA INFO: Internal Thread Affinity 0 [VMA_INTERNAL_THREAD_AFFINITY]

Running top

top - 21:44:25 up 14 min,  3 users,  load average: 2.61, 2.02, 1.22
Tasks: 497 total,   2 running, 494 sleeping,   0 stopped,   1 zombie
%Cpu0  :  0.0 us,  0.0 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.3 hi,  0.3 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu2  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  : 95.7 us,  4.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 192484.2 total,  57174.4 free, 133577.6 used,   2337.4 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  58906.5 avail Mem

    

 PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
23011 sonictr+  20   0 2583860  71024   6400 R 102.3   0.0   0:27.08 sonic
23041 sonictr+  20   0  226320   4736   3456 R   0.3   0.0   0:00.04 top
      1 root      20   0  174948  18680  11040 S   0.0   0.0   0:01.20 systemd
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kthreadd
      3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp
      4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp
      5 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 slub_flushwq
      6 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 netns
      8 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0H-events_highpri
     10 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 mm_percpu_wq
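
The top listings above are per process, so they do not show which core each individual thread of the application occupies. As a minimal sketch, assuming a Linux /proc filesystem, the following reads the allowed CPU list and the last-used CPU for every thread of a process. The function name `thread_placement` and its `proc_root` parameter are hypothetical, introduced only for illustration.

```python
import os


def thread_placement(pid, proc_root="/proc"):
    """List (tid, name, allowed_cpus, last_cpu) for every thread of a process."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    rows = []
    for tid in sorted(os.listdir(task_dir), key=int):
        base = os.path.join(task_dir, tid)
        name, allowed = "?", "?"
        with open(os.path.join(base, "status")) as fh:
            for line in fh:
                key, _, value = line.partition(":")
                if key == "Name":
                    name = value.strip()
                elif key == "Cpus_allowed_list":
                    allowed = value.strip()
        with open(os.path.join(base, "stat")) as fh:
            stat = fh.read()
        # The command name may contain spaces or parentheses,
        # so split only the text after the last ')'.
        fields = stat[stat.rfind(")") + 2:].split()
        last_cpu = int(fields[36])  # field 39 of the stat file ("processor")
        rows.append((int(tid), name, allowed, last_cpu))
    return rows
```

For example, `thread_placement(22799)` would list each thread of the sonic process from the first listing while it is still running. Which of those threads is the VMA internal thread is not shown in this thread and would have to be identified separately. Running `top -H` gives a comparable per-thread view interactively.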

@igor-ivanov
Collaborator

@pasis do you have an explanation?

@boranby
Author

boranby commented Aug 22, 2024

Hi @igor-ivanov, @pasis, is there any update on this issue? Thanks for your help.

@boranby
Author

boranby commented Oct 11, 2024

Is there any other way to find a solution or get support from the libvma or Mellanox team?
