Metron branch is broken #243
Comments
Which binutils? Which GCC? NSLab racks? I would bet on binutils, those developers are cowboys ^^
nslrack06-07-08 (Ubuntu 18.04.4, kernel 4.15.0-91-generic). Is there a known issue with binutils?
Yes, with 2.30 the code crashes on Xeon Skylake and newer, because of an incorrect AVX512 optimization. With 2.34 we observed a problem similar to this one, where some code was optimized out but was actually still called... Reverting to 2.32 worked in that case :p But maybe it's different here. Will take a look.
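Since the diagnosis above depends on the exact binutils and GCC versions, it helps to paste them into the issue verbatim. A minimal sketch, assuming the tools are on the PATH under their usual names (`gcc`, `ld`, `as`); `report_versions` is a hypothetical helper name, not part of FastClick:

```shell
#!/bin/sh
# Print the toolchain and kernel versions discussed in this thread.
# Assumption: gcc, ld and as are the binaries actually used for the build.
report_versions() {
    for tool in gcc ld as; do
        if command -v "$tool" >/dev/null 2>&1; then
            # The first line of --version output carries the version string.
            printf '%s: %s\n' "$tool" "$("$tool" --version | head -n 1)"
        else
            printf '%s: not found\n' "$tool"
        fi
    done
    printf 'kernel: %s\n' "$(uname -r)"
}

report_versions
```

Running this on both the working and the failing rack makes it easy to see whether the binutils versions differ.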
Racks 06-08 are Haswell-based; this problem probably holds for that architecture too.
Also, I noticed something that is not related to the bug but caught my attention.
Interestingly, on rack14 (Skylake) dpdk-bounce works fine, but Metron is still problematic when spawning secondary processes (try_slave() method):
EAL: PCI device 0000:17:00.0 on NUMA socket 0
--enable-cpu-load suffered a bad merge for sure. I'm finishing something and then I'll look at it.
For me it works. Maybe you should recompile both DPDK and Click on the same machine, cleaning first?
Did you also try Metron with a Mellanox NIC? Which machine did you use?
I just tried to launch it (with a Mellanox NIC, yes) and did not get the messages you had. This was on rack 05.
Problem found: when passing the following configuration to the Metron element:
RSS and VMDq-based service chain deployments crash in the run_service_chain() method (child part, just before or during DPDK initialization). See the output below (RSS-based deployment):
Writing configuration:
elementclass MetronSlave {
slave :: MetronSlave();
slaveFD0C0 :: FromDPDKDevice(0, QUEUE 0, N_QUEUES 1, MAXTHREADS 1, BURST 32, NUMA false, VERBOSE 99, ACTIVE 1);
slaveTD0 :: ExactCPUSwitch();
Initializing flow parser...
Could you run it under gdb, compiled with "-O1 -g"? As it's the slave you can run it with
Is this fixed?
I could not get the stack trace of the slave, so I gave up.
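One possible way to get a backtrace from the slave without launching it by hand is to let gdb follow the forked child. This is only a sketch, not the procedure suggested in the comment above. It assumes the slave is created with a plain fork() from the master process. The file name `follow-child.gdb` is made up for illustration, and the command line reuses the arguments of the dpdk-bounce run in this issue with the Metron configuration path, which may need more options in practice:

```shell
#!/bin/sh
# Write a gdb command file that follows the child (the slave) after fork()
# instead of staying attached to the parent, then prints a backtrace once
# the child stops (for example on SIGABRT).
cat > follow-child.gdb <<'EOF'
set pagination off
set follow-fork-mode child
run
bt
EOF

# Print the command instead of running it, since it needs root and a bound NIC.
# Assumption: the PCI address and config path below are the ones from this issue.
echo "sudo gdb -batch -x follow-child.gdb --args bin/click --dpdk -w 0000:03:00.0 -- conf/metron/metron-dispatcher-flow.click"
```

If the master forks more than once, gdb follows the first child it sees, so the backtrace may come from a different process than the one that crashes.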
Even the simplest FastClick app is broken in the Metron branch.
Issues occur with conf/metron/metron-dispatcher-flow.click when launching secondary processes.
sudo gdb --args bin/click --dpdk -w 0000:03:00.0 -- conf/dpdk/dpdk-bounce.click
GNU gdb (Ubuntu 8.1-0ubuntu3.2) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from bin/click...done.
(gdb) r
Starting program: /home/katsikas/nfv/projects/fastclick/bin/click --dpdk -w 0000:03:00.0 -- conf/dpdk/dpdk-bounce.click
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
EAL: Detected 16 lcore(s)
EAL: Detected 2 NUMA nodes
[New Thread 0x7ffff0199700 (LWP 10006)]
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
[New Thread 0x7fffef998700 (LWP 10007)]
EAL: Selected IOVA mode 'PA'
EAL: Probing VFIO support...
EAL: VFIO support initialized
[New Thread 0x7fffef197700 (LWP 10008)]
[New Thread 0x7fffee996700 (LWP 10009)]
[New Thread 0x7fffee195700 (LWP 10010)]
[New Thread 0x7fffed994700 (LWP 10011)]
[New Thread 0x7fffed193700 (LWP 10012)]
[New Thread 0x7fffec992700 (LWP 10013)]
[New Thread 0x7fffec191700 (LWP 10014)]
[New Thread 0x7fffeb990700 (LWP 10015)]
[New Thread 0x7fffeb18f700 (LWP 10016)]
[New Thread 0x7fffea98e700 (LWP 10017)]
[New Thread 0x7fffea18d700 (LWP 10018)]
[New Thread 0x7fffe998c700 (LWP 10019)]
[New Thread 0x7fffe918b700 (LWP 10020)]
[New Thread 0x7fffe898a700 (LWP 10021)]
[New Thread 0x7fffe8189700 (LWP 10022)]
EAL: PCI device 0000:03:00.0 on NUMA socket 0
EAL: probe driver: 15b3:1017 net_mlx5
Initializing flow parser...
Initializing DPDK
Ingress traffic on port 0 is not restricted anymore to the defined flow rules
deleted virtual method called
terminate called without an active exception
Thread 1 "click" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007ffff5ced801 in __GI_abort () at abort.c:79
#2 0x00007ffff66e0957 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007ffff66e6ae6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007ffff66e6b21 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007ffff66e791f in __cxa_deleted_virtual () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00005555560aa48a in Router::initialize (this=<optimized out>, errh=0x555556de0f90) at ../lib/router.cc:1451
#7 0x00005555560444db in parse_configuration (text=..., text_is_expr=<optimized out>, hotswap=<optimized out>, errh=0x555556de0f90) at click.cc:404
#8 0x00005555556fb58e in main (argc=<optimized out>, argv=<optimized out>) at click.cc:739