-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WiP]: Initial DDlog scaffolding in controller #4
base: ddlog-controller
Are you sure you want to change the base?
Conversation
- Create dummy output relation in ovn_controller.dl - Populate this output relation programmatically inside ovn-controller main
For now, the way to test whether things are working correctly is to run |
controller/ovn-controller.c
Outdated
|
||
input_table = ddlog_get_table_id(prog, input_relation); | ||
if (input_table == -1) | ||
VLOG_INFO(" ddlog_get_table_id returned -1 for %s, %d\n", input_relation, input_table); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the main problem is that on all these errors you seemingly continue through the normal path as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is just to get stuff building. We definitely would need better error handling for production but it's a start!
controller/ovn-controller.c
Outdated
@@ -2720,6 +2734,62 @@ main(int argc, char *argv[]) | |||
OVS_NODES | |||
#undef OVS_NODE | |||
|
|||
/*** Begin DDlog test code ***/ | |||
// DDlog vars | |||
ddlog_prog prog; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what this "main" is for. I wonder how control reaches this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the main code of ovn-controller. This new code will run as part of the daemon startup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting a lot of compiler warnings and errors here. All minor but should be fixed. You can use OVS_UNUSED
to silence warnings about unused parameters.
../controller/ovn-controller.c: In function ‘print_records_callback’:
../controller/ovn-controller.c:2567:46: error: unused parameter ‘arg’ [-Werror=unused-parameter]
../controller/ovn-controller.c:2567:84: error: unused parameter ‘weight’ [-Werror=unused-parameter]
In file included from ../controller/ovn-controller.c:47:
../controller/ovn-controller.c: In function ‘main’:
../controller/ovn-controller.c:2752:19: error: format ‘%d’ expects argument of type ‘int’, but argument 5 has type ‘table_id’ {aka ‘long unsigned int’} [-Werror=format=]
/home/bpfaff/nicira/ovs/include/openvswitch/vlog.h:280:41: note: in definition of macro ‘VLOG’
../controller/ovn-controller.c:2752:9: note: in expansion of macro ‘VLOG_INFO’
../controller/ovn-controller.c:2752:61: note: format string is defined here
In file included from ../controller/ovn-controller.c:47:
../controller/ovn-controller.c:2756:19: error: format ‘%d’ expects argument of type ‘int’, but argument 5 has type ‘table_id’ {aka ‘long unsigned int’} [-Werror=format=]
/home/bpfaff/nicira/ovs/include/openvswitch/vlog.h:280:41: note: in definition of macro ‘VLOG’
../controller/ovn-controller.c:2756:9: note: in expansion of macro ‘VLOG_INFO’
../controller/ovn-controller.c:2756:60: note: format string is defined here
This is benign but AddressSanitizer was complaining about it: Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7fa93cd6d667 in __interceptor_malloc (/lib64/libasan.so.6+0xb0667) #1 0x1be8d3e in xmalloc__ lib/util.c:137 #2 0x1be8e1a in xmalloc lib/util.c:172 #3 0x1b799d3 in json_create lib/json.c:1451 #4 0x1b74314 in json_integer_create lib/json.c:263 #5 0x1b7d38a in jsonrpc_create_id lib/jsonrpc.c:563 #6 0x1b7d3a5 in jsonrpc_create_request lib/jsonrpc.c:570 #7 0x1b8d851 in ovsdb_cs_send_transaction lib/ovsdb-cs.c:1376 #8 0x40b017 in northd_send_output_only_data_request northd/ovn-northd-ddlog.c:290 #9 0x40c802 in northd_run northd/ovn-northd-ddlog.c:568 #10 0x410225 in main northd/ovn-northd-ddlog.c:1289 #11 0x7fa93c4a9081 in __libc_start_main ../csu/libc-start.c:308 Signed-off-by: Dumitru Ceara <[email protected]> Acked-by: Ben Pfaff <[email protected]>
This commit contains: - new controller/ddlog.h header file to encapsulate ddlog boilerplate - move insertion of DDlog input relation records to controller/lflow.c This might not be the permanent place: we need to identify the location to place the hook that writes OVSDB updates to the ovn-controller DDlog's Logical_Flow input relation. Still TODO: - consider_logical_flow logic should eventually be encapsulated in DDlog code - find permanent location to insert OVSDB updates to ovn-controller DDlog instance
… Logical Flows Limitations: - For now, only track address_set and port_group dependencies - Don't have the logic for creating the actual OpenFlow flows
While decoding RAW_ENCAP action, decode_ed_prop() might re-allocate ofpbuf if there is no enough space left. However, function 'decode_NXAST_RAW_ENCAP' continues to use old pointer to 'encap' structure leading to write-after-free and incorrect decoding. ==3549105==ERROR: AddressSanitizer: heap-use-after-free on address 0x60600000011a at pc 0x0000005f6cc6 bp 0x7ffc3a2d4410 sp 0x7ffc3a2d4408 WRITE of size 2 at 0x60600000011a thread T0 #0 0x5f6cc5 in decode_NXAST_RAW_ENCAP lib/ofp-actions.c:4461:20 #1 0x5f0551 in ofpact_decode ./lib/ofp-actions.inc2:4777:16 #2 0x5ed17c in ofpacts_decode lib/ofp-actions.c:7752:21 #3 0x5eba9a in ofpacts_pull_openflow_actions__ lib/ofp-actions.c:7791:13 #4 0x5eb9fc in ofpacts_pull_openflow_actions lib/ofp-actions.c:7835:12 #5 0x64bb8b in ofputil_decode_packet_out lib/ofp-packet.c:1113:17 #6 0x65b6f4 in ofp_print_packet_out lib/ofp-print.c:148:13 #7 0x659e3f in ofp_to_string__ lib/ofp-print.c:1029:16 #8 0x659b24 in ofp_to_string lib/ofp-print.c:1244:21 #9 0x65a28c in ofp_print lib/ofp-print.c:1288:28 #10 0x540d11 in ofctl_ofp_parse utilities/ovs-ofctl.c:2814:9 #11 0x564228 in ovs_cmdl_run_command__ lib/command-line.c:247:17 #12 0x56408a in ovs_cmdl_run_command lib/command-line.c:278:5 #13 0x5391ae in main utilities/ovs-ofctl.c:179:9 #14 0x7f6911ce9081 in __libc_start_main (/lib64/libc.so.6+0x27081) #15 0x461fed in _start (utilities/ovs-ofctl+0x461fed) Fix that by getting a new pointer before using. Credit to OSS-Fuzz. Fuzzer regression test will fail only with AddressSanitizer enabled. Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=27851 Fixes: f839892 ("OF support and translation of generic encap and decap") Acked-by: William Tu <[email protected]> Signed-off-by: Ilya Maximets <[email protected]>
ovsdb_cs_send_transaction() returns the pointer to the same 'request_id' object that is used internally. This leads to situation where transaction in idl and CS module has the same 'request_id' object. However, CS module is able to destroy this transaction id at any time, e.g. if connection state chnaged, but idl transaction might be still around at this moment and application might still use it. Found by running 'make check-ovsdb-cluster' with AddressSanitizer: ==79922==ERROR: AddressSanitizer: heap-use-after-free on address 0x604000167a98 at pc 0x000000626acf bp 0x7ffcdb38a4c0 sp 0x7ffcdb38a4b8 READ of size 8 at 0x604000167a98 thread T0 #0 0x626ace in json_destroy lib/json.c:354:18 #1 0x56d1ab in ovsdb_idl_txn_destroy lib/ovsdb-idl.c:2528:5 #2 0x53a908 in do_vsctl utilities/ovs-vsctl.c:3008:5 #3 0x539251 in main utilities/ovs-vsctl.c:203:17 #4 0x7f7f7e376081 in __libc_start_main (/lib64/libc.so.6+0x27081) #5 0x461fed in _start (utilities/ovs-vsctl+0x461fed) 0x604000167a98 is located 8 bytes inside of 40-byte region [0x604000167a90,0x604000167ab8) freed by thread T0 here: #0 0x503ac7 in free (utilities/ovs-vsctl+0x503ac7) #1 0x626aae in json_destroy lib/json.c:378:9 #2 0x6adfa2 in ovsdb_cs_run lib/ovsdb-cs.c:625:13 #3 0x567731 in ovsdb_idl_run lib/ovsdb-idl.c:394:5 #4 0x56fed1 in ovsdb_idl_txn_commit_block lib/ovsdb-idl.c:3187:9 #5 0x53a4df in do_vsctl utilities/ovs-vsctl.c:2898:14 #6 0x539251 in main utilities/ovs-vsctl.c:203:17 #7 0x7f7f7e376081 in __libc_start_main previously allocated by thread T0 here: #0 0x503dcf in malloc (utilities/ovs-vsctl+0x503dcf) #1 0x594656 in xmalloc lib/util.c:138:15 #2 0x626431 in json_create lib/json.c:1451:25 #3 0x626972 in json_integer_create lib/json.c:263:25 #4 0x62da0f in jsonrpc_create_id lib/jsonrpc.c:563:12 #5 0x62d9a8 in jsonrpc_create_request lib/jsonrpc.c:570:23 #6 0x6af3a6 in ovsdb_cs_send_transaction lib/ovsdb-cs.c:1357:35 #7 0x56e3d5 in ovsdb_idl_txn_commit lib/ovsdb-idl.c:3147:27 #8 0x56fea9 in ovsdb_idl_txn_commit_block lib/ovsdb-idl.c:3186:22 #9 0x53a4df in do_vsctl utilities/ovs-vsctl.c:2898:14 #10 0x539251 in main utilities/ovs-vsctl.c:203:17 #11 0x7f7f7e376081 in __libc_start_main Fixes: 1c337c4 ("ovsdb-idl: Break into two layers.") Acked-by: Dumitru Ceara <[email protected]> Signed-off-by: Ilya Maximets <[email protected]>
…nected. The symptom of this issue is that OVS bridge looses its IP address on restart. Simple reproducer: 0. start ovsdb-server and ovs-vswitchd 1. ovs-vsctl add-br br0 2. ifconfig br0 10.0.0.1 up 3. ovs-appctl -t ovs-vswitchd exit 4. start ovs-vswitchd back. After step #3 ovs-vswitchd is down, but br0 interface exists and has configured IP address. After step #4 there is no IP address on the port br0. What happened: 1. ovsdb-cs connects to the database via ovsdb-idl and requests database lock. --> get_schema for _Server database --> lock request 2. ovsdb-cs receives schema for the _Server database. And sends monitor request. <-- schema for _Server --> monitor_cond for _Server 3. ovsdb-cs receives lock reply. <-- locked At this point ovsdb-cs generates OVSDB_CS_EVENT_TYPE_LOCKED event and passes it to ovsdb-idl. ovsdb-idl increases change_seqno. 4. ovsdb_idl_has_ever_connected() is 'true' now, because change_seqno is not zero. 5. ovs-vswitchd decides that it has connection with database and all the initial data, therefore initiates configuration of bridges. bridge_run():ovsdb_idl_has_ever_connected() --> true 6. Since monitor request for the Open_vSwitch database is not even sent yet, the database is empty. This leads to removal of all the ports and all other resources. 7. When data finally received, ovs-vswitchd re-creates bridges and ports, but IP addresses can not be restored. While splitting out ovsdb-cs from ovsdb-idl one part of the logic was lost. Particularly, before the split, ovsdb-idl updated change_seqno only in MONITORING state. Restoring the logic by updating the change_seqno only if may send transaction, i.e. lock is ours and ovsdb-cs is in the MONITORING state. This matches with the main purpose of increasing change_seqno at this point, i.e. to force the client to re-try the transaction. With this change ovsdb_idl_has_ever_connected() remains 'false' until the first monitor reply with the actual data received. This issue was reported several times during the last couple of weeks. Reported-at: https://bugzilla.redhat.com/1968445 Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-June/383512.html Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2021-June/051222.html Fixes: 1c337c4 ("ovsdb-idl: Break into two layers.") Signed-off-by: Ilya Maximets <[email protected]> Acked-by: Dumitru Ceara <[email protected]>
Opening a permanent PR to keep track of our ovn-controller changes.
This initial commit creates a dummy output relation and populates it programmatically.
Still todo: