From 04f8881f5795483968987d6d7e8b522c11f2d06e Mon Sep 17 00:00:00 2001 From: Ilya Maximets Date: Tue, 8 Jun 2021 15:17:23 +0200 Subject: [PATCH] ovsdb-idl: Fix the database update signaling if it has never been connected. The symptom of this issue is that OVS bridge looses its IP address on restart. Simple reproducer: 0. start ovsdb-server and ovs-vswitchd 1. ovs-vsctl add-br br0 2. ifconfig br0 10.0.0.1 up 3. ovs-appctl -t ovs-vswitchd exit 4. start ovs-vswitchd back. After step #3 ovs-vswitchd is down, but br0 interface exists and has configured IP address. After step #4 there is no IP address on the port br0. What happened: 1. ovsdb-cs connects to the database via ovsdb-idl and requests database lock. --> get_schema for _Server database --> lock request 2. ovsdb-cs receives schema for the _Server database. And sends monitor request. <-- schema for _Server --> monitor_cond for _Server 3. ovsdb-cs receives lock reply. <-- locked At this point ovsdb-cs generates OVSDB_CS_EVENT_TYPE_LOCKED event and passes it to ovsdb-idl. ovsdb-idl increases change_seqno. 4. ovsdb_idl_has_ever_connected() is 'true' now, because change_seqno is not zero. 5. ovs-vswitchd decides that it has connection with database and all the initial data, therefore initiates configuration of bridges. bridge_run():ovsdb_idl_has_ever_connected() --> true 6. Since monitor request for the Open_vSwitch database is not even sent yet, the database is empty. This leads to removal of all the ports and all other resources. 7. When data finally received, ovs-vswitchd re-creates bridges and ports, but IP addresses can not be restored. While splitting out ovsdb-cs from ovsdb-idl one part of the logic was lost. Particularly, before the split, ovsdb-idl updated change_seqno only in MONITORING state. Restoring the logic by updating the change_seqno only if may send transaction, i.e. lock is ours and ovsdb-cs is in the MONITORING state. This matches with the main purpose of increasing change_seqno at this point, i.e. to force the client to re-try the transaction. With this change ovsdb_idl_has_ever_connected() remains 'false' until the first monitor reply with the actual data received. This issue was reported several times during the last couple of weeks. Reported-at: https://bugzilla.redhat.com/1968445 Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-June/383512.html Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2021-June/051222.html Fixes: 1c337c43ac1c ("ovsdb-idl: Break into two layers.") Signed-off-by: Ilya Maximets Acked-by: Dumitru Ceara --- lib/ovsdb-idl.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c index 1d385ca2f..2198c69c6 100644 --- a/lib/ovsdb-idl.c +++ b/lib/ovsdb-idl.c @@ -429,9 +429,15 @@ ovsdb_idl_run(struct ovsdb_idl *idl) break; case OVSDB_CS_EVENT_TYPE_LOCKED: - /* If the client couldn't run a transaction because it didn't have - * the lock, this will encourage it to try again. */ - idl->change_seqno++; + if (ovsdb_cs_may_send_transaction(idl->cs)) { + /* If the client couldn't run a transaction because it didn't + * have the lock, this will encourage it to try again. */ + idl->change_seqno++; + } else { + /* We're setting up a session, so don't signal that the + * database changed. Finalizing the session will increment + * change_seqno anyhow. */ + } break; case OVSDB_CS_EVENT_TYPE_UPDATE: