Replacing a master node
oilbeater edited this page Jun 27, 2022 · 4 revisions
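The Chinese documentation under the Wiki is no longer maintained. Please visit our latest Chinese documentation site for up-to-date documentation.
The following uses the removal of the node gateway as an example: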
[root@master ~]# kubectl -n kube-system get pod -o wide | grep central
ovn-central-74b5f7b9c5-4wzkj 1/1 Running 0 25m 192.168.50.128 gateway <none> <none>
ovn-central-74b5f7b9c5-76xwg 1/1 Running 0 24m 192.168.50.112 slave <none> <none>
ovn-central-74b5f7b9c5-9knkl 1/1 Running 0 23m 192.168.50.134 master <none> <none>
[root@master ~]# kubectl -n kube-system get pods -o wide | grep central
ovn-central-74b5f7b9c5-bztcl 1/1 Running 0 15m 192.168.50.134 master <none> <none>
ovn-central-74b5f7b9c5-gwml6 1/1 Running 0 3m57s 192.168.50.128 gateway <none> <none>
ovn-central-74b5f7b9c5-p9zfn 1/1 Running 0 15m 192.168.50.112 slave <none> <none>
[root@master ~]# kubectl -n kube-system exec -ti ovn-central-74b5f7b9c5-gwml6 bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
root@gateway:/kube-ovn# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
8332
Name: OVN_Northbound
Cluster ID: 276a (276a7bb3-51a9-473e-b2af-78bfe813489d)
Server ID: 8332 (8332545e-8d9e-484a-917b-5c46b866daa3)
Address: tcp:[192.168.50.128]:6643
Status: cluster member
Role: follower
Term: 41
Leader: 5eef
Vote: unknown
Election timer: 5000
Log: [46367, 65727]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->0000 ->68d3 <-5eef <-68d3
Disconnections: 0
Servers:
68d3 (68d3 at tcp:[192.168.50.134]:6643) last msg 243572 ms ago
8332 (8332 at tcp:[192.168.50.128]:6643) (self)
5eef (5eef at tcp:[192.168.50.112]:6643) last msg 179 ms ago
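Note the server ID of this node: 8332.
The ID can be read from the node's entry (the one marked self) under Servers, and also from the first line of the output. The two should agree, so each can be used to verify the other.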
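Reading the ID by eye is error-prone when the same step has to be repeated for both databases. A minimal sketch of extracting it from the status output, assuming the output keeps the `Server ID: <short id> (<uuid>)` layout shown above; the helper name `server_id` is made up for illustration:

```shell
# Print the short server ID from `cluster/status` output read on stdin.
# Assumes a line of the form "Server ID: 8332 (8332545e-...)".
server_id() {
  awk '$1 == "Server" && $2 == "ID:" { print $3; exit }'
}

# Possible use inside an ovn-central pod (same command as in the transcript):
#   ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound | server_id
```

The value printed this way should match the first line of the status output.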
The node can be kicked out of the cluster by running the following command directly on that node:
root@gateway:/kube-ovn# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/kick OVN_Northbound 8332
root@gateway:/kube-ovn#
root@gateway:/kube-ovn# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
unknown cluster
ovs-appctl: /var/run/ovn/ovnnb_db.ctl: server returned an error
root@gateway:/kube-ovn# exit
As shown, the status command now fails on this node. Check the cluster status from another node instead.
[root@master ~]# kubectl -n kube-system get pods -o wide | grep central
ovn-central-74b5f7b9c5-bztcl 1/1 Running 0 29m 192.168.50.134 master <none> <none>
ovn-central-74b5f7b9c5-gwml6 1/1 Running 0 17m 192.168.50.128 gateway <none> <none>
ovn-central-74b5f7b9c5-p9zfn 1/1 Running 0 29m 192.168.50.112 slave <none> <none>
[root@master ~]# kubectl -n kube-system exec -ti ovn-central-74b5f7b9c5-bztcl bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
root@master:/kube-ovn# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
68d3
Name: OVN_Northbound
Cluster ID: 276a (276a7bb3-51a9-473e-b2af-78bfe813489d)
Server ID: 68d3 (68d3bfbd-65b1-47fa-a35e-083f394a32e3)
Address: tcp:[192.168.50.134]:6643
Status: cluster member
Role: follower
Term: 41
Leader: 5eef
Vote: 5eef
Last Election started 1780560 ms ago, reason: timeout
Election timer: 5000
Log: [46367, 65728]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->5eef <-5eef
Disconnections: 2
Servers:
68d3 (68d3 at tcp:[192.168.50.134]:6643) (self)
5eef (5eef at tcp:[192.168.50.112]:6643) last msg 1404 ms ago
root@master:/kube-ovn#
The Servers list no longer contains 8332, so the kick has succeeded.
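The same check can be scripted. The sketch below assumes the status output lists members one per line after a `Servers:` line, with the short server ID as the first field, as in the transcript above; `server_removed` is a hypothetical helper name:

```shell
# Succeed (exit 0) if server ID $1 no longer appears in the "Servers:" list
# of the `cluster/status` output read on stdin; fail (exit 1) if it is still listed.
server_removed() {
  ! awk -v id="$1" '
    $1 == "Servers:"        { in_servers = 1; next }
    in_servers && $1 == id  { found = 1 }
    END                     { exit (found ? 0 : 1) }
  '
}

# Possible use from a surviving ovn-central pod:
#   ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound \
#     | server_removed 8332 && echo "8332 is gone"
```

The helper only looks at lines after `Servers:`, so an ID that appears elsewhere (for example in `Leader:`) is not counted as a member.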
[root@master ~]# kubectl -n kube-system exec -ti ovn-central-74b5f7b9c5-gwml6 bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
root@gateway:/kube-ovn# ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound
4049
Name: OVN_Southbound
Cluster ID: aa98 (aa985c1d-be8a-46e2-9a22-d3164c0042a7)
Server ID: 4049 (4049c11e-a07e-44ab-8c6b-549bc6172590)
Address: tcp:[192.168.50.128]:6644
Status: cluster member
Role: follower
Term: 44
Leader: d52b
Vote: unknown
Election timer: 5000
Log: [47459, 65157]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->0000 ->7e31 <-d52b <-7e31
Disconnections: 0
Servers:
d52b (d52b at tcp:[192.168.50.112]:6644) last msg 1406 ms ago
4049 (4049 at tcp:[192.168.50.128]:6644) (self)
7e31 (7e31 at tcp:[192.168.50.134]:6644) last msg 1314818 ms ago
root@gateway:/kube-ovn# ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/kick OVN_Southbound 4049
root@gateway:/kube-ovn# exit
exit
[root@master ~]# kubectl -n kube-system exec -ti ovn-central-74b5f7b9c5-bztcl bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
root@master:/kube-ovn# ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound
7e31
Name: OVN_Southbound
Cluster ID: aa98 (aa985c1d-be8a-46e2-9a22-d3164c0042a7)
Server ID: 7e31 (7e31e71e-d6f3-4fa0-8a87-7aaa9d9d6b00)
Address: tcp:[192.168.50.134]:6644
Status: cluster member
Role: follower
Term: 44
Leader: d52b
Vote: d52b
Election timer: 5000
Log: [47459, 65158]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->d52b <-d52b
Disconnections: 2
Servers:
d52b (d52b at tcp:[192.168.50.112]:6644) last msg 92 ms ago
7e31 (7e31 at tcp:[192.168.50.134]:6644) (self)
rm -rf /etc/origin/ovn/ /var/run/ovn/ /etc/ovn
Monitor whether the OVN components are affected. If they are, delete the affected pods so that they restart.
kubectl label node gateway kube-ovn/role-
kubectl scale deployment -n kube-system ovn-central --replicas=2
kubectl set env deployment/ovn-central -n kube-system NODE_IPS="192.168.50.134,192.168.50.112"
kubectl rollout status deployment/ovn-central -n kube-system
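The `NODE_IPS` value above is the previous list with the removed node's address taken out. A small sketch of deriving it rather than typing it by hand; `remove_ip` and the `OLD_IPS` variable are hypothetical names used only for illustration:

```shell
# Print the comma-separated list $1 with every entry equal to $2 removed.
# Matching is on whole entries, so 192.168.50.12 does not match 192.168.50.128.
remove_ip() {
  printf '%s\n' "$1" | tr ',' '\n' | grep -vxF -- "$2" | paste -sd, -
}

# Possible use, assuming OLD_IPS holds the current NODE_IPS value:
#   kubectl set env deployment/ovn-central -n kube-system \
#     NODE_IPS="$(remove_ip "$OLD_IPS" 192.168.50.128)"
```

The remaining addresses keep their original order.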
/etc/origin/ovn/
/var/run/ovn/
/etc/ovn
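A server entry that points to a node which no longer exists can leave the cluster short of votes during leader election, so that no leader can be elected. Make sure every server in the cluster corresponds to an existing node.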
# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
# ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound
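To compare the output of these commands against the nodes that are supposed to be members, something like the following could be used. It is a sketch that assumes each line after `Servers:` has the form `<id> (<id> at tcp:[<ip>]:<port>) ...` as in the transcripts above; `stale_servers` is a hypothetical helper name:

```shell
# Print "<server id> <ip>" for every entry in the "Servers:" list whose IP
# is not in the comma-separated list of expected IPs given as $1.
stale_servers() {
  awk -v ips="$1" '
    BEGIN { n = split(ips, a, ","); for (i = 1; i <= n; i++) want[a[i]] = 1 }
    $1 == "Servers:" { in_servers = 1; next }
    in_servers && (s = index($0, "tcp:[")) {
      rest = substr($0, s + 5)                     # text after "tcp:["
      ip = substr(rest, 1, index(rest, "]") - 1)   # up to the closing bracket
      if (!(ip in want)) print $1, ip
    }
  '
}

# Possible use inside an ovn-central pod:
#   ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound \
#     | stale_servers "192.168.50.134,192.168.50.112"
```

Empty output means every listed server has an expected address. Any line printed names a server ID that may still need to be kicked with `cluster/kick`.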
kubectl label node `NODE` kube-ovn/role=master
kubectl scale deployment -n kube-system ovn-central --replicas=3
kubectl set env deployment/ovn-central -n kube-system NODE_IPS="192.168.50.134,192.168.50.112,192.168.50.85"
kubectl rollout status deployment/ovn-central -n kube-system