
IPPool issue after upgrading from 3.27 to 3.28 #9100

Open · kkbruce opened this issue Aug 6, 2024 · 5 comments

kkbruce commented Aug 6, 2024

We followed the upgrade docs (operator-based install) to upgrade Calico from 3.27 to 3.28.

$ calicoctl version
Client Version:    v3.28.1
Git commit:        601856343
Cluster Version:   v3.28.1
Cluster Type:      typha,kdd,k8s,operator,bgp,kubeadm,win

Expected Behavior

We expect to be able to set the pool back to disabled: true, or to delete the old default-ipv4-ippool configuration.

Current Behavior

On 3.27 we set up a new IPPool according to the documentation, set disabled: true on the default pool, and everything worked fine. After upgrading to 3.28, we found that the original disabled: true had been reset to false, and we can neither set it back to true nor delete the old default-ipv4-ippool configuration, as described under "Steps to Reproduce" below.

Possible Solution

Could there be a restore file or documented downgrade steps, so that when an upgrade goes wrong the cluster can be quickly returned to a normal working state?
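
For what it's worth, a pre-upgrade snapshot would make rollback easier. A minimal sketch using standard calicoctl/kubectl commands (not an official downgrade procedure):

### before upgrading: snapshot the pools and the operator config
$ calicoctl get ippool -o yaml > ippool-backup.yaml
$ kubectl get installation default -o yaml > installation-backup.yaml
### after rolling Calico back, re-apply the saved pools
$ calicoctl apply -f ippool-backup.yaml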

Steps to Reproduce (for bugs)

$ calicoctl get ippool -o wide
NAME                  CIDR             NAT    IPIPMODE   VXLANMODE     DISABLED   DISABLEBGPEXPORT   SELECTOR
default-ipv4-ippool   192.168.0.0/16   true   Never      CrossSubnet   false      false              all()
new-pool              10.244.0.0/16    true   Never      CrossSubnet   false      false              all()

### add disabled: true
$ calicoctl get ippool -o yaml > pools.yaml
$ vim pools.yaml
$ calicoctl apply -f pools.yaml
Successfully applied 2 'IPPool' resource(s)
$ calicoctl get ippool -o wide
NAME                  CIDR             NAT    IPIPMODE   VXLANMODE     DISABLED   DISABLEBGPEXPORT   SELECTOR
default-ipv4-ippool   192.168.0.0/16   true   Never      CrossSubnet   false      false              all()
new-pool              10.244.0.0/16    true   Never      CrossSubnet   false      false              all()
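
For reference, the edit in pools.yaml only flips disabled on the default pool. Abridged from the output of calicoctl get ippool -o yaml (field values match the wide listing above):

apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 192.168.0.0/16
  ipipMode: Never
  vxlanMode: CrossSubnet
  natOutgoing: true
  nodeSelector: all()
  disabled: true   # the line added in vim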

### delete default-ipv4-ippool
$ calicoctl delete pool default-ipv4-ippool
Successfully deleted 1 'IPPool' resource(s)
$ calicoctl get ippool -o wide
NAME                  CIDR             NAT    IPIPMODE   VXLANMODE     DISABLED   DISABLEBGPEXPORT   SELECTOR
default-ipv4-ippool   192.168.0.0/16   true   Never      CrossSubnet   false      false              all()
new-pool              10.244.0.0/16    true   Never      CrossSubnet   false      false              all()

Context

The original default 192.168.x.x segment conflicted with other internal network segments, so Pods could not reliably reach internal 192.168.x.x services. We therefore changed the pool to 10.244.x.x, and once default-ipv4-ippool was set to disabled: true, network access returned to normal.

Your Environment

  • Calico version : 3.28.1
  • Orchestrator version (e.g. kubernetes, mesos, rkt): kubernetes
  • Operating System and version: Ubuntu 20.04
  • Link to your project (optional): None
kkbruce commented Aug 6, 2024

Patching from the command line gives the same result:

$ calicoctl patch ippool default-ipv4-ippool -p '{"spec": {"disabled": true}}'

$ calicoctl get ippool -o wide
NAME                  CIDR             NAT    IPIPMODE   VXLANMODE     DISABLED   DISABLEBGPEXPORT   SELECTOR
default-ipv4-ippool   192.168.0.0/16   true   Never      CrossSubnet   false      false              all()
new-pool              10.244.0.0/16    true   Never      CrossSubnet   false      false              all()

kkbruce commented Aug 6, 2024

Do I need to delete all Pods?

I can confirm that no Pods are currently using a 192.168.x.x address:

$ calicoctl ipam show --show-blocks
+----------+------------------+-----------+------------+--------------+
| GROUPING |       CIDR       | IPS TOTAL | IPS IN USE |   IPS FREE   |
+----------+------------------+-----------+------------+--------------+
| IP Pool  | 192.168.0.0/16   |     65536 | 0 (0%)     | 65536 (100%) |
| IP Pool  | 10.244.0.0/16    |     65536 | 48 (0%)    | 65488 (100%) |
| Block    | 10.244.167.0/26  |        64 | 4 (6%)     | 60 (94%)     |
| Block    | 10.244.28.192/26 |        64 | 42 (66%)   | 22 (34%)     |
| Block    | 10.244.58.192/26 |        64 | 2 (3%)     | 62 (97%)     |
+----------+------------------+-----------+------------+--------------+
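
As an additional check (assuming plain kubectl access), pod IPs can be grepped for the old range:

$ kubectl get pods -A -o wide | grep '192\.168\.'
### no output, so no Pod holds an address from the old pool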

kkbruce commented Aug 6, 2024

Diagnostics file from sudo calicoctl node diags:

diags-20240806_213443.tar_20240806214137.gz

caseydavenport (Member) commented

@kkbruce in Calico v3.28, the operator has been updated to reconcile changes to IP pools.

If your IP pool is defined within your Installation, the operator will attempt to make sure that the actual IP pool in the cluster matches the one in your Installation. I suspect that is what is happening here.

If you don't want to use the 192.168.0.0 IP pool, you should just be able to delete it (from the Installation) - unless you want it for other reasons like NAT?
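
A sketch of what that edit could look like, assuming a default operator install (field names from the operator.tigera.io/v1 API; your actual Installation may differ):

$ kubectl edit installation default

apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    ipPools:
    # keep only the pool you want; removing the 192.168.0.0/16
    # entry stops the operator from re-creating/re-enabling it
    - cidr: 10.244.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()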

kkbruce commented Aug 12, 2024

Because we needed to quickly restore the Calico CNI network to a functional state, we downgraded to version 3.27. Currently there is no spare environment available for gathering more information on version 3.28.

From another perspective, we referred to the migrate-pools document. Up to version 3.27, that document made no mention of operator-level steps such as kubectl edit installation default. So on 3.28 we need to become more familiar with the YAML configuration of the Installation itself and keep its settings consistent with the IP pools. However, as the information above shows, the manifest-based migrate-pools procedure on 3.28 could work better than it currently does.
