Cross-Cluster Service Connectivity Fails with "Host is Unreachable" Despite Successful DNS Resolution in Submariner GlobalNet Setup #3204
Comments
Thanks for reaching out @aswinayyolath.
A. As mentioned in the Slack discussion, the inter-cluster Libreswan tunnel is up and communication between the GW nodes is fine, while communication from a non-GW node to a GW node is failing. Further datapath investigation is needed here. I assume that for some reason (maybe an infra firewall or connection tracking) ingress packets are being dropped on the gwnode@clusterX to nongwnode@clusterX segment. Can you please run ping from the non-GW node@sub1 to the GW node@sub2 (for the GW node@sub2 IP address, use the endpoint healthcheck IP, 242.0.255.254) and run tcpdump on both the GW node and the non-GW node on cluster sub1?
B. Also, this is not relevant to the datapath issue, but I noticed that Submariner detected the CNI as generic instead of flannel. Submariner uses this code to discover network details for the flannel CNI. |
DaemonSet List:
Checked Pods
CNI Configuration
|
Is there a flannel DaemonSet in another namespace? |
Yes
|
The kube-flannel-ds DaemonSet has the following volumes:
volumes:
- name: run
  hostPath:
    path: /run/flannel
- name: cni-plugin
  hostPath:
    path: /opt/cni/bin
- name: cni
  hostPath:
    path: /etc/cni/net.d
- name: flannel-cfg
  configMap:
    name: kube-flannel-cfg
- name: xtables-lock
  hostPath:
    path: /run/xtables.lock
    type: FileOrCreate
CM details
apiVersion: v1
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "EnableNFTables": false,
      "Backend": {
        "Type": "vxlan"
      }
    }
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"cni-conf.json":"{\n \"name\": \"cbr0\",\n \"cniVersion\": \"0.3.1\",\n \"plugins\": [\n {\n \"type\": \"flannel\",\n \"delegate\": {\n \"hairpinMode\": true,\n \"isDefaultGateway\": true\n }\n },\n {\n \"type\": \"portmap\",\n \"capabilities\": {\n \"portMappings\": true\n }\n }\n ]\n}\n","net-conf.json":"{\n \"Network\": \"10.244.0.0/16\",\n \"EnableNFTables\": false,\n \"Backend\": {\n \"Type\": \"vxlan\"\n }\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{},"labels":{"app":"flannel","k8s-app":"flannel","tier":"node"},"name":"kube-flannel-cfg","namespace":"kube-flannel"}}
  creationTimestamp: "2024-11-03T16:24:55Z"
  labels:
    app: flannel
    k8s-app: flannel
    tier: node
  name: kube-flannel-cfg
  namespace: kube-flannel
  resourceVersion: "282"
  uid: f0058e4b-4ba9-49be-a759-fd0c9843a88d
|
Thanks for the information. Regarding flannel discovery, it looks like we need to update the flannel discovery code. QQ: does kubectl get ds -A -l k8s-app=flannel return the flannel DaemonSet? Could you please report a new issue for flannel CNI discovery and attach the relevant information? We welcome any code contribution here :-).
As for the datapath issue, traffic initiated at the non-GW node@clusterA towards the remote cluster is encapsulated in VxLAN (port 4800, interface vx-submariner) and sent to the GW node@clusterA, which should forward it to the remote cluster's GW. Can you double check (maybe use tcpdump -pi ) that no packet is sent from the non-GW node? I can see that on the GW node the iptables (filter table) packet counter for input traffic on the vx-submariner interface is > 0; check [1]. [1] |
QQ: does kubectl get ds -A -l k8s-app=flannel return flannel ds ?
I will report a new issue and see if I can contribute (I guess the changes should be relatively small) ...
Packet Transmission on the Non-GW Node in sub1 (ClusterA)
Run Packet Capture on the Non-Gateway Node in ClusterA
Verified Reception on the Gateway Node in ClusterA
Is this what you want me to do? I am not 100% sure. |
I have created a new issue: #3210. A draft change has been pushed here: #3268. @yboaron, I haven't yet looked into linting, unit tests, or e2e testing; I'm just checking whether the changes look roughly right (draft linked above). I also modified the loop structure from
    for k := range daemonsets.Items {
        if strings.Contains(daemonsets.Items[k].Name, "flannel") {
            volumes = daemonsets.Items[k].Spec.Template.Spec.Volumes
to
    for _, ds := range daemonsets.Items {
        if strings.Contains(ds.Name, "flannel") {
            flannelDaemonSet = &ds
            volumes = ds.Spec.Template.Spec.Volumes
            break
        }
    }
to enhance code readability and clarity. I think this approach makes it clear that ds represents a DaemonSet object, eliminating the need for indexing. Additionally, by storing a pointer to the found DaemonSet and breaking out of the loop upon finding it, I believe the code becomes more efficient and reduces the risk of errors associated with accessing elements via an index. |
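For anyone who wants to try the lookup in isolation, here is a minimal sketch. It assumes simplified stand-in types (`daemonSet` and `volume` are hypothetical, not the real Kubernetes API types) and a hypothetical helper `findFlannel`; it is not the code from the draft PR. One variation worth noting: in `for _, ds := range items`, `ds` is a copy of each element, so `&ds` points at that copy. That is fine for read-only use, but indexing keeps the pointer aimed at the slice element itself.

```go
package main

import (
	"fmt"
	"strings"
)

// volume and daemonSet are simplified, hypothetical stand-ins for the
// Kubernetes API types; only the fields used here are modelled.
type volume struct {
	Name string
}

type daemonSet struct {
	Name    string
	Volumes []volume
}

// findFlannel returns a pointer to the first DaemonSet whose name contains
// "flannel", or nil if there is none. Indexing keeps the pointer aimed at the
// slice element itself instead of at a per-iteration copy.
func findFlannel(items []daemonSet) *daemonSet {
	for i := range items {
		if strings.Contains(items[i].Name, "flannel") {
			return &items[i]
		}
	}
	return nil
}

func main() {
	items := []daemonSet{
		{Name: "kube-proxy"},
		{Name: "kube-flannel-ds", Volumes: []volume{{Name: "run"}, {Name: "cni-plugin"}}},
	}
	if ds := findFlannel(items); ds != nil {
		fmt.Printf("found %s with %d volumes\n", ds.Name, len(ds.Volumes))
	}
}
```

Returning early from a helper plays the same role as the `break` in the loop above: the search stops at the first match.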
Submariner only handles egress routing, and only for packets destined to remote clusters (the destination IP belongs to the remote Pod or Service CIDRs; in your case, the GlobalNet CIDR of the remote cluster). Please run tcpdump while pinging the remote endpoint healthcheck IP address. |
Can you try running |
I am seeing a lot of output from
|
Hmmmm, it's ICMP/IPv6 traffic. Don't you get any ICMP/IPv4 ( |
|
Hmm, strange, I can't see the IPv4 ICMP traffic sent to the remote cluster. |
I don't have the cluster with me 😔, but I will create one (in fact, two). @yboaron, I would like to check with you whether the steps I am following are correct. Could you please review the steps here (https://kubernetes.slack.com/archives/C010RJV694M/p1730390398271589?thread_ts=1730390376.380879&cid=C010RJV694M) and let me know if I am missing anything? |
I would also like to test the same setup in AWS across two regions. I just want to know whether the steps I followed are correct. I will then retry on the VMs I used before, and also create two EKS clusters in two different AWS regions to see if that works. |
Yep, looks fine. Can you try reinstalling without adding the --globalnet-cidr 242.0.0.0/16 flag to the subctl join command for both clusters? |
Hello @yboaron. Since @aswinayyolath is busy with other tasks, I'm looking at this issue; we're on the same team, working on the same project. Since our K8s clusters have the same CIDRs, we cannot run Submariner without GlobalNet. To work around this, we created an AWS account and tried to run Submariner on EKS, but this does not work and gives these outputs while running diagnostics
I suspect that there is some issue with setting up the subnets. What should I try next to get Submariner up and running on AWS? |
Maybe you can follow this link? In case the deployment fails, please attach debug details from the clusters (subctl gather, subctl diagnose all). |
I have tried using
|
As per @aswinsuryan's suggestion, we switched to Calico instead of Flannel.
Steps Performed
@rohan-anilkumar could you please upload the output of |
@aswinayyolath, did you follow the instructions for Submariner with Calico? |
I think yes, but I am not 100% sure. I will ask @rohan-anilkumar to confirm. We saw this mentioned here |
@aswinayyolath @yboaron we haven't installed the Calico API server. From the link, it seems the Calico API server needs to be installed for it to run. |
Cluster 1
Cluster 2
Since both clusters have the same Service CIDR (10.96.0.0/12) and Pod CIDR (192.168.0.0/16), the CIDRs overlap. @yboaron, can I use GlobalNet in this case and proceed with the next set of steps? |
I have tried, and I'm getting the same issue even with Kind.
OS details and versions of binaries used
|
IIRC, other users have also reported problems deploying Kind on Ubuntu in the past, probably due to environment configuration issues. A. I usually apply the script in [1] on my host before deploying Submariner on Kind; could you check that? [1] sudo systemctl stop firewalld |
@yboaron I found a nice blog post that you wrote recently, thanks for that. Instead of using Cilium, I used Calico for both clusters and I was able to test connectivity
I used the steps below
# Download the latest Kind binary
curl -Lo ./kind https://kind.sigs.k8s.io/dl/latest/kind-linux-amd64
# Make the Kind binary executable
chmod +x ./kind
# Move the binary to PATH
sudo mv ./kind /usr/local/bin/kind
# Verify the installation
kind --version
# Download the latest stable release of kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
# Make the kubectl binary executable
chmod +x kubectl
# Move the binary to PATH
sudo mv kubectl /usr/local/bin/kubectl
# Verify the installation
kubectl version --client
# Download the latest Subctl release
curl -Ls https://get.submariner.io | bash
export PATH=$PATH:~/.local/bin
echo export PATH=\$PATH:~/.local/bin >> ~/.profile
# Verify the installation
subctl version
# Install Docker
snap install docker
# Install Make
apt install make
# Install Kind with Calico CNI
git clone https://github.com/submariner-io/shipyard.git
cd shipyard/
cat > deploy.two.clusters.nocni.yaml << EOF
nodes: control-plane worker
clusters:
  cluster1:
    cni: none
  cluster2:
    cni: none
EOF
make SETTINGS=deploy.two.clusters.nocni.yaml clusters
# increase inotify resource limits.
sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512
## List clusters
kind get clusters
# Check the current Context
export KUBECONFIG=$(find $(git rev-parse --show-toplevel)/output/kubeconfigs/ -type f -printf %p:)
kubectl config get-contexts
# Confirm that we have two nodes in each cluster
kubectl --context cluster1 get nodes
kubectl --context cluster2 get nodes
# Deploy Calico on cluster1
kubectl --context cluster1 create -f https://raw.githubusercontent.com/projectcalico/calico/v3.29.0/manifests/tigera-operator.yaml
mkdir calico_manifests
wget -O calico_manifests/custom-resources.yaml https://raw.githubusercontent.com/projectcalico/calico/v3.29.0/manifests/custom-resources.yaml
sed -i 's,cidr: 192.168.0.0/16,cidr: 10.130.0.0/16,g' calico_manifests/custom-resources.yaml
sed -i 's,VXLANCrossSubnet,VXLAN,g' calico_manifests/custom-resources.yaml
kubectl --context cluster1 apply -f calico_manifests/custom-resources.yaml
# Install Calico on Cluster 2
kubectl --context cluster2 create -f https://raw.githubusercontent.com/projectcalico/calico/v3.29.0/manifests/tigera-operator.yaml
wget -O calico_manifests/custom-resources.yaml https://raw.githubusercontent.com/projectcalico/calico/v3.29.0/manifests/custom-resources.yaml
sed -i 's,cidr: 192.168.0.0/16,cidr: 10.131.0.0/16,g' calico_manifests/custom-resources.yaml
sed -i 's,VXLANCrossSubnet,VXLAN,g' calico_manifests/custom-resources.yaml
kubectl --context cluster2 apply -f calico_manifests/custom-resources.yaml
# Deploy Submariner
subctl deploy-broker --context cluster1
subctl join --context cluster1 broker-info.subm --clusterid cluster1 --natt=false
subctl join --context cluster2 broker-info.subm --clusterid cluster2 --natt=false
# check Submariner inter-cluster tunnels status
subctl show connections --context cluster2
subctl show connections --context cluster1
# Verify inter-cluster connectivity
kubectl --context cluster2 create deployment nginx --image=nginx
kubectl --context cluster2 expose deployment nginx --port=80
subctl export service --context cluster2 --namespace default nginx
# Run nettest pod on cluster1 to access the nginx service
kubectl --context cluster1 -n default run tmp-shell --rm -i --tty --image quay.io/submariner/nettest -- /bin/bash
I want to test Submariner on regular K8s, OCP, etc., to establish cross-cluster connectivity and see if that works, so I will continue my research on different machines and flavors of K8s. But now, at least, I can test it on Kind using Calico. |
Did updating the fs.inotify.max_user_watches and fs.inotify.max_user_instances values fix things in your environment? Sure, we can update the docs if needed. Please let me know how it goes with Submariner testing on non-Kind clusters. |
I ran
when following the steps in this doc as well, but I was getting the error below
|
I have created two K8s clusters with the Pod and Service CIDRs below
These clusters use Calico as the CNI
Since the CIDRs overlap, we need to use GlobalNet
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: ippools.crd.projectcalico.org
spec:
  conversion:
    strategy: None
  group: crd.projectcalico.org
  names:
    kind: IPPool
    listKind: IPPoolList
    plural: ippools
    singular: ippool
  scope: Cluster
  versions:
  - name: v1
    schema:
      openAPIV3Schema:
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: IPPoolSpec contains the specification for an IPPool resource.
            properties:
              allowedUses:
                description: AllowedUse controls what the IP pool will be used for. If
                  not specified or empty, defaults to ["Tunnel", "Workload"] for back-compatibility
                items:
                  type: string
                type: array
              blockSize:
                description: The block size to use for IP address assignments from
                  this pool. Defaults to 26 for IPv4 and 122 for IPv6.
                type: integer
              cidr:
                description: The pool CIDR.
                type: string
              disableBGPExport:
                description: 'Disable exporting routes from this IP Pool''s CIDR over
                  BGP. [Default: false]'
                type: boolean
              disabled:
                description: When disabled is true, Calico IPAM will not assign addresses
                  from this pool.
                type: boolean
              ipip:
                description: 'Deprecated: this field is only used for APIv1 backwards
                  compatibility. Setting this field is not allowed, this field is
                  for internal use only.'
                properties:
                  enabled:
                    description: When enabled is true, ipip tunneling will be used
                      to deliver packets to destinations within this pool.
                    type: boolean
                  mode:
                    description: The IPIP mode. This can be one of "always" or "cross-subnet". A
                      mode of "always" will also use IPIP tunneling for routing to
                      destination IP addresses within this pool. A mode of "cross-subnet"
                      will only use IPIP tunneling when the destination node is on
                      a different subnet to the originating node. The default value
                      (if not specified) is "always".
                    type: string
                type: object
              ipipMode:
                description: Contains configuration for IPIP tunneling for this pool.
                  If not specified, then this is defaulted to "Never" (i.e. IPIP tunneling
                  is disabled).
                type: string
              nat-outgoing:
                description: 'Deprecated: this field is only used for APIv1 backwards
                  compatibility. Setting this field is not allowed, this field is
                  for internal use only.'
                type: boolean
              natOutgoing:
                description: When natOutgoing is true, packets sent from Calico networked
                  containers in this pool to destinations outside of this pool will
                  be masqueraded.
                type: boolean
              nodeSelector:
                description: Allows IPPool to allocate for a specific node by label
                  selector.
                type: string
              vxlanMode:
                description: Contains configuration for VXLAN tunneling for this pool.
                  If not specified, then this is defaulted to "Never" (i.e. VXLAN
                  tunneling is disabled).
                type: string
            required:
            - cidr
            type: object
        type: object
    served: true
    storage: true
status:
  acceptedNames:
    kind: IPPool
    listKind: IPPoolList
    plural: ippools
    singular: ippool
  conditions:
  - lastTransitionTime: "2025-01-06T04:44:50Z"
    message: no conflicts found
    reason: NoConflicts
    status: "True"
    type: NamesAccepted
  - lastTransitionTime: "2025-01-06T04:44:50Z"
    message: the initial names have been accepted
    reason: InitialNamesAccepted
    status: "True"
    type: Established
  storedVersions:
  - v1
So we need to use IPPool CR with API Group |
@yboaron I have tried Submariner directly on a K8s cluster with Calico and GlobalNet. I installed the Calico API server so that
Then tested Submariner
subctl diagnose all
@yboaron could you please advise? |
Subctl gather |
Working Kind Cluster subctl gather |
It seems the working cluster uses VXLAN encapsulation between nodes, while the one with the issue uses IP-in-IP mode. @yboaron, do we have an issue with IPIP mode when using Calico? |
Yep, we have tested Submariner with Calico using VxLAN encapsulation. |
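When comparing a working cluster with a failing one, it can help to reduce each IPPool to the encapsulation it implies. The following is a rough sketch, assuming the pool's `ipipMode` and `vxlanMode` strings have already been read (for example from `kubectl get ippools -o yaml`); `encapsulation` is a hypothetical helper, and treating an empty value as "Never" follows the defaults described in the IPPool CRD.

```go
package main

import "fmt"

// encapsulation maps an IPPool's ipipMode and vxlanMode to a short label.
// An empty mode is treated as "Never", matching the CRD's documented default.
// Any other value (for example "Always" or "CrossSubnet") counts as enabled.
func encapsulation(ipipMode, vxlanMode string) string {
	enabled := func(mode string) bool { return mode != "" && mode != "Never" }
	switch {
	case enabled(ipipMode) && enabled(vxlanMode):
		// Reported as-is; this sketch does not judge whether the pool is valid.
		return "IPIP+VXLAN"
	case enabled(vxlanMode):
		return "VXLAN"
	case enabled(ipipMode):
		return "IPIP"
	default:
		return "none"
	}
}

func main() {
	fmt.Println(encapsulation("Never", "Always")) // pool with VXLAN between nodes
	fmt.Println(encapsulation("Always", "Never")) // pool with IP-in-IP between nodes
	fmt.Println(encapsulation("", ""))            // modes unset
}
```

A pool with `ipipMode: Never` and `vxlanMode: Always` maps to VXLAN, which is the combination reported as tested here.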
Does it look correct?
Aswin 🔥🔥🔥 $ kubectl get ippools -A --kubeconfig rohan -o yaml
apiVersion: v1
items:
- apiVersion: projectcalico.org/v3
  kind: IPPool
  metadata:
    creationTimestamp: "2025-01-06T04:45:09Z"
    name: default-ipv4-ippool
    resourceVersion: "310349"
    uid: 29a06c8c-47de-4ef7-80a0-fd7bf99624ec
  spec:
    allowedUses:
    - Workload
    - Tunnel
    blockSize: 26
    cidr: 192.168.0.0/16
    ipipMode: Never
    natOutgoing: true
    nodeSelector: all()
    vxlanMode: Always
- apiVersion: projectcalico.org/v3
  kind: IPPool
  metadata:
    creationTimestamp: "2025-01-07T11:01:45Z"
    labels:
      submariner.io/ippool: "true"
    name: submariner-shibu-244.1.0.0-16
    resourceVersion: "315166"
    uid: 8bcb804d-24fc-427a-a29d-7b8fcb3b3907
  spec:
    allowedUses:
    - Workload
    - Tunnel
    blockSize: 26
    cidr: 244.1.0.0/16
    disableBGPExport: true
    disabled: true
    ipipMode: Never
    nodeSelector: all()
    vxlanMode: Never
kind: List
metadata:
  resourceVersion: ""
Aswin 🔥🔥🔥 $ kubectl get ippools -A --kubeconfig shibu -o yaml
apiVersion: v1
items:
- apiVersion: projectcalico.org/v3
  kind: IPPool
  metadata:
    creationTimestamp: "2025-01-06T04:57:58Z"
    name: default-ipv4-ippool
    resourceVersion: "351735"
    uid: 333e2785-699e-495e-87f4-e4783e6eab6c
  spec:
    allowedUses:
    - Workload
    - Tunnel
    blockSize: 26
    cidr: 192.168.0.0/16
    ipipMode: Never
    natOutgoing: true
    nodeSelector: all()
    vxlanMode: Always
- apiVersion: projectcalico.org/v3
  kind: IPPool
  metadata:
    creationTimestamp: "2025-01-07T11:01:47Z"
    labels:
      submariner.io/ippool: "true"
    name: submariner-rohan-244.0.0.0-16
    resourceVersion: "355951"
    uid: 8935212e-62bd-4949-9295-a7f07b5d3f53
  spec:
    allowedUses:
    - Workload
    - Tunnel
    blockSize: 26
    cidr: 244.0.0.0/16
    disableBGPExport: true
    disabled: true
    ipipMode: Never
    nodeSelector: all()
    vxlanMode: Never
kind: List
metadata:
  resourceVersion: ""
|
We need to change the CNI configuration, not just the IP pools. |
Since this setup is no longer in use and we cannot validate whether changing the encapsulation solves the problem, shall we close this issue? Please feel free to reopen it if the problem occurs again. |
What happened:
I deployed Submariner with GlobalNet across two Kubernetes clusters. DNS resolution works as expected, but connectivity to services across clusters fails with a Host is unreachable error.
More info is available at the link below:
https://kubernetes.slack.com/archives/C010RJV694M/p1730390376380879
What you expected to happen:
curl requests from a pod in cluster2 to a service exposed via Submariner in cluster1 should succeed, indicating that cross-cluster communication is functioning.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
(subctl diagnose all):
Cluster 1 info
Cluster 2 info
(subctl gather):
sub1.zip
sub2.zip
K8s is installed on an Ubuntu VM
OS INFO