There's currently a conflict between the Pulsar k8s deployment and how Pulsar load balancing works.
When a Pulsar broker starts, it registers itself as a broker in the internal Pulsar load balancer. The load balancer might immediately assign new namespace bundles to the broker, and the topics in those bundles might immediately get requests.
The conflict is that, with the current settings, DNS resolution for the broker's host name fails until the broker's readiness probe succeeds.
Pulsar might already return the hostname of a specific broker to a client, but the client cannot resolve the DNS name since the broker's readiness probe hasn't passed. This causes extra delays and also bugs when connecting to topics after a load balancing event. Pulsar clients usually back off and retry. For Admin API HTTP requests, clients might not handle errors properly; for example, Pulsar Proxy will fail the request when there's a DNS lookup issue.
Solution:
The broker StatefulSet's service should use publishNotReadyAddresses: true
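A minimal sketch of what such a service could look like, assuming a headless governing service for the broker StatefulSet. The service name and the selector label are hypothetical placeholders, not values taken from the chart; the ports are the Pulsar defaults for the binary protocol and the HTTP admin API.

```yaml
# Sketch only: metadata.name and the selector label are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: pulsar-broker            # hypothetical; must match the StatefulSet's serviceName
spec:
  clusterIP: None                # headless: gives each broker pod its own stable DNS record
  publishNotReadyAddresses: true # publish pod DNS records before the readiness probe passes
  selector:
    component: broker            # hypothetical label; use the labels on the broker pods
  ports:
    - name: pulsar
      port: 6650                 # Pulsar binary protocol (default)
    - name: http
      port: 8080                 # Admin API over HTTP (default)
```

With this setting, the per-pod hostname resolves as soon as the pod has an IP, so a client that is redirected to a freshly started broker gets a connection error it can retry instead of a DNS lookup failure.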
There's useful information about StatefulSets and the publishNotReadyAddresses setting in k8ssandra/cass-operator#18
There's an alternative solution in #198 which is fine for cases where TLS is disabled for brokers. Stable hostnames are required when using TLS to be able to do hostname verification for the certificates.
As an experiment, I added a new service and made the broker STS use it: 259341c
The problem is that it's not possible to change the serviceName for a STS:
Error: UPGRADE FAILED: cannot patch "pulsar-testenv-pulsar-broker" with kind StatefulSet: StatefulSet.apps "pulsar-testenv-pulsar-broker" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden
We would like to have 2 services for the broker STS:
1 service that uses publishNotReadyAddresses: true
another service that doesn't use publishNotReadyAddresses: true. This would be used to redirect traffic hitting the service only to brokers that pass the readiness probe.
It doesn't seem to be possible to keep backwards compatibility for existing deployments with the above requirements.
@lhotari To support the upgrade path, can you switch the purpose of the services? That way you don't have to modify the StatefulSet: use the existing name for the service that does use the publishNotReadyAddresses: true setting, and add a new service that doesn't. The proxy should point to the service that only routes traffic if the broker is ready, so that the proxy doesn't send traffic to a broker that can't handle it.
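A rough sketch of that arrangement, assuming the existing service is the headless one referenced by the StatefulSet's serviceName. Both service names and the selector label are hypothetical; only the publishNotReadyAddresses field and the default Pulsar ports are taken as given.

```yaml
# Sketch only: names and labels are assumptions, not the chart's actual values.
# Existing service: name unchanged, so the StatefulSet's serviceName stays valid.
apiVersion: v1
kind: Service
metadata:
  name: pulsar-broker              # hypothetical existing name
spec:
  clusterIP: None                  # headless, used for per-broker hostnames
  publishNotReadyAddresses: true   # hostnames resolve before readiness passes
  selector:
    component: broker              # hypothetical label
  ports:
    - name: pulsar
      port: 6650
    - name: http
      port: 8080
---
# New service: only routes to brokers that pass the readiness probe.
apiVersion: v1
kind: Service
metadata:
  name: pulsar-broker-ready        # hypothetical new name; the proxy would point here
spec:
  type: ClusterIP
  # publishNotReadyAddresses is left at its default (false)
  selector:
    component: broker              # same hypothetical label as above
  ports:
    - name: pulsar
      port: 6650
    - name: http
      port: 8080
```

Because only a new Service object is added and the StatefulSet spec is untouched, an upgrade along these lines should avoid the forbidden serviceName change shown in the error above.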
pgier pushed a commit to pgier/datastax-pulsar-helm-chart that referenced this issue on Jul 13, 2022