Version of Cadence server and client (which language)
This is very important to root cause bugs.
Server version: ubercadence/server:0.19.1
Client language: Go
Describe the bug
The Cadence server is not able to refresh the domain cache after the domain of the Cassandra pods changes.
To Reproduce
Is the issue reproducible?
Yes
Steps to reproduce the behaviour:
Start the Cadence server along with the Cassandra DB.
Rotate the Cassandra pods (for example, because of a pod issue).
Cadence now logs the following error: {"level":"error","msg":"Error refreshing domain cache","service":"cadence-frontend","error":"gocql: no hosts available in the pool","logging-call-at":"domainCache.go:401","stacktrace":"github.com/uber/cadence/common/log/loggerimpl.(*loggerImpl).Error\n\t/cadence/common/log/loggerimpl/logger.go:134\ngithub.com/uber/cadence/common/cache.(*domainCache).refreshLoop\n\t/cadence/common/cache/domainCache.go:401"}
After this, Cadence is not able to connect to the Cassandra pods, because the domain of the Cassandra pods changed during the pod rotation.
Expected behaviour
When the Cassandra pods are rotated, the domain cache in Cadence should be updated.
Screenshots
Logs - {"level":"error","ts":"2022-08-09T10:06:26.332Z","msg":"Operation failed with internal error.","service":"cadence-frontend","error":"gocql: no hosts available in the pool","metric-scope":42,"logging-call-at":"persistenceMetricClients.go:812","stacktrace":"github.com/uber/cadence/common/log/loggerimpl.(*loggerImpl).Error\n\t/cadence/common/log/loggerimpl/logger.go:134\ngithub.com/uber/cadence/common/persistence.(*metadataPersistenceClient).updateErrorMetric\n\t/cadence/common/persistence/persistenceMetricClients.go:812\ngithub.com/uber/cadence/common/persistence.(*metadataPersistenceClient).GetMetadata\n\t/cadence/common/persistence/persistenceMetricClients.go:790\ngithub.com/uber/cadence/common/cache.(*domainCache).refreshDomainsLocked\n\t/cadence/common/cache/domainCache.go:425\ngithub.com/uber/cadence/common/cache.(*domainCache).refreshDomains\n\t/cadence/common/cache/domainCache.go:412\ngithub.com/uber/cadence/common/cache.(*domainCache).refreshLoop\n\t/cadence/common/cache/domainCache.go:396"}
{"level":"error","ts":"2022-08-09T10:06:26.332Z","msg":"Error refreshing domain cache","service":"cadence-frontend","error":"gocql: no hosts available in the pool","logging-call-at":"domainCache.go:401","stacktrace":"github.com/uber/cadence/common/log/loggerimpl.(*loggerImpl).Error\n\t/cadence/common/log/loggerimpl/logger.go:134\ngithub.com/uber/cadence/common/cache.(*domainCache).refreshLoop\n\t/cadence/common/cache/domainCache.go:401"}
It looks like the query is failing at the storage layer. How do the metrics for your Cassandra cluster (or whatever storage you are using) look? You might need to scale your storage up or out.
Apart from this, just to check whether your storage is running at all: are you able to run workflows?