cassandra exporter (0.9.12) still do not export metrics of cassandra 4.0.5 #109

Open
mindaugaszilionis opened this issue Feb 17, 2023 · 8 comments

@mindaugaszilionis

Hi, I tried replacing the 0.9.10 cassandra exporter agent with the newly released 0.9.12 version. I don't see any difference: metrics are still "loading" for ages and never open, just like with the previous version of the exporter.

@johndelcastillo

Hi, thanks for the report.

Are you running the agent or standalone version?

When you say "loading", where exactly are you seeing that? In Prometheus?

Are you able to query the metrics api directly on the node and get any results?
E.g: http://localhost:9500/metrics

Cheers
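
The direct check suggested above can be given a time limit, so that a hung exporter shows up as an explicit failure instead of an indefinite wait, and the response can be checked for actual samples. A minimal sketch, assuming curl is installed and the exporter listens on port 9500 as in the URL above; `count_samples` is a hypothetical helper name, not part of cassandra-exporter:

```shell
# count_samples: reads Prometheus text exposition on stdin and prints the
# number of sample lines (everything that is not blank or a "#" comment).
# Hypothetical helper for illustration; it is not part of cassandra-exporter.
count_samples() {
  # grep -c exits non-zero when the count is 0, so mask that with "|| true".
  grep -c -v -e '^#' -e '^[[:space:]]*$' || true
}

# Usage on the Cassandra node (--max-time makes curl give up after 30 seconds
# instead of waiting indefinitely):
#   curl -sS --max-time 30 http://localhost:9500/metrics | count_samples
```

A result of 0, or a curl timeout error, suggests the exporter accepted the connection but did not deliver any metrics.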

@mindaugaszilionis
Author

Hi, I use the agent version and load it with:
JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/cassandra-exporter-agent-0.9.12.jar"

When I try to load the metrics, the request takes ages and never responds (after a few minutes I just cancel it):

[root@***]# wget localhost:9500/metrics
--2023-03-08 14:18:56-- http://localhost:9500/metrics
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:9500... failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:9500... connected.
HTTP request sent, awaiting response...

The process looks like this:
cassand+ 1795408 1 99 Feb28 ? 12-04:37:15 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.332.b09-2.el8_6.x86_64/jre/bin/java -ea -da:net.openhft... -XX:+UseThreadPriorities -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+UseTLAB -XX:+ResizeTLAB -XX:+UseNUMA -XX:+PerfDisableSharedMem -Djava.net.preferIPv4Stack=true -Xms64G -Xmx64G -XX:ThreadPriorityPolicy=42 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSWaitDuration=10000 -XX:+CMSParallelInitialMarkEnabled -XX:+CMSEdenChunksRecordAlways -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M -Xloggc:/var/log/cassandra/gc.log -Xmn2048M -XX:+UseCondCardMark -XX:CompileCommandFile=/etc/cassandra/conf/hotspot_compiler -javaagent:/usr/share/cassandra/lib/jamm-0.3.2.jar -Dcassandra.jmx.remote.port=7199 -Dcom.sun.management.jmxremote.rmi.port=7199 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password -Dcom.sun.management.jmxremote.access.file=/etc/cassandra/jmxremote.access -Djava.library.path=/usr/share/cassandra/lib/sigar-bin -javaagent:/usr/share/cassandra/lib/cassandra-exporter-agent-0.9.12.jar -XX:OnOutOfMemoryError=kill -9 %p -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp 
/etc/cassandra/conf:/usr/share/cassandra/lib/airline-0.8.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/asm-7.1.jar:/usr/share/cassandra/lib/caffeine-2.5.6.jar:/usr/share/cassandra/lib/cassandra-driver-core-3.11.0-shaded.jar:/usr/share/cassandra/lib/cassandra-exporter-agent-0.9.12.jar:/usr/share/cassandra/lib/chronicle-bytes-2.20.111.jar:/usr/share/cassandra/lib/chronicle-core-2.20.126.jar:/usr/share/cassandra/lib/chronicle-queue-5.20.123.jar:/usr/share/cassandra/lib/chronicle-threads-2.20.111.jar:/usr/share/cassandra/lib/chronicle-wire-2.20.117.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.9.jar:/usr/share/cassandra/lib/commons-lang3-3.11.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/concurrent-trees-2.4.0.jar:/usr/share/cassandra/lib/ecj-4.6.1.jar:/usr/share/cassandra/lib/guava-27.0-jre.jar:/usr/share/cassandra/lib/HdrHistogram-2.1.9.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/hppc-0.8.1.jar:/usr/share/cassandra/lib/j2objc-annotations-1.3.jar:/usr/share/cassandra/lib/jackson-annotations-2.13.2.jar:/usr/share/cassandra/lib/jackson-core-2.13.2.jar:/usr/share/cassandra/lib/jackson-databind-2.13.2.2.jar:/usr/share/cassandra/lib/jamm-0.3.2.jar:/usr/share/cassandra/lib/java-cup-runtime-11b-20160615.jar:/usr/share/cassandra/lib/javax.inject-1.jar:/usr/share/cassandra/lib/jbcrypt-0.4.jar:/usr/share/cassandra/lib/jcl-over-slf4j-1.7.25.jar:/usr/share/cassandra/lib/jcommander-1.30.jar:/usr/share/cassandra/lib/jctools-core-3.1.0.jar:/usr/share/cassandra/lib/jflex-1.8.2.jar:/usr/share/cassandra/lib/jna-5.6.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/jvm-attach-api-1.5.jar:/usr/share/cassandra/lib/log4j-over-slf4j-1.7.25.jar:/usr/share/cassandra/lib/logback-classic-1.2.9.jar:/usr/share/cassandra/lib/logback-core-1.2.9.jar:/usr/share/cassandra/lib/lz4-java-1.8.0.jar:/usr/share/cassandra/lib/metrics-core-3.1.5.
jar:/usr/share/cassandra/lib/metrics-jvm-3.1.5.jar:/usr/share/cassandra/lib/metrics-logback-3.1.5.jar:/usr/share/cassandra/lib/mxdump-0.14.jar:/usr/share/cassandra/lib/netty-all-4.1.58.Final.jar:/usr/share/cassandra/lib/netty-tcnative-boringssl-static-2.0.36.Final.jar:/usr/share/cassandra/lib/ohc-core-0.5.1.jar:/usr/share/cassandra/lib/ohc-core-j8-0.5.1.jar:/usr/share/cassandra/lib/psjava-0.1.19.jar:/usr/share/cassandra/lib/reporter-config3-3.0.3.jar:/usr/share/cassandra/lib/reporter-config-base-3.0.3.jar:/usr/share/cassandra/lib/sigar-1.6.4.jar:/usr/share/cassandra/lib/sjk-cli-0.14.jar:/usr/share/cassandra/lib/sjk-core-0.14.jar:/usr/share/cassandra/lib/sjk-json-0.14.jar:/usr/share/cassandra/lib/sjk-stacktrace-0.14.jar:/usr/share/cassandra/lib/slf4j-api-1.7.25.jar:/usr/share/cassandra/lib/snakeyaml-1.26.jar:/usr/share/cassandra/lib/snappy-java-1.1.2.6.jar:/usr/share/cassandra/lib/snowball-stemmer-1.3.0.581.1.jar:/usr/share/cassandra/lib/ST4-4.0.8.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/zstd-jni-1.5.0-4.jar:/usr/share/cassandra/apache-cassandra-4.0.5.jar:/usr/share/cassandra/fqltool.jar:/usr/share/cassandra/stress.jar: org.apache.cassandra.service.CassandraDaemon

@st-gra

st-gra commented Jun 5, 2023

@mindaugaszilionis @johndelcastillo Were you able to find a solution for this? I am currently facing the same issue in my environment.

@itskarlsson

itskarlsson commented Sep 15, 2023

The problem here is that the current exporter versions do not work with Cassandra 4.0.x. You get a stack trace in system.log and the metrics request hangs forever. I created a quick patch to make it work in #114.

@mindaugaszilionis
Author

mindaugaszilionis commented Oct 6, 2023

Actually, the same problem occurs with Cassandra 4.1.3.
The Cassandra system log shows this error:
WARN [prometheus-netty-pool-0] 2023-10-06 14:51:27,373 DefaultChannelPipeline.java:1152 - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.nio.BufferOverflowException: null
at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:222)
at com.zegelin.prometheus.exposition.NioExpositionSink.writeBytes(NioExpositionSink.java:27)
at com.zegelin.prometheus.exposition.text.TextFormatMetricFamilyWriter$MetricVisitor.writeLabels(TextFormatMetricFamilyWriter.java:111)
at com.zegelin.prometheus.exposition.text.TextFormatMetricFamilyWriter$MetricVisitor.writeLabelSets(TextFormatMetricFamilyWriter.java:129)
at com.zegelin.prometheus.exposition.text.TextFormatMetricFamilyWriter$MetricVisitor.writeMetric(TextFormatMetricFamilyWriter.java:141)
at com.zegelin.prometheus.exposition.text.TextFormatMetricFamilyWriter$MetricVisitor.lambda$visit$4(TextFormatMetricFamilyWriter.java:181)
at com.zegelin.prometheus.exposition.text.TextFormatMetricFamilyWriter$MetricVisitor.lambda$metricW
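
To check whether another node is hitting the same exception, the log can be searched for it. A minimal sketch, assuming the log is at /var/log/cassandra/system.log (inferred from the -Dcassandra.logdir setting in the process listing earlier in this thread); `count_overflows` is a hypothetical helper name:

```shell
# count_overflows FILE: prints how many lines of FILE mention the
# BufferOverflowException seen in the stack trace above.
# Hypothetical helper for illustration only.
count_overflows() {
  # -F matches the string literally; "|| true" masks grep's non-zero exit
  # status when nothing matches.
  grep -c -F 'java.nio.BufferOverflowException' "$1" || true
}

# Usage (the path is an assumption based on -Dcassandra.logdir):
#   count_overflows /var/log/cassandra/system.log
# To see the frames that follow each hit:
#   grep -n -F -A 8 'java.nio.BufferOverflowException' /var/log/cassandra/system.log
```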

@mindaugaszilionis
Author

Something changed when I set rpc_address: 0.0.0.0 in cassandra.yaml. Now I get:
[root@l160c-cass-c6n1 ~]# wget localhost:9500/metrics
--2023-10-06 17:07:10-- http://localhost:9500/metrics
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:9500... failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:9500... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘metrics.1’

metrics.1 [ <=> ] 0 --.-KB/s in 0s

2023-10-06 17:07:10 (0.00 B/s) - Read error at byte 0 (Success).Retrying.

--2023-10-06 17:07:11-- (try: 2) http://localhost:9500/metrics
Connecting to localhost (localhost)|127.0.0.1|:9500... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘metrics.1’

metrics.1
Some progress, but the metrics are still not available.

@sonman

sonman commented Oct 28, 2023

The release from edgelaborities has fixed the issue for me. (AFAIK because they just merged #84)
See also #83.

@mindaugaszilionis
Author

As a workaround, cassandra exporter started to work once table metrics were disabled:
JVM_OPTS="$JVM_OPTS -javaagent:/usr/share/cassandra/lib/cassandra-exporter-agent-0.9.14.jar=--table-metrics=NONE"
The root cause may be that we have lots of tables, hundreds of them.
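
To compare nodes on the table-count hypothesis above, the number of tables can be read from `nodetool tablestats`. A rough sketch, assuming each table in that output is introduced by a line of the form "Table: <name>" or "Table (index): <name>" (worth verifying against your Cassandra version); `count_tables` is a hypothetical helper name:

```shell
# count_tables: reads `nodetool tablestats` output on stdin and prints the
# number of tables listed, including secondary index tables.
# Assumes each table starts with "Table: <name>" or "Table (index): <name>".
count_tables() {
  # The leading anchor keeps lines such as "SSTable count: 1" from matching.
  grep -c -E '^[[:space:]]*Table( \(index\))?: ' || true
}

# Usage on the Cassandra node:
#   nodetool tablestats | count_tables
```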
