Metrics with default jvm.json not working #53

Closed
myllynen opened this issue Mar 20, 2018 · 6 comments

Comments

@myllynen
Contributor

With Java 8 on RHEL 7 and Parfait agent compiled from git master:

$ cd parfait.git
$ mvn clean package
$ java -javaagent:$(pwd)/parfait-agent/target/parfait-agent-jar-with-dependencies.jar KeyboardReader
[main] INFO io.pcp.parfait.pcp.PcpMonitorBridge - PCP monitoring bridge started for writer [PcpMmvWriter[byteBufferFactory=FileByteBufferFactory[file=/var/lib/pcp/tmp/mmv/KeyboardReader]]]
Enter a line of input
$ pmrep mmv   
Invalid metric mmv.KeyboardReader.java.memorypool.tenured (PM_ERR_INDOM Unknown or illegal instance domain identifier).

Only a very few metrics (e.g., mmv.KeyboardReader.java.jvm.classloader.loaded.*) are actually working.
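pmrep stops at the first invalid metric, so it does not show how many metrics are affected. One way to enumerate them is to probe each metric on its own. This is a minimal sketch, assuming pminfo(1) and pmval(1) are on the PATH and pmcd is running locally; the function name indom_failures is hypothetical and not part of PCP.

```shell
# List metrics under a prefix whose instance domain lookup fails.
# indom_failures is a hypothetical helper name, not a PCP tool.
indom_failures() {
    prefix="${1:-mmv}"
    # pminfo without options prints one metric name per line
    pminfo "$prefix" | while read -r metric; do
        [ -n "$metric" ] || continue
        # -s 1 requests a single sample; the indom error arrives on stderr
        if pmval -s 1 "$metric" 2>&1 </dev/null |
            grep -q 'Unknown or illegal instance domain identifier'; then
            printf '%s\n' "$metric"
        fi
    done
}
```

Calling `indom_failures mmv.KeyboardReader` would then print only the metric names that hit the indom error.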

@natoscott
Member

@myllynen could you send me a copy of /var/lib/pcp/tmp/mmv/KeyboardReader? Thanks.

@myllynen
Contributor Author

I double-checked everything and can't see anything else obviously wrong. This is with:

java-1.8.0-openjdk-headless-1.8.0.161-0.b14.el7_4.x86_64
pcp-3.12.2-5.el7.x86_64

In addition to the steps above, I've made /var/lib/pcp/tmp/mmv accessible to the user. Otherwise there are no local Java or PCP configuration changes, SELinux is permissive, and under /etc/parfait only jvm.json is present, identical to the current git master version.

KeyboardReader.gz

@myllynen
Contributor Author

I've now tested this on a few more systems, and it looks more and more like this is not a local issue: it can also be observed on a freshly installed Fedora 27 VM (with either PCP 4.0 or git master). The only system where this works is RHEL 7.4 with:

java-1.8.0-openjdk-headless-1.8.0.161-0.b14.el7_4.x86_64
pcp-3.12.2-4.el7.x86_64

@natoscott
Member

@myllynen I think this may be a pmrep/python/fetchgroup/pmdammv issue related to handling instance domains in a slightly unusual state.

pminfo -df mmv shows correct output using the KeyboardReader.gz MMV file. I suspect metrics like java.memorypool.tenured, java.memorypool.survivor, etc. have no values on this JVM version.

For reference, you can examine the generated KeyboardReader file directly (taking java/parfait totally out of the picture) using /var/lib/pcp/pmdas/mmv/mmvdump. This shows a valid MMV file with the usual set of default Java metrics. You can use qa/src/mmv_poke from the pcp repo to modify the file (esp. the PID) so that it becomes active from pmdammv(1)'s perspective. The metrics can then be inspected directly using pminfo and other client tools (again, this is just to isolate things from the java/Parfait aspects, which don't appear to be at issue here).
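A minimal sketch of the first step described above, assuming the attachment was saved as KeyboardReader.gz. The helper name inspect_mmv and the MMVDUMP override variable are hypothetical; only the mmvdump path comes from the comment above. The mmv_poke step is left out because its options are not shown in this thread.

```shell
# Decompress a saved MMV file and print its contents with mmvdump.
# MMVDUMP can be overridden if mmvdump lives elsewhere (assumption).
MMVDUMP="${MMVDUMP:-/var/lib/pcp/pmdas/mmv/mmvdump}"

inspect_mmv() {
    archive="$1"
    tmpfile="$(mktemp)" || return 1
    # Work on a scratch copy so the original attachment stays untouched
    if gunzip -c "$archive" > "$tmpfile"; then
        "$MMVDUMP" "$tmpfile"
        status=$?
    else
        status=1
    fi
    rm -f "$tmpfile"
    return "$status"
}
```

For example, `inspect_mmv KeyboardReader.gz` would dump the mapping without involving the JVM or pmcd.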

@myllynen
Contributor Author

myllynen commented Apr 4, 2018

Hmm, ok so it seems like there are two unrelated issues here:

  1. Handling the values seems to have changed; with 3.12.2 on RHEL 7.4:
[root@rhel-7-server /]# pminfo -dfmtT mmv.KeyboardReader.java.memorypool.tenured 

mmv.KeyboardReader.java.memorypool.tenured PMID: 70.2986.876 [Virtual memory size for tenured space]
    Data Type: 64-bit int  InDom: 70.2064236 0x119f7f6c
    Semantics: instant  Units: byte
Help:
Virtual memory size for tenured space
No value(s) available!
[root@rhel-7-server /]# pmval mmv.KeyboardReader.java.memorypool.tenured
pmval: pmGetInDom(70.2064236): Unknown or illegal instance domain identifier
[root@rhel-7-server /]# pmrep -s 2 mmv.KeyboardReader.java.memorypool.tenured
  m.K.j.m.tenured
             byte
              N/A
              N/A
[root@rhel-7-server /]# 

And with all versions since then (this is from Fedora 27 with PCP 4.0):

[root@localhost /]# pminfo -dfmtT mmv.KeyboardReader.java.memorypool.tenured

mmv.KeyboardReader.java.memorypool.tenured PMID: 70.2986.876 [Virtual memory size for tenured space]
    Data Type: 64-bit int  InDom: 70.2064236 0x119f7f6c
    Semantics: instant  Units: byte
Help:
Virtual memory size for tenured space
No value(s) available!
[root@localhost /]# pmval mmv.KeyboardReader.java.memorypool.tenured
pmval: pmGetInDom(70.2064236): Unknown or illegal instance domain identifier
[root@localhost /]# pmrep -s 2 mmv.KeyboardReader.java.memorypool.tenured
Invalid metric mmv.KeyboardReader.java.memorypool.tenured (PM_ERR_INDOM Unknown or illegal instance domain identifier).
[root@localhost /]# 

IOW, it looks like recent releases are consistent in how pmval, pmrep, etc. handle those metrics.

  2. The default jvm.json is in need of an update. We also have issue #49 (Add all supported JVM JMX metrics to default JSON file) and issue #57 (Fix default configuration for recent Java versions), so perhaps this could be covered as part of those. I'd suggest the default should work with OpenJDK 8 (the current Java LTS release and probably the most widely deployed Java version today). Having different JSON files for different JVM features/versions would probably make it easier to solve issues like these. If these metrics are part of some already EOL'd Java release, perhaps they could be provided in a non-default directory for those still needing them, so that users on supported Java versions would not need to deal with them.

Thanks.

@natoscott
Member

The root cause of the underlying indom issue here is now understood: it's a logic error in pmdammv. The fix will be in pcp-4.1.0 and later versions of PCP.
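To check whether an installed PCP is at or past pcp-4.1.0, a version comparison along these lines may help. This is a sketch only: the function name has_indom_fix is hypothetical, it relies on sort -V from GNU coreutils, and a distribution package with a lower version number could still carry a backported fix.

```shell
# Return success if the given PCP version is 4.1.0 or newer.
# has_indom_fix is a hypothetical helper name, not a PCP tool.
has_indom_fix() {
    fixed_in="4.1.0"
    # sort -V orders version strings; the first line is the lower version
    lowest="$(printf '%s\n%s\n' "$fixed_in" "$1" | sort -V | head -n 1)"
    [ "$lowest" = "$fixed_in" ]
}
```

On an RPM-based system this could be used as `has_indom_fix "$(rpm -q --qf '%{VERSION}' pcp)" && echo "fix expected"`.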

natoscott added a commit to performancecopilot/pcp that referenced this issue Jun 1, 2018
Several people (marko, lzap, tallpaul) have reported this
one, finally got to the bottom of it.  The symptoms are
"Unknown or illegal instance domain identifier" errors on
indom lookups, sometimes.  Root cause was a logic error
in pmdammv indom setup code incorrectly overwriting count
and offset local variables while parsing mappings.

To exercise the fix I've modernized qa/src/indom.c and
used it in new test qa/1422 to tickle the problem using
a canned MMV mapping which is known to expose it.

The original bug report (against Parfait) is this one:
performancecopilot/parfait#53