Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(reports): implement report generation #51

Merged
merged 14 commits into from
Aug 23, 2023
Merged

feat(reports): implement report generation #51

merged 14 commits into from
Aug 23, 2023

Conversation

andrewazores
Copy link
Member

@andrewazores andrewazores commented Aug 11, 2023

Re-implements report generation for active and archived recordings. There is a cache, but it does not proactively invalidate entries - active recording reports will stay in the cache until they expire, even if the source recording is deleted or the source target disappears. Likewise, archived recording reports will stay in the cache until they expire, even if the source archived recording is deleted. These cases are still WIP and will probably need to be handled by listening for these events on the internal EventBus. The cache entries will eventually time out and expire, and the cache entries are keyed on unique IDs, so stale cache entries should not cause invalid data to be returned, they will only end up temporarily wasting some memory.

HTML report generation is intentionally not re-implemented. The expectation is that the web-client will stop rendering the pre-formatted HTML reports and instead consume only the JSON form, so the API no longer provides HTML. With the current state of the web-client this simply results in unformatted raw JSON being presented in the views that previously would render pre-formatted HTML.

Currently, due to complications with the caching strategy (particularly for active recordings), the query parameter for specifying a report predicate is also not implemented. The full report JSON will be generated on every request, and the full report cached, and the full report returned to the client, regardless of whether the client asks for only a subset of evaluations. I am still deciding what to do in this case - at the least, only the requested matching evaluations should be returned to the client to minimize network utilization on the report transfer, but I'm not quite sure how or if I want to handle "partial evaluations" of the report.

TODO: refactoring to extract some S3 logic now that it's starting to proliferate around the codebase.

@andrewazores andrewazores added the feat New feature or request label Aug 11, 2023
@andrewazores andrewazores changed the title WIP feat(reports): implement report generation feat(reports): implement report generation Aug 11, 2023
@andrewazores
Copy link
Member Author

@aali309 @mwangggg review as well please

Copy link
Member

@maxcao13 maxcao13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good, I'll look at functionality a bit later.

Copy link
Member

@maxcao13 maxcao13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On an archive file:

    {
        "archivedTime": 1692745013,
        "downloadUrl": "/api/v3/download/Y3d6NGI0VkRYcEFlaWdPWklaSUlRTUZlOXFEWl9Wc2lfLVVEMk94akJaUT0vY29tcG9zZS1zYW1wbGUtYXBwLTFfYXNkZl8yMDIzMDgyMlQyMjU2NTJaLmpmcg",
        "metadata": {
            "labels": {
                "template.name": "Continuous",
                "template.type": "TARGET"
            }
        },
        "name": "compose-sample-app-1_asdf_20230822T225652Z.jfr",
        "reportUrl": "/api/v3/reports/Y3d6NGI0VkRYcEFlaWdPWklaSUlRTUZlOXFEWl9Wc2lfLVVEMk94akJaUT0vY29tcG9zZS1zYW1wbGUtYXBwLTFfYXNkZl8yMDIzMDgyMlQyMjU2NTJaLmpmcg",
        "size": 20971520
    }

I tried downloading the archived jfr file like so:

$ http -dF http://localhost:8181/api/v3/download/Y3d6NGI0VkRYcEFlaWdPWklaSUlRTUZlOXFEWl9Wc2lfLVVEMk94akJaUT0vY29tcG9zZS1zYW1wbGUtYXBwLTFfYXNkZl8yMDIzMDgyMlQyMjU2NTJaLmpmcg --auth=user:pass

but after validating it jfr, I get jfr print: could not read recording at /home/mcao/workspace/cryostat3/jfr.jfr. Not a Flight Recorder file
I also tried with JMC and that didn't work either.

Next I tried the report.

❯ http -F http://localhost:8181/api/v3/reports/Y3d6NGI0VkRYcEFlaWdPWklaSUlRTUZlOXFEWl9Wc2lfLVVEMk94akJaUT0vY29tcG9zZS1zYW1wbGUtYXBwLTFfYXNkZl8yMDIzMDgyMlQyMjU2NTJaLmpmcg --auth=user:pass 
HTTP/1.1 200 OK
Cache-Control: no-cache
Content-Type: application/json;charset=UTF-8
content-encoding: gzip
content-length: 147

{
    "class org.openjdk.jmc.flightrecorder.internal.InvalidJfrFileException": {
        "description": null,
        "name": "InvalidJfrFileException",
        "score": -1.0,
        "topic": "report_failure"
    }
}

Not sure if that's right...

@andrewazores
Copy link
Member Author

andrewazores commented Aug 23, 2023

Sounds like the file probably got corrupted (truncated or something or re-encoded wrong) on its way into the archives. If you are running the smoketest, open up localhost:9000 to get to the Minio S3 console and try downloading the JFR file directly from the archive bucket. This way you're getting the JFR file directly as it exists in storage and you can test if the broken functionality is in the report generation or downloading streams, or if it's actually a problem with how the recording got archived to begin with.

@andrewazores
Copy link
Member Author

Interesting - I was just able to reproduce that problem on the first try using the same sample-app target as a source, but if I try the same thing on one of the recordings that was pushed into the archives by the agent instance then it works fine.

@andrewazores
Copy link
Member Author

I also have the same failure using the jfr-datasource instance as a target, but all the ones pushed by the agent instance seem to be fine. So I think this is actually a bug in how the Cryostat server is retrieving JFR files over JMX and copying them into the archives.

@andrewazores
Copy link
Member Author

Yep, that must be it. Both "view in grafana" (jfr-datasource processing) and report generation work on an active recording started in ex. cryostat3 itself, but if I copy that same recording to the archives, then both features fail.

@andrewazores
Copy link
Member Author

Yep, the copying into archives definitely appears broken. I pulled the file straight back out of Minio and it's still unreadable by jfr. Comparing hex dumps of both files (from Minio console direct vs pulled through the Cryostat presigned URL API) shows they're identical.

@andrewazores
Copy link
Member Author

Think I solved it - just some dumb improper handling of the in-transit copy buffer.

@maxcao13
Copy link
Member

Reports look good now:

{
    "Allocations.class": {
        "description": "",
        "name": "Allocated Classes",
        "score": -1.0,
        "topic": "heap"
    },
    "Allocations.thread": {
        "description": "",
        "name": "Threads Allocating",
        "score": -1.0,
        "topic": "java_application"
    }
...

and jfr looks good too:

jfr summary download.jfr 

 Version: 2.1
 Chunks: 1
 Start: 2023-08-23 19:37:48 (UTC)
 Duration: 0 s

 Event Type                                   Count  Size (bytes) 
==================================================================
 jdk.ModuleExport                               668          7179
...

@andrewazores andrewazores merged commit ef64b3a into cryostatio:main Aug 23, 2023
3 of 5 checks passed
@andrewazores andrewazores deleted the report-generation branch August 23, 2023 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat New feature or request
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants