feat(triggers): Add Support for defining MBean Triggers to the cryostat agent #197

Merged
merged 37 commits into from
Oct 3, 2023
d455dc3
Initial implementation
Josh-Matsuoka Sep 12, 2023
7e341d2
Added support for trigger durations
Josh-Matsuoka Sep 19, 2023
14ca37b
Merge remote-tracking branch 'upstream/main' into triggers
Josh-Matsuoka Sep 19, 2023
ed6f4dd
Refactoring FlightRecorder code and cleanup
Josh-Matsuoka Sep 22, 2023
32c4f10
Reworking TriggerEvaluator into a Runnable, updating README
Josh-Matsuoka Sep 26, 2023
37e302c
Rebasing with upstream
Josh-Matsuoka Sep 27, 2023
0f97706
Addressing review feedback, fixing MbeanContext
Josh-Matsuoka Sep 27, 2023
41d441d
Merge branch 'main' into triggers
andrewazores Sep 27, 2023
b345782
handle empty args array
andrewazores Sep 27, 2023
75114f7
refactor
andrewazores Sep 27, 2023
74a8ad8
make evaluation period configurable
andrewazores Sep 27, 2023
8203b7f
formatting
andrewazores Sep 27, 2023
de4ca85
document smart trigger config option
andrewazores Sep 27, 2023
62246dd
cleanup
andrewazores Sep 27, 2023
c8852e0
fix hashcode/equals
andrewazores Sep 27, 2023
f3b081b
correct examples
andrewazores Sep 28, 2023
e188116
correct regex matching of template name/label
andrewazores Sep 28, 2023
cc8d9a2
log triggers when registered
andrewazores Sep 28, 2023
b360db5
mark fields volatile
andrewazores Sep 28, 2023
f24a08b
handle null field
andrewazores Sep 28, 2023
ec34491
style fixup
andrewazores Sep 28, 2023
c8f42e6
restore #195
andrewazores Sep 28, 2023
c022051
correct examples
andrewazores Sep 28, 2023
1fbec90
fix recording creation/start
andrewazores Sep 28, 2023
c70331d
handle exceptions, add debug logging
andrewazores Sep 28, 2023
e953012
add FIXME
andrewazores Sep 28, 2023
8f86a1b
add README note
andrewazores Sep 28, 2023
5a64522
log recording start
andrewazores Sep 28, 2023
2e7238f
remove unused client accessor
andrewazores Sep 28, 2023
83507f5
fix bad case in parsing
andrewazores Sep 29, 2023
79aef77
pass correct duration for evaluation
andrewazores Sep 29, 2023
6cc049e
refactoring, duration handling fixup
andrewazores Sep 29, 2023
7c2e713
Merge remote-tracking branch 'upstream/main' into triggers
andrewazores Sep 29, 2023
e6bbc4e
spotless
andrewazores Oct 2, 2023
cb626ca
duration parsing fixup
andrewazores Oct 2, 2023
478bed1
fixup! duration parsing fixup
andrewazores Oct 2, 2023
2fb3bac
fixup! fixup! duration parsing fixup
andrewazores Oct 2, 2023
38 changes: 37 additions & 1 deletion README.md
@@ -31,6 +31,41 @@ JAVA_OPTIONS="-Dcom.sun.management.jmxremote.port=9091 -Dcom.sun.management.jmxr
```
This assumes that the agent JAR has been included in the application image within `/deployments/app/`.

## SMART TRIGGERS

`cryostat-agent` supports Smart Triggers, which monitor the values of MBean counters and start recordings when a set of user-specified constraints is met.
The general form of a smart trigger expression is as follows:

```
[constraint1(&&/||)constraint2...constraintN]~recordingTemplate
```
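As a rough illustration of this shape, the sketch below splits one definition into its constraint expression and template name with a regular expression. The class name `TriggerDefinitionSketch` and the pattern are assumptions made for this example; they are not the agent's actual `TriggerParser`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not the agent's TriggerParser.
final class TriggerDefinitionSketch {
    // "[" constraint-expression "]" "~" template-name
    private static final Pattern DEFINITION = Pattern.compile("^\\[(.+)\\]~(\\S+)$");

    final String expression;
    final String template;

    private TriggerDefinitionSketch(String expression, String template) {
        this.expression = expression;
        this.template = template;
    }

    // Returns an empty Optional when the input does not have the expected shape.
    static Optional<TriggerDefinitionSketch> parse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        Matcher m = DEFINITION.matcher(raw.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new TriggerDefinitionSketch(m.group(1), m.group(2)));
    }

    public static void main(String[] args) {
        parse("[ProcessCpuLoad>0.2]~profile")
                .ifPresent(d -> System.out.println(d.expression + " -> " + d.template));
    }
}
```

The constraint expression is kept as an opaque string here; evaluating it is a separate concern.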

An example that monitors CPU usage and starts a recording using the Profiling template when `ProcessCpuLoad` exceeds 0.2 (that is, 20%, since the value is reported as a fraction between 0 and 1):

```
[ProcessCpuLoad>0.2]~profile
```
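To make the comparison concrete, the following sketch reads the `ProcessCpuLoad` attribute from the platform MBean server and checks it against a threshold. It assumes a HotSpot-style JVM that exposes this attribute on `java.lang:type=OperatingSystem`; the class and method names are hypothetical and this is not how the agent itself evaluates constraints.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical illustration of sampling one MBean attribute.
final class CpuLoadCheckSketch {
    // Returns the process CPU load as a fraction in [0.0, 1.0],
    // or a negative value when the attribute is unavailable.
    static double readProcessCpuLoad() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName os = new ObjectName("java.lang:type=OperatingSystem");
            Object value = server.getAttribute(os, "ProcessCpuLoad");
            return value instanceof Number ? ((Number) value).doubleValue() : -1.0;
        } catch (JMException e) {
            return -1.0;
        }
    }

    // A negative reading means "no data", so it never satisfies the constraint.
    static boolean exceeds(double value, double threshold) {
        return value >= 0 && value > threshold;
    }

    public static void main(String[] args) {
        double load = readProcessCpuLoad();
        System.out.println("ProcessCpuLoad=" + load + " exceeds 0.2? " + exceeds(load, 0.2));
    }
}
```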

An example that starts a recording using the Continuous template once the Thread Count has exceeded 20 for longer than 10 seconds:

```
[ThreadCount>20&&TargetDuration>duration("10s")]~Continuous
```
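One way to think about the `TargetDuration` constraint is as a timer that starts when the other conditions first hold and resets as soon as they stop holding. The sketch below models that interpretation; it is an assumption about the semantics made for illustration, and `SustainedConditionSketch` is a hypothetical name rather than a class from the agent.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical model of "condition has held for longer than a target duration".
final class SustainedConditionSketch {
    private final Duration required;
    private Instant firstMet; // null while the condition is not currently holding

    SustainedConditionSketch(Duration required) {
        this.required = required;
    }

    // Feed one sample; returns true once the condition has held across
    // consecutive samples for strictly longer than the required duration.
    boolean sample(boolean conditionMet, Instant now) {
        if (!conditionMet) {
            firstMet = null; // any failing sample resets the timer
            return false;
        }
        if (firstMet == null) {
            firstMet = now;
        }
        return Duration.between(firstMet, now).compareTo(required) > 0;
    }

    public static void main(String[] args) {
        SustainedConditionSketch s = new SustainedConditionSketch(Duration.ofSeconds(10));
        Instant start = Instant.now();
        System.out.println(s.sample(true, start));
        System.out.println(s.sample(true, start.plusSeconds(11)));
    }
}
```

Because the timer only advances when a sample is taken, the measured duration is only as precise as the evaluation period.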

These must be passed as an argument to the cryostat agent, for example:

```
JAVA_OPTIONS="-javaagent:/deployments/app/cryostat-agent-${CRYOSTAT_AGENT_VERSION}.jar=[ProcessCpuLoad>0.2]~profile"
```

Multiple Smart Trigger definitions may be specified, separated by commas, for example:

```
[ProcessCpuLoad>0.2]~profile,[ThreadCount>30]~Continuous
```
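A naive `String.split(",")` would also break on any comma that appears inside a constraint expression. The sketch below splits only on commas that sit outside the square brackets. It is a minimal illustration under that assumption; `TriggerListSplitSketch` is a hypothetical name and the agent's own parsing may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical splitter for a comma-separated list of trigger definitions.
final class TriggerListSplitSketch {
    static List<String> split(String raw) {
        List<String> definitions = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // > 0 while inside [ ... ]
        for (char c : raw.toCharArray()) {
            if (c == '[') {
                depth++;
            } else if (c == ']') {
                depth = Math.max(0, depth - 1);
            }
            if (c == ',' && depth == 0) {
                addIfNotEmpty(definitions, current);
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        addIfNotEmpty(definitions, current);
        return definitions;
    }

    private static void addIfNotEmpty(List<String> out, StringBuilder sb) {
        String s = sb.toString().trim();
        if (!s.isEmpty()) {
            out.add(s);
        }
    }

    public static void main(String[] args) {
        System.out.println(split("[ProcessCpuLoad>0.2]~profile,[ThreadCount>30]~Continuous"));
    }
}
```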

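**NOTE**: Smart Triggers are evaluated on a polling basis. The poll period is configurable (see list below). This means that your conditions are subject to sampling bias: a condition that becomes true and then false again between two consecutive polls will not be detected.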
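The effect of the poll period can be shown with a small simulation. The sketch below uses simulated time instead of a real scheduler, and `PollingBiasSketch` is a hypothetical name; it only demonstrates that a condition lasting less than one period can fall between two polls.

```java
import java.util.function.LongPredicate;

// Hypothetical simulation of periodic sampling; time is in milliseconds.
final class PollingBiasSketch {
    // Returns true if any poll at t = 0, periodMs, 2*periodMs, ... (t < endMs)
    // observes the condition as true.
    static boolean observed(LongPredicate conditionAt, long periodMs, long endMs) {
        for (long t = 0; t < endMs; t += periodMs) {
            if (conditionAt.test(t)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A 300 ms spike that starts 1.2 s into the run.
        LongPredicate spike = t -> t >= 1200 && t < 1500;
        System.out.println(observed(spike, 1000, 5000)); // false: polls at 1000 and 2000 miss it
        System.out.println(observed(spike, 100, 5000));  // true: the poll at 1200 sees it
    }
}
```

A shorter period narrows the blind spot at the cost of more frequent evaluation.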

## CONFIGURATION

`cryostat-agent` uses [smallrye-config](https://github.com/smallrye/smallrye-config) for configuration.
@@ -54,7 +89,7 @@ and how it advertises itself to a Cryostat server instance. Required properties
- [ ] `cryostat.agent.app.jmx.port` [`int`]: the JMX RMI port that the application is listening on. The default is to attempt to determine this from the `com.sun.management.jmxremote.port` system property.
- [ ] `cryostat.agent.registration.retry-ms` [`long`]: the duration in milliseconds between attempts to register with the Cryostat server. Default `5000`.
- [ ] `cryostat.agent.exit.signals` [`[String]`]: a comma-separated list of signals that the agent should handle. When any of these signals is caught the agent initiates an orderly shutdown, deregistering from the Cryostat server and potentially uploading the latest harvested JFR data. Default `INT,TERM`.
- [ ] `cryostat.agent.exit.deregistration.timeout-ms` [`long`]: the duration in milliseconds to wait for a response from the Cryostat server when attempting to deregister at shutdown time . Default `3s`.
- [ ] `cryostat.agent.exit.deregistration.timeout-ms` [`long`]: the duration in milliseconds to wait for a response from the Cryostat server when attempting to deregister at shutdown time. Default `3000`.
- [ ] `cryostat.agent.harvester.period-ms` [`long`]: the length of time between JFR collections and pushes by the harvester. This also controls the maximum age of data stored in the buffer for the harvester's managed Flight Recording. Every `period-ms` the harvester will upload a JFR binary file to the `cryostat.agent.baseuri` archives. Default `-1`, which indicates no harvesting will be performed.
- [ ] `cryostat.agent.harvester.template` [`String`]: the name of the `.jfc` event template configuration to use for the harvester's managed Flight Recording. Default `default`, the continuous monitoring event template.
- [ ] `cryostat.agent.harvester.max-files` [`String`]: the maximum number of pushed files that Cryostat will keep over the network from the agent. This is supplied to the harvester's push requests which instructs Cryostat to prune, in a FIFO manner, the oldest JFR files within the attached JVM target's storage, while the number of stored recordings is greater than this configuration's maximum file limit. Default `2147483647` (`Integer.MAX_VALUE`).
@@ -63,6 +98,7 @@ and how it advertises itself to a Cryostat server instance. Required properties
- [ ] `cryostat.agent.harvester.exit.max-size-b` [`long`]: the JFR `maxsize` setting, specified in bytes, to apply to exit uploads as described above.
- [ ] `cryostat.agent.harvester.max-age-ms` [`long`]: the JFR `maxage` setting, specified in milliseconds, to apply to periodic uploads during the application lifecycle. Defaults to `0`, which is interpreted as 1.5x the harvester period (`cryostat.agent.harvester.period-ms`).
- [ ] `cryostat.agent.harvester.max-size-b` [`long`]: the JFR `maxsize` setting, specified in bytes, to apply to periodic uploads during the application lifecycle. Defaults to `0`, which means `unlimited`.
- [ ] `cryostat.agent.smart-trigger.evaluation.period-ms` [`long`]: the length of time between Smart Trigger evaluations. Default `1000`.
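As a plain-JDK stand-in for how such a property resolves to a value, the sketch below reads `cryostat.agent.smart-trigger.evaluation.period-ms` from the JVM system properties and falls back to the documented default of `1000`. The agent itself resolves configuration through smallrye-config, so this is only an approximation; `EvaluationPeriodSketch` is a hypothetical name.

```java
// Hypothetical stand-in using only JVM system properties, not smallrye-config.
final class EvaluationPeriodSketch {
    static final String PROPERTY = "cryostat.agent.smart-trigger.evaluation.period-ms";
    static final long DEFAULT_MS = 1000L;

    // Long.getLong returns the default when the property is unset or not a valid number.
    static long evaluationPeriodMs() {
        return Long.getLong(PROPERTY, DEFAULT_MS);
    }

    public static void main(String[] args) {
        // Run with -Dcryostat.agent.smart-trigger.evaluation.period-ms=250 to override.
        System.out.println(evaluationPeriodMs());
    }
}
```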

These properties can be set by JVM system properties or by environment variables. For example, the property
`cryostat.agent.baseuri` can be set using `-Dcryostat.agent.baseuri=https://mycryostat.example.com:1234/` or
16 changes: 16 additions & 0 deletions pom.xml
@@ -59,6 +59,7 @@
<javax.annotation.version>1.3.2</javax.annotation.version><!-- used by smallrye -->
<io.smallrye.config.version>2.12.3</io.smallrye.config.version>
<org.slf4j.version>2.0.7</org.slf4j.version>
<org.projectnessie.cel.bom.version>0.3.21</org.projectnessie.cel.bom.version>

<com.github.spotbugs.version>4.7.3</com.github.spotbugs.version>
<com.github.spotbugs.plugin.version>4.7.3.6</com.github.spotbugs.plugin.version>
@@ -70,6 +71,17 @@
<org.jsoup.version>1.15.3</org.jsoup.version>
</properties>

<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.projectnessie.cel</groupId>
<artifactId>cel-bom</artifactId>
<version>${org.projectnessie.cel.bom.version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies>
<dependency>
<groupId>io.cryostat</groupId>
@@ -82,6 +94,10 @@
<artifactId>dagger</artifactId>
<version>${com.google.dagger.version}</version>
</dependency>
<dependency>
<groupId>org.projectnessie.cel</groupId>
<artifactId>cel-tools</artifactId>
</dependency>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
5 changes: 5 additions & 0 deletions src/main/java/io/cryostat/agent/Agent.java
@@ -28,6 +28,8 @@
import javax.inject.Named;
import javax.inject.Singleton;

import io.cryostat.agent.triggers.TriggerEvaluator;

import dagger.Component;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
@@ -87,6 +89,7 @@ public static void main(String[] args) {
});
webServer.start();
registration.start();
client.triggerEvaluator().start(args);
log.info("Startup complete");
} catch (Exception e) {
log.error(Agent.class.getSimpleName() + " startup failure", e);
@@ -143,6 +146,8 @@ interface Client {

Harvester harvester();

TriggerEvaluator triggerEvaluator();

ScheduledExecutorService executor();

@Named(ConfigModule.CRYOSTAT_AGENT_EXIT_SIGNALS)
10 changes: 10 additions & 0 deletions src/main/java/io/cryostat/agent/ConfigModule.java
@@ -83,6 +83,9 @@ public abstract class ConfigModule {
public static final String CRYOSTAT_AGENT_HARVESTER_MAX_SIZE_B =
"cryostat.agent.harvester.max-size-b";

public static final String CRYOSTAT_AGENT_SMART_TRIGGER_EVALUATION_PERIOD_MS =
"cryostat.agent.smart-trigger.evaluation.period-ms";

public static final String CRYOSTAT_AGENT_API_WRITES_ENABLED =
"cryostat.agent.api.writes-enabled";

@@ -287,4 +290,11 @@ public static List<String> provideCryostatAgentExitSignals(SmallRyeConfig config
public static long provideCryostatAgentExitDeregistrationTimeoutMs(SmallRyeConfig config) {
return config.getValue(CRYOSTAT_AGENT_EXIT_DEREGISTRATION_TIMEOUT_MS, long.class);
}

@Provides
@Singleton
@Named(CRYOSTAT_AGENT_SMART_TRIGGER_EVALUATION_PERIOD_MS)
public static long provideCryostatSmartTriggerEvaluationPeriodMs(SmallRyeConfig config) {
return config.getValue(CRYOSTAT_AGENT_SMART_TRIGGER_EVALUATION_PERIOD_MS, long.class);
}
}
112 changes: 112 additions & 0 deletions src/main/java/io/cryostat/agent/FlightRecorderHelper.java
@@ -0,0 +1,112 @@
/*
* Copyright The Cryostat Authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package io.cryostat.agent;

import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;
import jdk.jfr.FlightRecorder;
import jdk.jfr.Recording;
import jdk.management.jfr.ConfigurationInfo;
import jdk.management.jfr.FlightRecorderMXBean;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class FlightRecorderHelper {

private final FlightRecorderMXBean bean =
ManagementFactory.getPlatformMXBean(FlightRecorderMXBean.class);
private final Logger log = LoggerFactory.getLogger(getClass());

// FIXME this is repeated logic shared with Harvester startRecording
public void startRecording(String templateNameOrLabel) {
getTemplate(templateNameOrLabel)
.ifPresentOrElse(
c -> {
long recordingId = bean.newRecording();
bean.setPredefinedConfiguration(recordingId, c.getName());
String recordingName =
String.format("cryostat-smart-trigger-%d", recordingId);
bean.setRecordingOptions(
recordingId, Map.of("name", recordingName, "disk", "true"));
bean.startRecording(recordingId);
log.info(
"Started recording \"{}\" using template \"{}\"",
recordingName,
templateNameOrLabel);
},
() ->
log.error(
"Cannot start recording with template named or labelled {}",
templateNameOrLabel));
}

public Optional<ConfigurationInfo> getTemplate(String nameOrLabel) {
return bean.getConfigurations().stream()
.filter(c -> c.getName().equals(nameOrLabel) || c.getLabel().equals(nameOrLabel))
.findFirst();
}

public boolean isValidTemplate(String nameOrLabel) {
return getTemplate(nameOrLabel).isPresent();
}

public List<RecordingInfo> getRecordings() {
return FlightRecorder.getFlightRecorder().getRecordings().stream()
.map(RecordingInfo::new)
.collect(Collectors.toList());
}

@SuppressFBWarnings(value = "URF_UNREAD_FIELD")
public static class RecordingInfo {

public final long id;
public final String name;
public final String state;
public final Map<String, String> options;
public final long startTime;
public final long duration;
public final boolean isContinuous;
public final boolean toDisk;
public final long maxSize;
public final long maxAge;

RecordingInfo(Recording rec) {
this.id = rec.getId();
this.name = rec.getName();
this.state = rec.getState().name();
this.options = rec.getSettings();
if (rec.getStartTime() != null) {
this.startTime = rec.getStartTime().toEpochMilli();
} else {
this.startTime = 0;
}
this.isContinuous = rec.getDuration() == null;
this.duration = this.isContinuous ? 0 : rec.getDuration().toMillis();
this.toDisk = rec.isToDisk();
this.maxSize = rec.getMaxSize();
if (rec.getMaxAge() != null) {
this.maxAge = rec.getMaxAge().toMillis();
} else {
this.maxAge = 0;
}
}
}
}
33 changes: 33 additions & 0 deletions src/main/java/io/cryostat/agent/MainModule.java
@@ -37,6 +37,8 @@
import io.cryostat.agent.Harvester.RecordingSettings;
import io.cryostat.agent.remote.RemoteContext;
import io.cryostat.agent.remote.RemoteModule;
import io.cryostat.agent.triggers.TriggerEvaluator;
import io.cryostat.agent.triggers.TriggerParser;
import io.cryostat.core.net.JFRConnection;
import io.cryostat.core.net.JFRConnectionToolkit;
import io.cryostat.core.sys.Environment;
@@ -69,6 +71,7 @@ public abstract class MainModule {
private static final int NUM_WORKER_THREADS = 3;
private static final String JVM_ID = "JVM_ID";
private static final String TEMPLATES_PATH = "TEMPLATES_PATH";
private static final String TRIGGER_SCHEDULER = "TRIGGER_SCHEDULER";

@Provides
@Singleton
@@ -270,6 +273,36 @@ public static Harvester provideHarvester(
registration);
}

@Provides
@Singleton
@Named(TRIGGER_SCHEDULER)
public static ScheduledExecutorService provideTriggerScheduler() {
return Executors.newScheduledThreadPool(0);
}

@Provides
@Singleton
public static FlightRecorderHelper provideFlightRecorderHelper() {
return new FlightRecorderHelper();
}

@Provides
@Singleton
public static TriggerParser provideTriggerParser(FlightRecorderHelper helper) {
return new TriggerParser(helper);
}

@Provides
@Singleton
public static TriggerEvaluator provideTriggerEvaluatorFactory(
@Named(TRIGGER_SCHEDULER) ScheduledExecutorService scheduler,
TriggerParser parser,
FlightRecorderHelper helper,
@Named(ConfigModule.CRYOSTAT_AGENT_SMART_TRIGGER_EVALUATION_PERIOD_MS)
long evaluationPeriodMs) {
return new TriggerEvaluator(scheduler, parser, helper, evaluationPeriodMs);
}

@Provides
@Singleton
public static FileSystem provideFileSystem() {