Closes #7: added documentation for all metrics recorders (#39)
JonasKunz authored Jan 9, 2019
1 parent bf54c71 commit 057810c
Showing 3 changed files with 308 additions and 2 deletions.
@@ -10,8 +10,10 @@ Disabling of metrics completely in the agent can be done by setting the `inspecti
This way any default inspectIT setting or anything else defined for metrics collection will be overruled.
If used, the switch makes sure that the inspectIT OCE agent:
. disables all metrics recorders
. does not set up any metrics exporter
. disables all metrics collectors
====

include::metrics-recorders.adoc[]

include::metrics-exporters.adoc[]
@@ -1,6 +1,6 @@
=== Metrics Exporters

Metrics exporters are responsible for passing the collected metrics to a metric storage.
Metrics exporters are responsible for passing the recorded metrics to a metric storage.
They can implement a push approach where metrics are sent to a collector or a pull approach where metrics are scraped by an external system.

If an exporter supports run-time updates it means that it can be enabled/disabled during the run-time or that any property related to the exporter can be changed.
@@ -0,0 +1,304 @@
=== Metrics Recorders

Metrics recorders are responsible for capturing system metrics, such as processor or memory usage. All metrics recorders provided by inspectIT OCE publish the recorded data through the OpenCensus API.
Therefore, all recorded metrics can be exported via any of the <<Metrics Exporters,supported metrics exporters>>.
Currently the inspectIT OCE agent is capable of recording the following metrics:

* <<CPU Metrics,CPU>> (usage and number of cores)
* <<Disk Space Metrics,Disk Space>> (free and total)
* <<Memory Metrics,Memory>> (used and available for various regions)
* <<Thread Metrics,Threads>> (counts and states)
* <<Garbage Collection Metrics,Garbage Collection>> (pause times and collection statistics)
* <<Class Loading Metrics,Class Loading>> (loaded and unloaded counts)

[TIP]
====
The metrics above and their capturing logic are based on the open-source https://micrometer.io/[micrometer] project.
====

In the following sections we provide detailed information on the collected metrics and how they can be configured.
In general metrics are grouped together based on the recorder which provides them.

Many recorders poll system APIs to extract their metrics. The rate at which this polling occurs can be configured.
By default, all polling-based recorders use the duration specified by ```inspectit.metrics.frequency```, whose default value
is ```15s```. Overriding ```inspectit.metrics.frequency``` changes the polling rate of every recorder that does not
define an explicit frequency in its own configuration.
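
As a minimal sketch of how the global and the recorder-specific polling rates interact, the following configuration sets both. The values ```30s``` and ```5s``` are illustrative assumptions, not recommended settings.

[source,YAML]
----
inspectit:
  metrics:
    # Global polling rate, used by every recorder without its own frequency
    frequency: 30s
    processor:
      # Overrides the global rate for the processor recorder only
      frequency: 5s
----

With this configuration the processor recorder would poll every 5 seconds, while all other polling-based recorders would fall back to 30 seconds.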

[IMPORTANT]
.Default metrics settings
====
By default, all metrics are captured if they are available on the system.
If you do not want certain metrics to be recorded, you need to disable them manually.
For example, if you want to disable the ```system.average``` metric of the ```processor``` recorder, you need to use the following configuration:
[source,YAML]
----
inspectit:
  metrics:
    processor:
      enabled:
        system.average: false
----
====

==== CPU Metrics

Processor metrics are recorded by the ```inspectit.metrics.processor``` recorder.
This recorder polls the captured data from the system with a frequency specified by ```inspectit.metrics.processor.frequency``` which defaults to ```inspectit.metrics.frequency```.
The available metrics are explained in the table below.

[cols="3,8,2,5%",options="header"]
.CPU metrics
|===
|Metric
|Description
|Unit
|OpenCensus Metric Name

|```count```
|The number of processor cores available to the JVM
|cores
|```system/cpu/count```

|```system.average```
|The sum of the number of runnable entities queued to the available processors and the number of runnable
entities running on the available processors averaged over a minute for the whole system.
See the definition of https://docs.oracle.com/javase/7/docs/api/java/lang/management/OperatingSystemMXBean.html#getSystemLoadAverage()[getSystemLoadAverage()] for more details.
|percentage
|```system/load/average/1m```

|```system.usage```
|The recent CPU usage for the whole system
|percentage
|```system/cpu/usage```

|```process.usage```
|The recent CPU usage for the JVM's process
|percentage
|```process/cpu/usage```
|===

[IMPORTANT]
====
The availability of each processor metric depends on the capabilities of your JVM in combination with your OS.
If a metric is not available, the inspectIT OCE agent logs a corresponding info message on startup.
====

==== Disk Space Metrics

Disk space metrics are recorded by the ```inspectit.metrics.disk``` recorder.
This recorder polls the captured data from the system with a frequency specified by ```inspectit.metrics.disk.frequency``` which defaults to ```inspectit.metrics.frequency```.
The available metrics are explained in the table below.

[cols="3,8,2,5%",options="header"]
.Disk metrics
|===
|Metric
|Description
|Unit
|OpenCensus Metric Name

|```free```
|The free disk space
|bytes
|```disk/free```

|```total```
|The total size of the disk
|bytes
|```disk/total```
|===
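
The property names above can be combined into a recorder-specific configuration. The sketch below assumes that the disk recorder accepts the same ```enabled``` map as the ```processor``` recorder in the default metrics settings example, which this section does not state explicitly. The ```60s``` value is illustrative.

[source,YAML]
----
inspectit:
  metrics:
    disk:
      # Poll disk space less often than the global default (illustrative value)
      frequency: 60s
      enabled:
        # Assumed key, taken from the Metric column of the disk metrics table
        total: false
----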

==== Memory Metrics

All memory related metrics are recorded by the ```inspectit.metrics.memory``` recorder.
This recorder polls the captured data from the system with a frequency specified by ```inspectit.metrics.memory.frequency``` which defaults to ```inspectit.metrics.frequency```.

The first set of available metrics are general JVM memory metrics:
[cols="3,8,2,5%",options="header"]
.JVM memory metrics
|===
|Metric
|Description
|Unit
|OpenCensus Metric Name

|```used```
|The amount of used memory
|bytes
|```jvm/memory/used```

|```committed```
|The amount of memory that is committed for the Java virtual machine to use
|bytes
|```jvm/memory/committed```

|```max```
|The maximum amount of memory in bytes that can be used for memory management
|bytes
|```jvm/memory/max```
|===

For all these metrics, inspectIT adds two tags in addition to the <<Common Tags,common tags>>.
The ```area``` tag is either ```heap``` or ```nonheap```.
The ```id``` tag specifies the exact memory region, for example ```PS Old Gen```; the available regions depend on the garbage collector in use.

Most JVMs also provide metrics regarding the usage of https://docs.oracle.com/javase/8/docs/api/java/nio/ByteBuffer.html[buffer] pools, which are reflected in the following metrics provided by inspectIT OCE:

[cols="3,8,2,5%",options="header"]
.JVM buffer pool metrics
|===
|Metric
|Description
|Unit
|OpenCensus Metric Name

|```buffer.count```
|An estimate of the number of buffers for each buffer pool
|buffers
|```jvm/buffer/count```

|```buffer.used```
|An estimate of the memory that the JVM is currently using for each buffer pool
|bytes
|```jvm/buffer/memory/used```

|```buffer.capacity```
|An estimate of the total capacity of the buffers in each pool
|bytes
|```jvm/buffer/total/capacity```
|===

Again, an ```id``` tag is added to each metric. It contains the name of the buffer pool for which the metric was captured.
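
If only the general JVM memory metrics are of interest, the buffer pool metrics could be switched off individually. This is a sketch under the assumption that the memory recorder uses the same ```enabled``` map as the ```processor``` recorder, with keys matching the Metric column of the tables above.

[source,YAML]
----
inspectit:
  metrics:
    memory:
      enabled:
        # Assumed keys, taken from the JVM buffer pool metrics table
        buffer.count: false
        buffer.used: false
        buffer.capacity: false
----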

==== Thread Metrics

Thread metrics provide statistics about the number and the state of all JVM threads.
They are recorded by the ```inspectit.metrics.threads``` recorder.
This recorder polls the captured data from the JVM with a frequency specified by ```inspectit.metrics.threads.frequency``` which defaults to ```inspectit.metrics.frequency```.
The available thread metrics are explained in the table below.

[cols="3,8,2,5%",options="header"]
.Thread metrics
|===
|Metric
|Description
|Unit
|OpenCensus Metric Name

|```peak```
|The peak number of live threads since the start of the JVM
|threads
|```jvm/threads/peak```

|```live```
|The total number of currently live threads including both daemon and non-daemon threads
|threads
|```jvm/threads/live```

|```daemon```
|The total number of currently live daemon threads
|threads
|```jvm/threads/daemon```

|```states```
|The total number of currently live threads for each state
|threads
|```jvm/threads/states```

|===

The ```states``` metric provides the number of threads grouped by their state.
For this purpose, an additional tag ```state``` is added whose values correspond to the Java https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.State.html[Thread.State enum].

==== Garbage Collection Metrics

The ```inspectit.metrics.gc``` recorder provides metrics about the time spent for garbage collection as well as about the collection effectiveness.
This recorder is not polling based. Instead, it listens to garbage collection events published by the JVM and records metrics on occurrence.

[IMPORTANT]
====
The availability of all garbage collection metrics depends on the capabilities of your JVM.
If the garbage collection metrics are unavailable, the inspectIT OCE agent logs a corresponding info message on startup.
====

The recorder offers the following timing related metrics:

[cols="3,8,2,5%",options="header"]
.Garbage Collection Timings
|===
|Metric
|Description
|Unit
|OpenCensus Metric Name

|```pause```
|The total time spent for Garbage Collection Pauses
|milliseconds
|```jvm/gc/pause```

|```concurrent.phase.time```
|The total time spent in concurrent phases of the Garbage Collector
|milliseconds
|```jvm/gc/concurrent/phase/time```

|===

Whether ```pause``` or ```concurrent.phase.time``` are captured depends on the concurrency of the garbage collector with which the JVM was started.
For both metrics, an ```action``` tag and a ```cause``` tag are added. The ```action``` tag specifies what was done, e.g. a minor or a major collection.
The ```cause``` tag provides information on the circumstances which triggered the collection.

The following additional garbage collection metrics are also available:

[cols="3,8,2,5%",options="header"]
.Garbage Collection Statistics
|===
|Metric
|Description
|Unit
|OpenCensus Metric Name

|```live.data.size```
|The size of the old generation memory pool captured directly after a full GC.
|bytes
|```jvm/gc/live/data/size```

|```max.data.size```
|The maximum allowed size of the old generation memory pool captured directly after a full GC.
|bytes
|```jvm/gc/max/data/size```

|```memory.allocated```
|Increase in the size of the young generation memory pool after one GC to before the next
|bytes
|```jvm/gc/memory/allocation```

|```memory.promoted```
|Increase in the size of the old generation memory pool from before a GC to after the GC
|bytes
|```jvm/gc/memory/promoted```

|===
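
Because the garbage collection recorder is event-driven, the sketch below sets no ```frequency``` and only selects which metrics are recorded. It assumes that the ```gc``` recorder accepts the same ```enabled``` map as the ```processor``` recorder, with keys matching the Metric column of the tables above.

[source,YAML]
----
inspectit:
  metrics:
    gc:
      enabled:
        # Assumed keys, taken from the garbage collection tables
        concurrent.phase.time: false
        memory.promoted: false
----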

==== Class Loading Metrics

Class loading metrics are recorded by the ```inspectit.metrics.classloader``` recorder.
This recorder polls the captured data from the system with a frequency specified by ```inspectit.metrics.classloader.frequency``` which defaults to ```inspectit.metrics.frequency```.
The available metrics are explained in the table below.

[cols="3,8,2,5%",options="header"]
.Class loader metrics
|===
|Metric
|Description
|Unit
|OpenCensus Metric Name

|```loaded```
|The total number of currently loaded classes in the JVM
|classes
|```jvm/classes/loaded```

|```unloaded```
|The total number of unloaded classes since the start of the JVM
|classes
|```jvm/classes/unloaded```
|===
