Refactor PromMetrics to use Collect functions (#3607)
This PR makes the following changes:
- Refactor `addMetric` into individual functions for counter, gauge, and histogram.
- Add `collect()` parameter to all `add<metric>` functions.
- Update types:
    - import `prom-client` types into `types` package
    - remove duplicate types from `terafoundation` and `job-components` and import from `types`. This required some refactoring to not use context as a parameter, as the `context` type differs between `terafoundation` and `job-components`.
- Move `info` metrics of slice, worker, and master into the `setPromMetrics()` functions.
- Update PromClient docs and all relevant tests
busma13 authored May 9, 2024
1 parent db91522 commit 1574226
Showing 42 changed files with 968 additions and 811 deletions.
56 changes: 47 additions & 9 deletions docs/development/k8s.md
@@ -349,17 +349,39 @@ yarn run ts-scripts k8s-env --rebuild --skip-build --reset-store

## Prometheus Metrics API

The `PromMetrics` class lives within `packages/terafoundation/src/api/prom-metrics` package. Use of its API can be enabled using `prom_metrics_enabled` in the terafoundation config and overwritten in the job config. The `init` function can be found at `context.apis.foundation.promMetrics.init`. It is called on startup of the Teraslice master, execution_Controller, and worker, but only creates the API if `prom_metrics_enabled` is true.
The `PromMetrics` class lives within `packages/terafoundation/src/api/prom-metrics` package. Use of its API can be enabled using `prom_metrics_enabled` in the terafoundation config and overwritten in the job config. The `init` function can be found at `context.apis.foundation.promMetrics.init`. It is called on startup of the Teraslice master, execution_controller, and worker, but only creates the API if `prom_metrics_enabled` is true.

### Functions


| Name | Description | Type |
| ---------------- | --------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------- |
| init | initialize the API and create exporter server | (config: PromMetricsInitConfig) => Promise<boolean> |
| set | set the value of a gauge | (name: string, labels: Record<string, string>, value: number) => void |
| inc | increment the value of a counter or gauge | (name: string, labelValues: Record<string, string>, value: number) => void |
| dec | decrement the value of a gauge | (name: string, labelValues: Record<string, string>, value: number) => void |
| observe | observe a histogram or summary | (name: string, labelValues: Record<string, string>, value: number) => void |
| addGauge | add a gauge metric | (name: string, help: string, labelNames: Array<string\>, collectFn?: CollectFunction<Gauge>) => Promise<void\> |
| addCounter | add a counter metric | (name: string, help: string, labelNames: Array<string\>, collectFn?: CollectFunction<Counter>) => Promise<void\> |
| addHistogram | add a histogram metric | (name: string, help: string, labelNames: Array<string\>, collectFn?: CollectFunction<Histogram>, buckets?: Array<number>) => Promise<void\> |
| addSummary | add a summary metric | (name: string, help: string, labelNames: Array<string\>, collectFn?: CollectFunction<Summary\>, maxAgeSeconds?: number, ageBuckets?: number, percentiles?: Array<number>) => Promise<void\> |
| hasMetric | check if a metric exists | (name: string) => boolean |
| deleteMetric | delete a metric from the metric list | (name: string) => Promise<boolean\> |
| verifyAPI | verify that the API is running | () => boolean |
| shutdown | disable API and shutdown exporter server | () => Promise<void\> |
| getDefaultLabels | retrieve the default labels set at init | () => Record<string, string> |

Example init:
```typescript
await config.context.apis.foundation.promMetrics.init({
context: config.context,
logger: this.logger,
metrics_enabled_by_job: config.executionConfig.prom_metrics_enabled, // optional job override
assignment: 'execution_controller',
port: config.executionConfig.prom_metrics_port, // optional job override
default_metrics: config.executionConfig.prom_metrics_add_default, // optional job override
logger: this.logger,
tf_prom_metrics_add_default: terafoundation.prom_metrics_add_default,
tf_prom_metrics_enabled: terafoundation.prom_metrics_enabled,
tf_prom_metrics_port: terafoundation.prom_metrics_port,
job_prom_metrics_add_default: config.executionConfig.prom_metrics_add_default, // optional job override
job_prom_metrics_enabled: config.executionConfig.prom_metrics_enabled, // optional job override
job_prom_metrics_port: config.executionConfig.prom_metrics_port, // optional job override
labels: { // optional default labels on all metrics for this teraslice process
ex_id: this.exId,
job_id: this.jobId,
@@ -370,13 +392,12 @@ await config.context.apis.foundation.promMetrics.init({

Once initialized, all of the other functions under `context.apis.foundation.promMetrics` are enabled. The foundation-level wrapper functions allow every prom metrics function to be called even if metrics are disabled or the API hasn't been initialized, so callers do not need to check first, and failures never throw errors.
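
The never-throw behaviour of the wrapper functions can be pictured with a small self-contained sketch. This is not the terafoundation implementation: `SafeMetrics`, `MetricsBackend`, `InMemoryBackend`, and `CounterLabels` are hypothetical names used only to illustrate a wrapper that does nothing until an underlying API exists and that swallows backend errors.

```typescript
// Hypothetical sketch: none of these names come from terafoundation.
type CounterLabels = Record<string, string>;

interface MetricsBackend {
    inc(name: string, labels: CounterLabels, value: number): void;
}

// Toy backend that keeps one running total per registered metric name.
class InMemoryBackend implements MetricsBackend {
    readonly totals: Record<string, number> = {};

    register(name: string): void {
        this.totals[name] = 0;
    }

    inc(name: string, _labels: CounterLabels, value: number): void {
        if (!(name in this.totals)) {
            throw new Error(`metric "${name}" is not registered`);
        }
        this.totals[name] += value;
    }
}

class SafeMetrics {
    private backend: MetricsBackend | undefined;

    // Loosely mirrors init(): the backend is only attached when enabled.
    init(enabled: boolean, backend: MetricsBackend): boolean {
        if (!enabled) return false;
        this.backend = backend;
        return true;
    }

    // Safe to call at any time: a no-op before init, and errors are swallowed.
    inc(name: string, labels: CounterLabels, value: number): void {
        if (this.backend === undefined) return;
        try {
            this.backend.inc(name, labels, value);
        } catch {
            // a real wrapper would probably log the failure here
        }
    }
}

const backend = new InMemoryBackend();
const metrics = new SafeMetrics();

metrics.inc('slices_dispatched', { class: 'Example' }, 1); // ignored: not initialized
backend.register('slices_dispatched');
metrics.init(true, backend);
metrics.inc('slices_dispatched', { class: 'Example' }, 2); // recorded
metrics.inc('not_registered', {}, 1); // backend throws, wrapper swallows it
```

The trade-off of this style is that a misspelled metric name fails silently, so logging inside the wrapper matters more than it would if errors propagated.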

Example addMetric:
Example Counter:
```typescript
await this.context.apis.foundation.promMetrics.addMetric(
await this.context.apis.foundation.promMetrics.addCounter(
'slices_dispatched', // name
'number of slices a slicer has dispatched', // help or description
['class'], // label names specific to this metric
'counter'); // metric type

// now we can increment the counter anywhere else in the code
this.context.apis.foundation.promMetrics.inc(
@@ -386,6 +407,23 @@ this.context.apis.foundation.promMetrics.inc(
);
```
Example Gauge using collect() callback:
```typescript
const self = this; // rename `this` to use inside collect()
await this.context.apis.foundation.promMetrics.addGauge(
'slices_dispatched', // name
'number of slices a slicer has dispatched', // help or description
['class'], // label names specific to this metric
function collect() { // callback fn updates value only when '/metrics' endpoint is hit
const slicesFinished = self.getSlicesDispatched(); // get current value from local memory
const labels = { // 'set()' needs both default labels and labels specific to metric to match the correct gauge
...self.context.apis.foundation.promMetrics.getDefaultLabels(),
class: 'SlicerExecutionContext'
};
this.set(labels, slicesFinished); // this refers to the Gauge
}
);
```
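
To make the timing of a `collect()` callback concrete, here is a minimal self-contained model. `MiniGauge` and its `scrape()` method are hypothetical stand-ins, not prom-client or `PromMetrics` types; the sketch only assumes what the example's comments state, namely that the callback runs when the metrics endpoint is read and that `this` inside it refers to the gauge.

```typescript
// Hypothetical stand-in for a gauge; a real prom-client Gauge is much richer.
type MiniCollectFn = (this: MiniGauge) => void;

class MiniGauge {
    private value = 0;
    private readonly name: string;
    private readonly collectFn: MiniCollectFn | undefined;

    constructor(name: string, collectFn?: MiniCollectFn) {
        this.name = name;
        this.collectFn = collectFn;
    }

    set(value: number): void {
        this.value = value;
    }

    // Models a scrape of the metrics endpoint: refresh first, then report.
    scrape(): string {
        if (this.collectFn) this.collectFn.call(this);
        return `${this.name} ${this.value}`;
    }
}

let slicesDispatched = 0; // application state, updated elsewhere in the code

// A `function` expression is used so that `this` is bound to the gauge;
// an arrow function would capture the surrounding `this` instead.
const gauge = new MiniGauge('slices_dispatched', function collect() {
    this.set(slicesDispatched);
});

slicesDispatched = 7;
const line = gauge.scrape(); // "slices_dispatched 7"
```

Because the value is read only at scrape time, the application never has to call `set` itself, which is also why the documented example renames the outer `this` to `self` before entering the callback.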
The label names as well as the metric name must match when using `inc`, `dec`, `set`, or `observe` to modify a metric.
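
The matching rule can be sketched as a simple check. `labelsMatch` is a hypothetical helper written for illustration; it is not part of `PromMetrics` or prom-client, and it ignores any default labels that the API may add on its own.

```typescript
// Hypothetical helper: not part of PromMetrics or prom-client.
type LabelValues = Record<string, string>;

// True only when the provided label keys are exactly the declared label names.
function labelsMatch(declared: string[], provided: LabelValues): boolean {
    const given = Object.keys(provided);
    return given.length === declared.length
        && given.every((key) => declared.indexOf(key) !== -1);
}

const declaredNames = ['class']; // label names passed when the metric was added

const ok = labelsMatch(declaredNames, { class: 'SlicerExecutionContext' });
const wrongName = labelsMatch(declaredNames, { klass: 'SlicerExecutionContext' });
const missing = labelsMatch(declaredNames, {});
```

Here `ok` is `true`, while `wrongName` and `missing` are both `false`, which corresponds to calls that would not address the intended metric.
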
## Extras
4 changes: 2 additions & 2 deletions e2e/package.json
@@ -42,9 +42,9 @@
"ms": "^2.1.3"
},
"devDependencies": {
"@terascope/types": "^0.16.0",
"@terascope/types": "^0.17.0",
"bunyan": "^1.8.15",
"elasticsearch-store": "^0.83.1",
"elasticsearch-store": "^0.84.0",
"fs-extra": "^11.2.0",
"ms": "^2.1.3",
"nanoid": "^3.3.4",
2 changes: 1 addition & 1 deletion package.json
@@ -1,7 +1,7 @@
{
"name": "teraslice-workspace",
"displayName": "Teraslice",
"version": "1.4.1",
"version": "1.5.0",
"private": true,
"homepage": "https://github.com/terascope/teraslice",
"bugs": {
10 changes: 5 additions & 5 deletions packages/data-mate/package.json
@@ -1,7 +1,7 @@
{
"name": "@terascope/data-mate",
"displayName": "Data-Mate",
"version": "0.55.1",
"version": "0.56.0",
"description": "Library of data validations/transformations",
"homepage": "https://github.com/terascope/teraslice/tree/master/packages/data-mate#readme",
"repository": {
@@ -29,9 +29,9 @@
"test:watch": "ts-scripts test --watch . --"
},
"dependencies": {
"@terascope/data-types": "^0.49.1",
"@terascope/types": "^0.16.0",
"@terascope/utils": "^0.58.1",
"@terascope/data-types": "^0.50.0",
"@terascope/types": "^0.17.0",
"@terascope/utils": "^0.59.0",
"@types/validator": "^13.11.9",
"awesome-phonenumber": "^2.70.0",
"date-fns": "^2.30.0",
@@ -46,7 +46,7 @@
"uuid": "^9.0.1",
"valid-url": "^1.0.9",
"validator": "^13.11.0",
"xlucene-parser": "^0.57.1"
"xlucene-parser": "^0.58.0"
},
"devDependencies": {
"@types/ip6addr": "^0.2.6",
6 changes: 3 additions & 3 deletions packages/data-types/package.json
@@ -1,7 +1,7 @@
{
"name": "@terascope/data-types",
"displayName": "Data Types",
"version": "0.49.1",
"version": "0.50.0",
"description": "A library for defining the data structures and mapping",
"homepage": "https://github.com/terascope/teraslice/tree/master/packages/data-types#readme",
"bugs": {
@@ -26,8 +26,8 @@
"test:watch": "ts-scripts test --watch . --"
},
"dependencies": {
"@terascope/types": "^0.16.0",
"@terascope/utils": "^0.58.1",
"@terascope/types": "^0.17.0",
"@terascope/utils": "^0.59.0",
"graphql": "^14.7.0",
"lodash": "^4.17.21",
"yargs": "^17.7.2"
8 changes: 4 additions & 4 deletions packages/elasticsearch-api/package.json
@@ -1,7 +1,7 @@
{
"name": "@terascope/elasticsearch-api",
"displayName": "Elasticsearch API",
"version": "3.19.1",
"version": "3.20.0",
"description": "Elasticsearch client api used across multiple services, handles retries and exponential backoff",
"homepage": "https://github.com/terascope/teraslice/tree/master/packages/elasticsearch-api#readme",
"bugs": {
@@ -23,16 +23,16 @@
"test:watch": "TEST_RESTRAINED_ELASTICSEARCH='true' ts-scripts test --watch . --"
},
"dependencies": {
"@terascope/types": "^0.16.0",
"@terascope/utils": "^0.58.1",
"@terascope/types": "^0.17.0",
"@terascope/utils": "^0.59.0",
"bluebird": "^3.7.2",
"setimmediate": "^1.0.5"
},
"devDependencies": {
"@opensearch-project/opensearch": "^1.2.0",
"@types/elasticsearch": "^5.0.43",
"elasticsearch": "^15.4.1",
"elasticsearch-store": "^0.83.1",
"elasticsearch-store": "^0.84.0",
"elasticsearch6": "npm:@elastic/elasticsearch@^6.7.0",
"elasticsearch7": "npm:@elastic/elasticsearch@^7.0.0",
"elasticsearch8": "npm:@elastic/elasticsearch@^8.0.0"
12 changes: 6 additions & 6 deletions packages/elasticsearch-store/package.json
@@ -1,7 +1,7 @@
{
"name": "elasticsearch-store",
"displayName": "Elasticsearch Store",
"version": "0.83.1",
"version": "0.84.0",
"description": "An API for managing an elasticsearch index, with versioning and migration support.",
"homepage": "https://github.com/terascope/teraslice/tree/master/packages/elasticsearch-store#readme",
"bugs": {
@@ -29,10 +29,10 @@
"test:watch": "ts-scripts test --watch . --"
},
"dependencies": {
"@terascope/data-mate": "^0.55.1",
"@terascope/data-types": "^0.49.1",
"@terascope/types": "^0.16.0",
"@terascope/utils": "^0.58.1",
"@terascope/data-mate": "^0.56.0",
"@terascope/data-types": "^0.50.0",
"@terascope/types": "^0.17.0",
"@terascope/utils": "^0.59.0",
"ajv": "^6.12.6",
"elasticsearch6": "npm:@elastic/elasticsearch@^6.7.0",
"elasticsearch7": "npm:@elastic/elasticsearch@^7.0.0",
@@ -41,7 +41,7 @@
"opensearch2": "npm:@opensearch-project/opensearch@^2.2.1",
"setimmediate": "^1.0.5",
"uuid": "^9.0.1",
"xlucene-translator": "^0.43.1"
"xlucene-translator": "^0.44.0"
},
"devDependencies": {
"@types/uuid": "^9.0.8"
4 changes: 2 additions & 2 deletions packages/generator-teraslice/package.json
@@ -1,7 +1,7 @@
{
"name": "generator-teraslice",
"displayName": "Generator Teraslice",
"version": "0.37.1",
"version": "0.38.0",
"description": "Generate teraslice related packages and code",
"keywords": [
"teraslice",
@@ -24,7 +24,7 @@
"test:watch": "ts-scripts test --watch . --"
},
"dependencies": {
"@terascope/utils": "^0.58.1",
"@terascope/utils": "^0.59.0",
"chalk": "^4.1.2",
"lodash": "^4.17.21",
"yeoman-generator": "^5.8.0",
4 changes: 2 additions & 2 deletions packages/job-components/package.json
@@ -1,7 +1,7 @@
{
"name": "@terascope/job-components",
"displayName": "Job Components",
"version": "0.73.2",
"version": "0.74.0",
"description": "A teraslice library for validating jobs schemas, registering apis, and defining and running new Job APIs",
"homepage": "https://github.com/terascope/teraslice/tree/master/packages/job-components#readme",
"bugs": {
@@ -31,7 +31,7 @@
"test:watch": "ts-scripts test --watch . --"
},
"dependencies": {
"@terascope/utils": "^0.58.1",
"@terascope/utils": "^0.59.0",
"convict": "^6.2.4",
"convict-format-with-moment": "^6.2.0",
"convict-format-with-validator": "^6.2.0",
57 changes: 0 additions & 57 deletions packages/job-components/src/execution-context/slicer.ts
@@ -49,48 +49,13 @@ export class SlicerExecutionContext
this.addOperation(op);

this._resetMethodRegistry();

(async () => {
await config.context.apis.foundation.promMetrics.init({
context: config.context,
logger: this.logger,
metrics_enabled_by_job: config.executionConfig.prom_metrics_enabled,
assignment: 'execution_controller',
port: config.executionConfig.prom_metrics_port,
default_metrics: config.executionConfig.prom_metrics_add_default,
labels: {
ex_id: this.exId,
job_id: this.jobId,
job_name: this.config.name,
}
});
})();
}

/**
* Called during execution initialization
* @param recoveryData is the data to recover from
*/
async initialize(recoveryData?: SlicerRecoveryData[]): Promise<void> {
await this.setupPromMetrics();
await this.context.apis.foundation.promMetrics.addMetric(
'info',
'Information about Teraslice execution controller',
['arch', 'clustering_type', 'name', 'node_version', 'platform', 'teraslice_version'],
'gauge'
);
this.context.apis.foundation.promMetrics.set(
'info',
{
arch: this.context.arch,
clustering_type: this.context.sysconfig.teraslice.cluster_manager_type,
name: this.context.sysconfig.teraslice.name,
node_version: process.version,
platform: this.context.platform,
teraslice_version: this.config.teraslice_version
},
1
);
return super.initialize(recoveryData);
}

@@ -118,26 +83,4 @@ export class SlicerExecutionContext
onSliceComplete(result: SliceResult): void {
this._runMethod('onSliceComplete', result);
}

/**
* Adds all prom metrics specific to the execution_controller.
*
* If trying to add a new metric for the execution_controller, it belongs here.
* @async
* @function setupPromMetrics
* @return {Promise<void>}
* @link https://terascope.github.io/teraslice/docs/development/k8s#prometheus-metrics-api
*/
async setupPromMetrics() {
this.logger.info(`adding ${this.context.assignment} prom metrics...`);
await Promise.all([
// All metrics go inside here
// this.context.apis.foundation.promMetrics.addMetric(
// 'example_metric',
// 'This is an example of adding a metric',
// ['example_label_1', 'Example_label_2'],
// 'gauge'
// )
]);
}
}