Add metrics for OS and OSD processes #83

Merged
merged 3 commits into from
Dec 13, 2023
Changes from 2 commits
31 changes: 24 additions & 7 deletions lib/cloudwatch/metrics-section.ts
@@ -5,18 +5,35 @@ The OpenSearch Contributors require contributions made to
this file be licensed under the Apache-2.0 license or a
compatible open source license. */

+import { Unit } from 'aws-cdk-lib/aws-cloudwatch';
+
+type MeasurementDefinition = string | { name: string, rename?: string, unit?: Unit }
+
 interface MetricDefinition {
-  measurement: string[];
-}
+  resources?: string[],
+  measurement: MeasurementDefinition[],
+  // eslint-disable-next-line camelcase
+  metrics_collection_interval?: number,
+}
+
+export interface ProcstatMetricDefinition {
+  pattern?: string;
+  // eslint-disable-next-line camelcase
+  append_dimensions?: string[];
+  measurement: string[]; // procstat does not support the common measurement standard for rename/unit
+  // eslint-disable-next-line camelcase
+  metrics_collection_interval: number;
+}

 interface EditableCloudwatchMetricsSection {
   // eslint-disable-next-line camelcase
   metrics_collected: {
-    cpu: MetricDefinition,
-    disk: MetricDefinition,
-    diskio: MetricDefinition,
-    mem: MetricDefinition,
-    net: MetricDefinition,
+    procstat?: ProcstatMetricDefinition[],
+    cpu?: MetricDefinition,
+    disk?: MetricDefinition,
+    diskio?: MetricDefinition,
+    mem?: MetricDefinition,
+    net?: MetricDefinition,
   };
 }
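
As a rough illustration of what the widened `MeasurementDefinition` type allows, the sketch below types a small metric entry that mixes plain-string measurements with `{ name, unit }` objects. To stay self-contained it declares a local `Unit` stand-in rather than importing from `aws-cdk-lib/aws-cloudwatch` (an assumption made only for this example), and the helper `measurementNames` is hypothetical, not part of this PR.

```typescript
// Local stand-in for Unit from 'aws-cdk-lib/aws-cloudwatch' (assumption: only
// the member used here is reproduced, so the sketch runs without the CDK).
enum Unit {
  PERCENT = 'Percent',
}

type MeasurementDefinition = string | { name: string, rename?: string, unit?: Unit };

interface MetricDefinition {
  resources?: string[];
  measurement: MeasurementDefinition[];
  metrics_collection_interval?: number;
}

// Hypothetical helper: returns the metric name of each measurement entry,
// whichever of the two accepted forms it is written in.
function measurementNames(metric: MetricDefinition): string[] {
  return metric.measurement.map((m) => (typeof m === 'string' ? m : m.name));
}

const mem: MetricDefinition = {
  measurement: [
    'available',
    { name: 'used_percent', unit: Unit.PERCENT },
  ],
  metrics_collection_interval: 60,
};

console.log(measurementNames(mem));
```

Because every metric key in `metrics_collected` is now optional, a section that sets only `mem`, or only `procstat`, would also type-check under the new interface.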

49 changes: 47 additions & 2 deletions lib/infra/infra-stack.ts
@@ -11,6 +11,7 @@
 import {
   AutoScalingGroup, BlockDeviceVolume, EbsDeviceVolumeType, Signals,
 } from 'aws-cdk-lib/aws-autoscaling';
+import { Unit } from 'aws-cdk-lib/aws-cloudwatch';
 import {
   AmazonLinuxCpuType,
   AmazonLinuxGeneration,
@@ -38,6 +39,7 @@
 import { dump, load } from 'js-yaml';
 import { join } from 'path';
 import { CloudwatchAgent } from '../cloudwatch/cloudwatch-agent';
+import { ProcstatMetricDefinition } from '../cloudwatch/metrics-section';
 import { nodeConfig } from '../opensearch-config/node-config';
 import { RemoteStoreResources } from './remote-store-resources';

@@ -148,7 +150,7 @@
 }

 if (props.singleNodeCluster) {
   console.log('Single node value is true, creating single node configurations');

Check warning on line 153 in lib/infra/infra-stack.ts (GitHub Actions / build): Unexpected console statement

   singleNodeInstance = new Instance(this, 'single-node-instance', {
     vpc: props.vpc,
     instanceType: singleNodeInstanceType,
@@ -375,6 +377,33 @@
 private static getCfnInitElement(scope: Stack, logGroup: LogGroup, props: infraProps, nodeType?: string): InitElement[] {
   const configFileDir = join(__dirname, '../opensearch-config');
   let opensearchConfig: string;
+  const procstatConfig: ProcstatMetricDefinition[] = [{
+    pattern: '-Dopensearch',
+    measurement: [
+      'cpu_usage',
+      'cpu_time_system',
+      'cpu_time_user',
+      'read_bytes',
+      'write_bytes',
+      'pid_count',
+    ],
+    metrics_collection_interval: 10,
+  },
+  ];
+  if (props.dashboardsUrl !== 'undefined') {
+    procstatConfig.push({
+      pattern: 'opensearch-dashboards',
+      measurement: [
+        'cpu_usage',
+        'cpu_time_system',
+        'cpu_time_user',
+        'read_bytes',
+        'write_bytes',
+        'pid_count',

Member Author:
Hi @rishabh6788, I'm not sure we need all of these metrics. Just pid_count will suffice, right?

Collaborator:
Yeah, we don't need the process CPU and the other metrics here. pid_count should be enough.

Member Author:
Fixed it. It will only report the pid_count metric now. Thanks!

+      ],
+      metrics_collection_interval: 15,
+    });
+  }
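
The review thread above settles on reporting only `pid_count` for the Dashboards process. The follow-up commit is not part of this two-commit view, so the sketch below is only one way the agreed change could look. The function name `buildProcstatConfig` and its string parameter are hypothetical, and the interface is copied from `metrics-section.ts` so the block stands alone.

```typescript
interface ProcstatMetricDefinition {
  pattern?: string;
  append_dimensions?: string[];
  measurement: string[];
  metrics_collection_interval: number;
}

// Hypothetical helper. The diff compares dashboardsUrl against the string
// 'undefined', so this sketch assumes the same convention for "not set".
function buildProcstatConfig(dashboardsUrl: string): ProcstatMetricDefinition[] {
  const procstatConfig: ProcstatMetricDefinition[] = [{
    pattern: '-Dopensearch',
    measurement: [
      'cpu_usage',
      'cpu_time_system',
      'cpu_time_user',
      'read_bytes',
      'write_bytes',
      'pid_count',
    ],
    metrics_collection_interval: 10,
  }];
  if (dashboardsUrl !== 'undefined') {
    // Per the review: only the process count is collected for Dashboards.
    procstatConfig.push({
      pattern: 'opensearch-dashboards',
      measurement: ['pid_count'],
      metrics_collection_interval: 15,
    });
  }
  return procstatConfig;
}

console.log(JSON.stringify(buildProcstatConfig('https://example.com'), null, 2));
```

With `pid_count` alone, the Dashboards entry acts mainly as a liveness signal (is the process running), while the OpenSearch entry keeps the fuller CPU and I/O picture.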

 const cfnInitConfig: InitElement[] = [
   InitPackage.yum('amazon-cloudwatch-agent'),
@@ -388,6 +417,7 @@
 },
 metrics: {
   metrics_collected: {
+    procstat: procstatConfig,
     cpu: {
       measurement: [
         // eslint-disable-next-line max-len
@@ -396,7 +426,13 @@
     },
     disk: {
       measurement: [
-        'free', 'total', 'used', 'used_percent', 'inodes_free', 'inodes_used', 'inodes_total',
+        { name: 'free', unit: Unit.PERCENT },
+        { name: 'total', unit: Unit.PERCENT },
+        { name: 'used', unit: Unit.PERCENT },
+        { name: 'used_percent', unit: Unit.PERCENT },
+        { name: 'inodes_free', unit: Unit.PERCENT },
+        { name: 'inodes_used', unit: Unit.PERCENT },
+        { name: 'inodes_total', unit: Unit.PERCENT },
       ],
     },
     diskio: {
@@ -406,7 +442,16 @@
     },
     mem: {
       measurement: [
-        'active', 'available', 'available_percent', 'buffered', 'cached', 'free', 'inactive', 'total', 'used', 'used_percent',
+        { name: 'active', unit: Unit.PERCENT },
+        { name: 'available', unit: Unit.PERCENT },
+        { name: 'available_percent', unit: Unit.PERCENT },
+        { name: 'buffered', unit: Unit.PERCENT },
+        { name: 'cached', unit: Unit.PERCENT },
+        { name: 'free', unit: Unit.PERCENT },
+        { name: 'inactive', unit: Unit.PERCENT },
+        { name: 'total', unit: Unit.PERCENT },
+        { name: 'used', unit: Unit.PERCENT },
+        { name: 'used_percent', unit: Unit.PERCENT },
       ],
     },
     net: {
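
The `disk` and `mem` hunks repeat the same `{ name, unit }` literal for every measurement. Purely as a sketch of an alternative, the hypothetical helper `withUnit` below generates those entries from a list of names. Neither the helper nor the local `Unit` stand-in (used here in place of the `aws-cdk-lib/aws-cloudwatch` import) comes from this PR.

```typescript
// Local stand-in for Unit from 'aws-cdk-lib/aws-cloudwatch' (assumption for
// this sketch only, so that it runs without the CDK installed).
enum Unit {
  PERCENT = 'Percent',
}

type MeasurementDefinition = string | { name: string, rename?: string, unit?: Unit };

// Hypothetical helper: attach the same unit to every named measurement.
function withUnit(names: string[], unit: Unit): MeasurementDefinition[] {
  return names.map((name) => ({ name, unit }));
}

const disk = {
  measurement: withUnit(
    ['free', 'total', 'used', 'used_percent', 'inodes_free', 'inodes_used', 'inodes_total'],
    Unit.PERCENT,
  ),
};

// Prints the fragment as it would be serialized into the agent configuration.
console.log(JSON.stringify(disk, null, 2));
```

The explicit literals in the diff are easier to edit one measurement at a time, for example if individual metrics later need different units, so the helper is a trade-off rather than a clear improvement.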