Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: write open telemetry semantic conventions for AI metrics and span attributes #1601

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
6 changes: 6 additions & 0 deletions js/ai/src/generate/response.ts
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ import {
} from '../generate.js';
import { Message, MessageParser } from '../message.js';
import {
GenerateClientTelemetry,
GenerateRequest,
GenerateResponseData,
GenerationUsage,
Expand Down Expand Up @@ -48,6 +49,8 @@ export class GenerateResponse<O = unknown> implements ModelResponseData {
request?: GenerateRequest;
/** The parser for output parsing of this response. */
parser?: MessageParser<O>;
/** Client telemetry information associated with this request */
clientTelemetry?: GenerateClientTelemetry;

constructor(
response: GenerateResponseData,
Expand All @@ -71,6 +74,7 @@ export class GenerateResponse<O = unknown> implements ModelResponseData {
this.usage = response.usage || {};
this.custom = response.custom || {};
this.request = options?.request;
this.clientTelemetry = response.clientTelemetry;
}

private get assertMessage(): Message<O> {
Expand Down Expand Up @@ -196,9 +200,11 @@ export class GenerateResponse<O = unknown> implements ModelResponseData {
usage: this.usage,
custom: (this.custom as { toJSON?: () => any }).toJSON?.() || this.custom,
request: this.request,
clientTelemetry: this.clientTelemetry,
};
if (!out.finishMessage) delete out.finishMessage;
if (!out.request) delete out.request;
if (!out.clientTelemetry) delete out.clientTelemetry;
return out;
}
}
42 changes: 35 additions & 7 deletions js/ai/src/model.ts
Original file line number Diff line number Diff line change
Expand Up @@ -25,9 +25,11 @@ import {
} from '@genkit-ai/core';
import { Registry } from '@genkit-ai/core/registry';
import { toJsonSchema } from '@genkit-ai/core/schema';
import { SpanMetadata, spanMetadataAlsKey } from '@genkit-ai/core/tracing';
import { performance } from 'node:perf_hooks';
import { DocumentDataSchema } from './document.js';
import { augmentWithContext, validateSupport } from './model/middleware.js';
import { writeSemConvTelemetry } from './telemetry.js';

//
// IMPORTANT: Please keep type definitions in sync with
Expand Down Expand Up @@ -362,6 +364,26 @@ export const CandidateErrorSchema = z.object({
/** @deprecated All responses now return a single candidate. Only the first candidate will be used if supplied. */
export type CandidateError = z.infer<typeof CandidateErrorSchema>;

/**
* Additional telemetry information associated with a Generate request.
* See
* https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#generative-ai-client-metrics
* for details.
*/
export const GenerateClientTelemetrySchema = z.object({
system: z.string().optional(),
requestModel: z.string().optional(),
responseModel: z.string().optional(),
operationName: z.string().optional(),
serverPort: z.number().optional(),
serverAddress: z.string().optional(),
encodingFormats: z.array(z.string()).optional(),
responseId: z.string().optional(),
});
export type GenerateClientTelemetry = z.infer<
typeof GenerateClientTelemetrySchema
>;

/**
* Zod schema of a model response.
*/
Expand All @@ -375,6 +397,7 @@ export const ModelResponseSchema = z.object({
custom: z.unknown(),
raw: z.unknown(),
request: GenerateRequestSchema.optional(),
clientTelemetry: GenerateClientTelemetrySchema.optional(),
});

/**
Expand Down Expand Up @@ -491,13 +514,18 @@ export function defineModel<
(input) => {
const startTimeMs = performance.now();

return runner(input, getStreamingCallback(registry)).then((response) => {
const timedResponse = {
...response,
latencyMs: performance.now() - startTimeMs,
};
return timedResponse;
});
return runner(input, getStreamingCallback(registry)).then(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just use the await syntax here, since we're introducing it below (await writeSemConv...) so that we're consistent?

async (response) => {
const timedResponse = {
...response,
latencyMs: performance.now() - startTimeMs,
};
const span =
registry.asyncStore.getStore<SpanMetadata>(spanMetadataAlsKey);
await writeSemConvTelemetry(timedResponse, span);
return timedResponse;
}
);
}
);
Object.assign(act, {
Expand Down
120 changes: 120 additions & 0 deletions js/ai/src/telemetry.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,120 @@
/**
* Copyright 2024 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

import { MetricHistogram } from '@genkit-ai/core';
import { SpanMetadata, getTelemetryConfig } from '@genkit-ai/core/tracing';
import { AttributeValue, ValueType } from '@opentelemetry/api';
import { GenerateResponseData } from './model.js';

/**
* This metric is defined by the Generative AI semantic convention:
* https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage
*/
const tokenUsage = new MetricHistogram('gen_ai.client.token.usage', {
description: 'Usage of GenAI tokens.',
valueType: ValueType.INT,
unit: 'token',
});

/**
* This metric is defined by the Generative AI semantic convention:
* https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclientoperationduration
*/
const operationDuration = new MetricHistogram(
'gen_ai.client.operation.duration',
{
description: 'Time taken for GenAI operations',
valueType: ValueType.DOUBLE,
unit: 'token',
}
);

export async function writeSemConvTelemetry(
output: GenerateResponseData,
span?: SpanMetadata
): Promise<void> {
const telemetryConfig = await getTelemetryConfig();

if (telemetryConfig?.semConv?.writeMetrics) {
writeMetrics(output);
}
if (span && telemetryConfig?.semConv?.writeSpanAttributes) {
writeSpanAttributes(output, span);
}
}

function writeMetrics(resp: GenerateResponseData): void {
const commonDimensions = {
'gen_ai.client.framework': 'genkit',
'gen_ai.operation.name': resp.clientTelemetry?.operationName,
'gen_ai.system': resp.clientTelemetry?.system,
'gen_ai.request.model': resp.clientTelemetry?.requestModel,
'server.port': resp.clientTelemetry?.serverPort,
'gen_ai.response.model': resp.clientTelemetry?.responseModel,
'server.address': resp.clientTelemetry?.serverAddress,
};
tokenUsage.record(resp.usage?.inputTokens || 0, {
...commonDimensions,
'gen_ai.token.type': 'input',
});
tokenUsage.record(resp.usage?.outputTokens || 0, {
...commonDimensions,
'gen_ai.token.type': 'output',
});
if (resp.latencyMs) {
operationDuration.record(resp.latencyMs, commonDimensions);
}
}

function writeSpanAttributes(
output: GenerateResponseData,
span: SpanMetadata
): void {
const t: Record<string, AttributeValue> = {};
const client = output.clientTelemetry;
const config = output.request?.config;
const usage = output.usage;
setAttribute(t, 'gen_ai.client.framework', 'genkit');
setAttribute(t, 'gen_ai.operation.name', client?.operationName);
setAttribute(t, 'gen_ai.system', client?.system);
setAttribute(t, 'gen_ai.request.model', client?.requestModel);
setAttribute(t, 'server.port', client?.serverPort);
setAttribute(t, 'gen_ai.request.encoding_formats', client?.encodingFormats);
setAttribute(t, 'gen_ai.request.frequency_penalty', config?.frequencyPenalty);
setAttribute(t, 'gen_ai.request.max_tokens', config?.maxOutputTokens);
setAttribute(t, 'gen_ai.request.presence_penalty', config?.presencePenalty);
setAttribute(t, 'gen_ai.request.stop_sequences', config?.stopSequences);
setAttribute(t, 'gen_ai.request.temperature', config?.temperature);
setAttribute(t, 'gen_ai.request.top_k', config?.topK);
setAttribute(t, 'gen_ai.request.top_p', config?.topP);
setAttribute(t, 'gen_ai.response.finish_reasons', [output.finishReason]);
setAttribute(t, 'gen_ai.response.id', client?.responseId);
setAttribute(t, 'gen_ai.response.model', client?.responseModel);
setAttribute(t, 'gen_ai.usage.input_tokens', usage?.inputTokens);
setAttribute(t, 'gen_ai.usage.output_tokens', usage?.outputTokens);
setAttribute(t, 'server.address', client?.serverAddress);
span.telemetry = t;
}

function setAttribute(
attrs: Record<string, AttributeValue>,
key: string,
attribute?: AttributeValue
) {
if (attribute) {
attrs[key] = attribute!;
}
}
1 change: 1 addition & 0 deletions js/core/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@ export {
type FlowConfig,
type FlowFn,
} from './flow.js';
export * from './metrics.js';
export * from './plugin.js';
export * from './reflection.js';
export { defineJsonSchema, defineSchema, type JSONSchema } from './schema.js';
Expand Down
105 changes: 105 additions & 0 deletions js/core/src/metrics.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,105 @@
/**
* Copyright 2024 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

import { Counter, Histogram, Meter, metrics } from '@opentelemetry/api';

type MetricCreateFn<T> = (meter: Meter) => T;
export const METER_NAME = 'genkit';

/**
* Wrapper for OpenTelemetry metrics.
*
* The OpenTelemetry {MeterProvider} can only be accessed through the metrics
* API after the NodeSDK library has been initialized. To prevent race
* conditions we defer the instantiation of the metric to when it is first
* ticked.
*
* Note: This metric should only be used for writing metrics adhering to the
* OpenTelemetry Generative AI Semantic Conventions. Any other metric should
* be written by an appropriate plugin, eg. google-cloud plugin.
*
* https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/
*/
export class Metric<T> {
readonly createFn: MetricCreateFn<T>;
readonly meterName: string;
metric?: T;

constructor(createFn: MetricCreateFn<T>, meterName: string = METER_NAME) {
this.meterName = meterName;
this.createFn = createFn;
}

get(): T {
if (!this.metric) {
this.metric = this.createFn(
metrics.getMeterProvider().getMeter(this.meterName)
);
}

return this.metric;
}
}

/**
* Wrapper for an OpenTelemetry Counter.
*
* By using this wrapper, we defer initialization of the counter until it is
* need, which ensures that the OpenTelemetry SDK has been initialized before
* the metric has been defined.
*
* Note: This counter should only be used for writing metrics adhering to the
* OpenTelemetry Generative AI Semantic Conventions. Any other metric should
* be written by an appropriate plugin, eg. google-cloud plugin.
*
* https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/
*/
export class MetricCounter extends Metric<Counter> {
constructor(name: string, options: any) {
super((meter) => meter.createCounter(name, options));
}

add(val?: number, opts?: any) {
if (val) {
this.get().add(val, opts);
}
}
}

/**
* Wrapper for an OpenTelemetry Histogram.
*
* By using this wrapper, we defer initialization of the counter until it is
* need, which ensures that the OpenTelemetry SDK has been initialized before
* the metric has been defined.
*
* Note: This histogram should only be used for writing metrics adhering to the
* OpenTelemetry Generative AI Semantic Conventions. Any other metric should
* be written by an appropriate plugin, eg. google-cloud plugin.
*
* https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/
*/
export class MetricHistogram extends Metric<Histogram> {
constructor(name: string, options: any) {
super((meter) => meter.createHistogram(name, options));
}

record(val?: number, opts?: any) {
if (val) {
this.get().record(val, opts);
}
}
}
22 changes: 19 additions & 3 deletions js/core/src/telemetryTypes.ts
Original file line number Diff line number Diff line change
Expand Up @@ -16,9 +16,25 @@

import { NodeSDKConfiguration } from '@opentelemetry/sdk-node';

/**
* Options governing whether Genkit will write telemetry data following the
* OpenTelemetry Semantic Conventions for Generative AI systems:
* https://opentelemetry.io/docs/specs/semconv/gen-ai/
*/
export interface SemConvOptions {
writeMetrics: boolean;
writeSpanAttributes: boolean;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case for turning this on and off at this level?

I am wondering if we should always write metrics, and then let the client/exporter control whether or not the metric actually gets sent anywhere. I think this is how could work? But correct me if I am off base.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple other thoughts...

  1. Semantic Convention is a generic term, there are many conventions for many things (e.g. HTTP) so if this is a flag we want, it should be specific as to what convention it refers to.
  2. I imagine we should have a generic way of allow-listing, deny-listing metrics that are exported, versus a metric specific flag.


/** Global options governing how Genkit will write telemetry data. */
export interface TelemetryOptions {
semConv?: SemConvOptions;
}

/**
* Provides a {NodeSDKConfiguration} configuration for use with the
* Open-Telemetry SDK. This configuration allows plugins to specify how and
* where open telemetry data will be exported.
* Open-Telemetry SDK and other configuration options. This configuration
* allows plugins to specify how and where open telemetry data will be
* exported.
*/
export type TelemetryConfig = Partial<NodeSDKConfiguration>;
export type TelemetryConfig = Partial<NodeSDKConfiguration> & TelemetryOptions;
8 changes: 8 additions & 0 deletions js/core/src/tracing.ts
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,14 @@ export async function ensureBasicTelemetryInstrumentation() {
await enableTelemetry({});
}

/**
* Gets the global telemetry configuration object.
*/
export async function getTelemetryConfig(): Promise<TelemetryConfig> {
await ensureBasicTelemetryInstrumentation();
return await global[instrumentationKey];
}

/**
* Enables tracing and metrics open telemetry configuration.
*/
Expand Down
Loading
Loading