diff --git a/docs/gen-ai/llm-spans.md b/docs/gen-ai/llm-spans.md index bf831a4c9f..6ad95c08c9 100644 --- a/docs/gen-ai/llm-spans.md +++ b/docs/gen-ai/llm-spans.md @@ -7,6 +7,7 @@ linkTitle: LLM requests **Status**: [Experimental][DocumentStatus] + - [Configuration](#configuration) - [LLM Request attributes](#llm-request-attributes) @@ -20,6 +21,8 @@ linkTitle: LLM requests - [Chat response](#chat-response) - [`Message` object](#message-object) - [Examples](#examples) + - [Chat completion](#chat-completion) + - [Tools](#tools) @@ -46,7 +49,7 @@ This is for three primary reasons: These attributes track input data and metadata for a request to an LLM. Each attribute represents a concept that is common to most LLMs. - + | Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | |---|---|---|---|---|---| | [`gen_ai.request.model`](../attributes-registry/llm.md) | string | The name of the LLM a request is being made to. [1] | `gpt-4` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | @@ -54,8 +57,9 @@ These attributes track input data and metadata for a request to an LLM. Each att | [`gen_ai.request.max_tokens`](../attributes-registry/llm.md) | int | The maximum number of tokens the LLM generates for a request. | `100` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.request.temperature`](../attributes-registry/llm.md) | double | The temperature setting for the LLM request. | `0.0` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.request.top_p`](../attributes-registry/llm.md) | double | The top_p sampling setting for the LLM request. 
| `1.0` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`gen_ai.response.finish_reason`](../attributes-registry/llm.md) | string | The reason the model stopped generating tokens. [3] | `stop`; `content_filter`; `tool_calls` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.response.id`](../attributes-registry/llm.md) | string | The unique identifier for the completion. | `chatcmpl-123` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`gen_ai.response.model`](../attributes-registry/llm.md) | string | The name of the LLM a response was generated from. [3] | `gpt-4-0613` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`gen_ai.response.model`](../attributes-registry/llm.md) | string | The name of the LLM a response was generated from. [4] | `gpt-4-0613` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.usage.completion_tokens`](../attributes-registry/llm.md) | int | The number of tokens used in the LLM response (completion). | `180` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.usage.prompt_tokens`](../attributes-registry/llm.md) | int | The number of tokens used in the LLM prompt. | `100` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | @@ -63,7 +67,15 @@ These attributes track input data and metadata for a request to an LLM. Each att **[2]:** If not using a vendor-supplied model, provide a custom friendly name, such as a name of the company or project. If the instrumetnation reports any attributes specific to a custom model, the value provided in the `gen_ai.system` SHOULD match the custom attribute namespace segment. For example, if `gen_ai.system` is set to `the_best_llm`, custom attributes should be added in the `gen_ai.the_best_llm.*` namespace. 
If none of above options apply, the instrumentation should set `_OTHER`. -**[3]:** If available. The name of the LLM serving a response. If the LLM is supplied by a vendor, then the value must be the exact name of the model actually used. If the LLM is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned. +**[3]:** If there is more than one finish reason in the response, the last one should be reported. + +**[4]:** If available. The name of the LLM serving a response. If the LLM is supplied by a vendor, then the value must be the exact name of the model actually used. If the LLM is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned. + +`gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `openai` | OpenAI | ![Experimental](https://img.shields.io/badge/-experimental-blue) | ## Events @@ -76,7 +88,7 @@ It's RECOMMENDED to use [Event API](https://github.com/open-telemetry/openteleme If, however Event API is not supported yet, events SHOULD be recorded as span events. -TODO: There will be a standard mapping between Span Events and Events - https://github.com/open-telemetry/semantic-conventions/pull/954. Add link once merged. +TODO: There [will be](https://github.com/open-telemetry/semantic-conventions/pull/954) a standard mapping between Span Events and Events. Add link once merged. The event payload describes message content sent to or received from GenAI and depends on specific messages described in the following sections. @@ -88,8 +100,18 @@ Telemetry consumers SHOULD expect to receive unknown payload fields. This event describes the instructions passed to the GenAI model. - + The event name MUST be `gen_ai.system.message`. 
+ +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`gen_ai.system`](../attributes-registry/llm.md) | string | The name of the LLM foundation model vendor. | `openai` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + +`gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `openai` | OpenAI | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | Body Field | Type | Description | Examples | Requirement Level | Sensitive | @@ -101,8 +123,18 @@ The event name MUST be `gen_ai.system.message`. This event describes the prompt message specified by the user. - + The event name MUST be `gen_ai.user.message`. + +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`gen_ai.system`](../attributes-registry/llm.md) | string | The name of the LLM foundation model vendor. | `openai` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + +`gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `openai` | OpenAI | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | Body Field | Type | Description | Examples | Requirement Level | Sensitive | @@ -114,15 +146,25 @@ The event name MUST be `gen_ai.user.message`. This event describes the assistant message. - + The event name MUST be `gen_ai.assistant.message`. 
+ +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`gen_ai.system`](../attributes-registry/llm.md) | string | The name of the LLM foundation model vendor. | `openai` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + +`gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `openai` | OpenAI | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | Body Field | Type | Description | Examples | Requirement Level | Sensitive | |---|---|---|---|---|---| | `role` | string | The role of the messages author | `assistant` | `Required` | | | `content` | `AnyValue` | The contents of the assistant message. | `Spans, events, metrics defined by the GenAI semantic conventions.` | `Opt-In` | ![Sensitive](https://img.shields.io/badge/-sensitive-red) | -| `tool_calls` | [ToolCall](#toolcall-object)[] | The tool calls generated by the model, such as function calls. | `[{"id":"call_mszuSIzqtI65i1wAUOE8w5H4", "function":{"name":"get_link_to_otel_semconv", "arguments":"{\"type\":\"gen_ai\"}"}, "type":"function"}]` | `Conditionally Required: if available` | ![Mixed](https://img.shields.io/badge/-mixed-orange) | +| `tool_calls` | [ToolCall](#toolcall-object)[] | The tool calls generated by the model, such as function calls. | `[{"id":"call_mszuSIzqtI65i1wAUOE8w5H4", "function":{"name":"get_link_to_otel_semconv", "arguments":"{\"semconv\":\"gen_ai\"}"}, "type":"function"}]` | `Conditionally Required: if available` | ![Mixed](https://img.shields.io/badge/-mixed-orange) | #### `ToolCall` object @@ -130,21 +172,31 @@ The event name MUST be `gen_ai.assistant.message`. 
|---|---|---|---|---|---| | `id` | string | The id of the tool call | `call_mszuSIzqtI65i1wAUOE8w5H4` | `Required` | | | | `type` | string | The type of the tool | `function` | `Required` | | -| `function` | [Function](#function) | Function name and arguments | `Required` | ![Mixed](https://img.shields.io/badge/-mixed-orange) | +| `function` | [Function](#function-object) | Function name and arguments | | `Required` | ![Mixed](https://img.shields.io/badge/-mixed-orange) | #### `Function` object | Field | Type | Description | Examples | Requirement Level | Sensitive | |---|---|---|---|---|---| | `name` | string | The name of the function to call | `get_link_to_otel_semconv` | `Required` | | -| `arguments` | `AnyValue` | The arguments to pass the the function | `{"gen_ai_system": "OpenAI"}` | `Opt-In` | ![Sensitive](https://img.shields.io/badge/-sensitive-red) | +| `arguments` | `AnyValue` | The arguments to pass to the function | `{"semconv": "gen_ai"}` | `Opt-In` | ![Sensitive](https://img.shields.io/badge/-sensitive-red) | ### Tool message This event describes the output of the tool or function submitted back to the model. - + The event name MUST be `gen_ai.tool.message`. + +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`gen_ai.system`](../attributes-registry/llm.md) | string | The name of the LLM foundation model vendor. | `openai` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + +`gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. 
+ +| Value | Description | Stability | +|---|---|---| +| `openai` | OpenAI | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | Body Field | Type | Description | Examples | Requirement Level | Sensitive | @@ -159,12 +211,18 @@ This event describes the model-generated chat response message (choice). If GenAI model returns multiple choices, each of the message SHOULD be recorded as an individual event. - -The event name MUST be `gen_ai.choice.message`. + +The event name MUST be `gen_ai.choice`. | Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | |---|---|---|---|---|---| -| [`gen_ai.response.finish_reason`](../attributes-registry/llm.md) | string | The reason the model stopped generating tokens. | `stop`; `content_filter`; `tool_calls` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`gen_ai.system`](../attributes-registry/llm.md) | string | The name of the LLM foundation model vendor. | `openai` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + +`gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `openai` | OpenAI | ![Experimental](https://img.shields.io/badge/-experimental-blue) | When response is streamed, instrumentations that report response events MUST reconstruct and report the full message and MUST NOT report individual chunks as events. 
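Reviewer sketch for the streaming rule in the hunk above (report the full message, never individual chunks). This is a minimal illustration, not part of the convention: the chunk shape with `index`, `delta`, and `finish_reason` keys and the helper name `reconstruct_choices` are assumptions made for this example, loosely modelled on streamed chat responses.

```python
import json
from typing import Iterable


def reconstruct_choices(chunks: Iterable[dict]) -> list[dict]:
    """Merge streamed chunks into one full `gen_ai.choice` body per choice index.

    The chunk keys used here (`index`, `delta`, `finish_reason`) are hypothetical.
    """
    choices: dict[int, dict] = {}
    for chunk in chunks:
        choice = choices.setdefault(
            chunk["index"],
            {
                "index": chunk["index"],
                "finish_reason": None,
                "message": {"role": "assistant", "content": ""},
            },
        )
        # Append the partial text carried by this chunk, if any.
        choice["message"]["content"] += chunk.get("delta", "")
        # Only the final chunk of a choice is expected to carry a finish reason.
        if chunk.get("finish_reason") is not None:
            choice["finish_reason"] = chunk["finish_reason"]
    # One event body per choice, ordered by choice index.
    return [choices[i] for i in sorted(choices)]


chunks = [
    {"index": 0, "delta": "Follow GenAI "},
    {"index": 0, "delta": "semantic conventions."},
    {"index": 0, "delta": "", "finish_reason": "stop"},
]
for body in reconstruct_choices(chunks):
    # Each body would be emitted as a single `gen_ai.choice` event.
    print(json.dumps(body))
```

Buffering per choice index keeps multiple streamed choices separate, which matches the requirement that each choice is recorded as an individual event.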
@@ -189,37 +247,149 @@ The message structure matches one of the messages defined in this document depen ## Examples +### Chat completion + +This example covers the following scenario: + +- user requests chat completion from OpenAI GPT-4 model for the following prompt: + - System message: `You're a friendly bot that answers questions about OpenTelemetry.` + - User message: `How to instrument GenAI library with OTel?` +- The model responds with `"Follow GenAI semantic conventions available at opentelemetry.io."` message + +Span: + +| Attribute name | Value | +|---------------------|-------------------------------------------------------| +| Span name | `"chat.completion gpt-4"` | +| `gen_ai.system` | `"openai"` | +| `gen_ai.request.model`| `"gpt-4"` | +| `gen_ai.request.max_tokens`| `200` | +| `gen_ai.request.top_p`| `1.0` | +| `gen_ai.response.id`| `"chatcmpl-9J3uIL87gldCFtiIbyaOvTeYBRA3l"` | +| `gen_ai.response.model`| `"gpt-4-0613"` | +| `gen_ai.usage.completion_tokens`| `47` | +| `gen_ai.usage.prompt_tokens`| `52` | +| `gen_ai.response.finish_reason`| `"stop"` | + +Events: + +1. `gen_ai.system.message` + + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event payload (full) | `{"role":"system","content":"You're a friendly bot that answers questions about OpenTelemetry."}` | + | Event payload (without sensitive content) | `{"role":"system", "content":"REDACTED"}` | + +2. `gen_ai.user.message` + + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event payload (full) | `{"role":"user","content":"How to instrument GenAI library with OTel?"}` | + | Event payload (without sensitive content) | `{"role":"user", "content":"REDACTED"}` | + +3. 
`gen_ai.choice` + + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event payload (full) | `{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Follow GenAI semantic conventions available at opentelemetry.io."}}` | + | Event payload (without sensitive content) | `{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"REDACTED"}}` | + +### Tools + +This example covers the following scenario: + +1. Application requests chat completion from OpenAI GPT-4 model and provides a function definition. + + - Application provides the following prompt: + + - User message: `How to instrument GenAI library with OTel?` + + - Application defines a tool (a function) named `get_link_to_otel_semconv` with a single string argument named `semconv` + +2. The model responds with a tool call request which the application executes +3. The application requests chat completion again, now with the tool execution result + +Here's the telemetry generated for each step in this scenario: + +1. Chat completion resulting in a tool call. 
+ + | Attribute name | Value | + |---------------------|-------------------------------------------------------| + | Span name | `"chat.completion gpt-4"` | + | `gen_ai.system` | `"openai"` | + | `gen_ai.request.model`| `"gpt-4"` | + | `gen_ai.request.max_tokens`| `200` | + | `gen_ai.request.top_p`| `1.0` | + | `gen_ai.response.id`| `"chatcmpl-9J3uIL87gldCFtiIbyaOvTeYBRA3l"` | + | `gen_ai.response.model`| `"gpt-4-0613"` | + | `gen_ai.usage.completion_tokens`| `17` | + | `gen_ai.usage.prompt_tokens`| `47` | + | `gen_ai.response.finish_reason`| `"tool_calls"` | + + Events parented to this span: + + - `gen_ai.user.message` + + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event payload | `{"role":"user","content":"How to instrument GenAI library with OTel?"}` | + + - `gen_ai.choice` + + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event payload | `{"index":0,"finish_reason":"tool_calls","message":{"role":"assistant","tool_calls":[{"id":"call_VSPygqKTWdrhaFErNvMV18Yl","function":{"name":"get_link_to_otel_semconv","arguments":"{\"semconv\":\"GenAI\"}"},"type":"function"}]}}` | + +2. The application executes the tool call. The application may create a span, which is not covered by this semantic convention. +3. 
Final chat completion call + + | Attribute name | Value | + |---------------------|-------------------------------------------------------| + | Span name | `"chat.completion gpt-4"` | + | `gen_ai.system` | `"openai"` | + | `gen_ai.request.model`| `"gpt-4"` | + | `gen_ai.request.max_tokens`| `200` | + | `gen_ai.request.top_p`| `1.0` | + | `gen_ai.response.id`| `"chatcmpl-call_VSPygqKTWdrhaFErNvMV18Yl"` | + | `gen_ai.response.model`| `"gpt-4-0613"` | + | `gen_ai.usage.completion_tokens`| `52` | + | `gen_ai.usage.prompt_tokens`| `47` | + | `gen_ai.response.finish_reason`| `"stop"` | -```json -{"role":"system","content":"You're a friendly bot that helps use OpenTelemetry.","name":"bot"} -``` + Events parented to this span: + (in this example, the event content matches the original messages, but applications may also drop messages or change their content) -```json -{"role":"user","content":"What telemetry is reported by OpenAI instrumentations?"} -``` + - `gen_ai.user.message` -```json -{"role":"assistant","content":"Spans, events, metrics that follow GenAI semantic conventions."} -``` + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event payload | `{"role":"user","content":"How to instrument GenAI library with OTel?"}` | -```json -{"role":"assistant","tool_calls":[{"id":"call_hHM72v9f1JprJBStycQC4Svz","function":{"name":"get_link_to_otel_semconv","arguments":"{\"gen_ai_system\": \"OpenAI\"}"},"type":"function"}]} -``` + - `gen_ai.assistant.message` -Examples of serialized event payload that can be passed in `event.data` attribute: + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event payload | `{"role":"assistant","tool_calls":[{"id":"call_VSPygqKTWdrhaFErNvMV18Yl","function":{"name":"get_link_to_otel_semconv","arguments":"{\"semconv\":\"GenAI\"}"},"type":"function"}]}` | 
-```json -{"role":"tool","content":"OpenAI Semantic conventions are available at opentelemetry.io","tool_call_id":"call_BC9hyMlI7if1ZMIH8l1R26Lo"} -``` + - `gen_ai.tool.message` -```json -{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"The OpenAI semantic conventions are available at opentelemetry.io"}} -``` + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event payload | `{"role":"tool","content":"opentelemetry.io/semconv/gen-ai","tool_call_id":"call_VSPygqKTWdrhaFErNvMV18Yl"}` | -or + - `gen_ai.choice` -```json -{"index":0,"finish_reason":"content_filter","content_filter_results":{"protected_material_text":{"detected":true,"filtered":true}}} -``` + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event payload | `{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Follow OTel semconv available at opentelemetry.io/semconv/gen-ai"}}` | [DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/tree/v1.31.0/specification/document-status.md diff --git a/model/trace/gen-ai.yaml b/model/trace/gen-ai.yaml index 02b6010936..5c78821e8f 100644 --- a/model/trace/gen-ai.yaml +++ b/model/trace/gen-ai.yaml @@ -35,12 +35,16 @@ groups: requirement_level: recommended - ref: gen_ai.usage.completion_tokens requirement_level: recommended + - ref: gen_ai.response.finish_reason + note: > + If there is more than one finish reason in the response, the last one should be reported. 
+ requirement_level: recommended events: - gen_ai.system.message - gen_ai.user.message - gen_ai.assistant.message - gen_ai.tool.message - - gen_ai.response.message + - gen_ai.choice - id: gen_ai.common.event.attributes type: attribute_group @@ -53,7 +57,7 @@ groups: name: gen_ai.system.message type: event brief: > - This event describes the instructions passed to the Gen AI system inside the prompt. + This event describes the instructions passed to the GenAI system inside the prompt. extends: gen_ai.common.event.attributes - id: gen_ai.user.message @@ -67,7 +71,7 @@ groups: name: gen_ai.assistant.message type: event brief: > - This event describes the assistant message when it's passed in the prompt. + This event describes the assistant message passed to GenAI system or received from it. extends: gen_ai.common.event.attributes - id: gen_ai.tool.message @@ -77,11 +81,10 @@ groups: This event describes the tool or function response message. extends: gen_ai.common.event.attributes - - id: gen_ai.response.message - name: gen_ai.response.message + - id: gen_ai.choice + name: gen_ai.choice type: event brief: > This event describes the Gen AI response message. extends: gen_ai.common.event.attributes - attributes: - - ref: gen_ai.response.finish_reason +
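Reviewer sketch for the `gen_ai.response.finish_reason` note added in the YAML above ("the last one should be reported"). This is only an illustration under stated assumptions: the helper name `span_finish_reason` is hypothetical, and the input is assumed to be a list of choice bodies shaped like the `gen_ai.choice` payloads in the docs examples.

```python
from typing import Optional


def span_finish_reason(choices: list[dict]) -> Optional[str]:
    """Pick the span-level finish reason: the last one present in the response.

    `choices` is assumed to be ordered as returned by the model; choices
    without a finish reason are skipped.
    """
    reasons = [c["finish_reason"] for c in choices if c.get("finish_reason")]
    return reasons[-1] if reasons else None


choices = [
    {"index": 0, "finish_reason": "stop"},
    {"index": 1, "finish_reason": "content_filter"},
]
# The value returned here would be set as `gen_ai.response.finish_reason` on the span.
print(span_finish_reason(choices))
```

Each `gen_ai.choice` event body still carries its own `finish_reason`; only the span attribute collapses to a single value.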