Incompatibilities with OpenTelemetry LLM semantics pending release #26

codefromthecrypt · 2024-07-24T06:10:43Z

I work on the OpenTelemetry LLM semantics SIG, and did an evaluation of the SDK based on the following sample code and what the semantics pending release 1.27.0 will define.

Note: I'm doing this unsolicited on all the various python instrumentation for openai, so this is not a specific call out that AGIFlow is notably different here. I wanted to warn you about some drift and ideally you'll be in a position to adjust once the release occurs, or clarify if that's not a goal. I would welcome you to join the #otel-llm-semconv-wg slack and any SIG meetings if you find this relevant!

Sample code

import os
from agiflow import Agiflow
from openai import OpenAI
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# Initialize otel exporter and AGIFlow instrumentation
app_name = "agiflow-python-ollama"
otlp_endpoint = os.getenv("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT", "http://localhost:4318/v1/traces")
otlp_exporter = OTLPSpanExporter(endpoint=otlp_endpoint)
Agiflow.init(app_name=app_name, exporter=otlp_exporter)

def main():
    ollama_host = os.getenv('OLLAMA_HOST', 'localhost')
    # Use the OpenAI endpoint, not the Ollama API.
    base_url = 'http://' + ollama_host + ':11434/v1'
    client = OpenAI(base_url=base_url, api_key='unused')
    messages = [
      {
        'role': 'user',
        'content': '<|fim_prefix|>def hello_world():<|fim_suffix|><|fim_middle|>',
      },
    ]
    chat_completion = client.chat.completions.create(model='codegemma:2b-code', messages=messages)
    print(chat_completion.choices[0].message.content)

if __name__ == "__main__":
    main()

Evaluation

Semantic evaluation on spans.

compatible:

kind=Client

missing:

attributes['gen_ai.operation.name']='chat'
attributes['gen_ai.system']='ollama'

incompatible:

name=Completions (should be 'chat codegemma:2b-code')
attributes['llm.type']='Chat' (should be 'gen_ai.operation.name' and lowercase)
attributes['llm.prompts']='[{"role": "user", "content": "<|fim_prefix|>def hello_world():<|fim_suffix|><|fim_middle|>"}]' (should be the event attribute 'gen_ai.prompt')
attributes['llm.model']='codegemma:2b-code' (should be 'gen_ai.request.model')
attributes['llm.responses']='[{"content": "print(\"Hello, world!\")", "role": "assistant"}]' (should be the event attribute 'gen_ai.completion')
attributes['llm.token.counts']='{"prompt_tokens": 24, "completion_tokens": 12, "total_tokens": 36}' (should be split into 'gen_ai.usage.input_tokens' and 'gen_ai.usage.output_tokens')
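
As a rough illustration of the renames listed above, here is a minimal sketch that rewrites a plain dict of span attributes toward the gen_ai.* layout. The function name to_gen_ai and the dict-based span representation are assumptions made for this example, not part of agiflow-sdk or the OpenTelemetry API. Prompts and responses are left untouched because the list above places them on events rather than span attributes.

```python
import json

# Straight renames taken from the "incompatible" list above.
RENAMED = {"llm.model": "gen_ai.request.model"}


def to_gen_ai(name, attributes):
    """Return (span_name, attributes) rewritten toward the gen_ai.* layout.

    Hypothetical helper: `attributes` is a plain dict, not an SDK span.
    """
    out = {}
    for key, value in attributes.items():
        if key == "llm.type":
            # 'Chat' becomes gen_ai.operation.name='chat' (lowercase).
            out["gen_ai.operation.name"] = value.lower()
        elif key == "llm.token.counts":
            # The JSON blob is split into two numeric attributes.
            counts = json.loads(value)
            out["gen_ai.usage.input_tokens"] = counts["prompt_tokens"]
            out["gen_ai.usage.output_tokens"] = counts["completion_tokens"]
        elif key in RENAMED:
            out[RENAMED[key]] = value
        else:
            # Anything else (vendor keys, url.full, prompts) passes through.
            out[key] = value

    operation = out.get("gen_ai.operation.name")
    model = out.get("gen_ai.request.model")
    if operation and model:
        # Span name becomes '<operation> <model>'.
        name = f"{operation} {model}"
    return name, out
```

Called with the span name 'Completions' and the attributes from the collector log, this would yield the name 'chat codegemma:2b-code'. Note that 'gen_ai.system' is still missing afterwards, since nothing in the legacy attributes carries it.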

not yet defined in the standard:

attributes['openai.api_base']='http://localhost:11434/v1/'
attributes['llm.api']='/chat/completions'
attributes['llm.system.fingerprint']='fp_ollama'

defined by other semantics:

attributes['url.full']='http://localhost:11434/v1/'

vendor specific:

attributes['agiflow.sdk.name']='agiflow-python-sdk'
attributes['agiflow.sdk.version']='0.0.23'
attributes['agiflow.service.name']='OpenAI'
attributes['agiflow.service.type']='LLM'
attributes['agiflow.service.version']='1.37.0'

Semantic evaluation on metrics:

N/A as no metrics are currently recorded

Example collector log

otel-collector      | 2024-07-24T05:08:59.563Z  info    TracesExporter  {"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 1, "spans": 1}
otel-collector      | 2024-07-24T05:08:59.563Z  info    ResourceSpans #0
otel-collector      | Resource SchemaURL: 
otel-collector      | Resource attributes:
otel-collector      |      -> service.name: Str(agiflow-python-ollama)
otel-collector      |      -> service.version: Str()
otel-collector      |      -> telemetry.sdk.name: Str(AGIFlow)
otel-collector      |      -> telemetry.sdk.version: Str(0.0.23)
otel-collector      | ScopeSpans #0
otel-collector      | ScopeSpans SchemaURL: 
otel-collector      | InstrumentationScope agiflow.opentelemetry.instrumentation.openai.instrumentation 0.0.23
otel-collector      | Span #0
otel-collector      |     Trace ID       : 3d39854a707e30493a3400ba75d0cfc0
otel-collector      |     Parent ID      : 
otel-collector      |     ID             : de037108c788013f
otel-collector      |     Name           : Completions
otel-collector      |     Kind           : Client
otel-collector      |     Start time     : 2024-07-24 05:08:58.820558 +0000 UTC
otel-collector      |     End time       : 2024-07-24 05:08:59.4852 +0000 UTC
otel-collector      |     Status code    : Ok
otel-collector      |     Status message : 
otel-collector      | Attributes:
otel-collector      |      -> agiflow.sdk.name: Str(agiflow-python-sdk)
otel-collector      |      -> agiflow.sdk.version: Str(0.0.23)
otel-collector      |      -> agiflow.service.name: Str(OpenAI)
otel-collector      |      -> agiflow.service.type: Str(LLM)
otel-collector      |      -> agiflow.service.version: Str(1.37.0)
otel-collector      |      -> openai.api_base: Str(http://localhost:11434/v1/)
otel-collector      |      -> url.full: Str(http://localhost:11434/v1/)
otel-collector      |      -> llm.api: Str(/chat/completions)
otel-collector      |      -> llm.type: Str(Chat)
otel-collector      |      -> llm.prompts: Str([{"role": "user", "content": "<|fim_prefix|>def hello_world():<|fim_suffix|><|fim_middle|>"}])
otel-collector      |      -> llm.model: Str(codegemma:2b-code)
otel-collector      |      -> llm.responses: Str([{"content": "print(\"Hello, world!\")", "role": "assistant"}])
otel-collector      |      -> llm.system.fingerprint: Str(fp_ollama)
otel-collector      |      -> llm.token.counts: Str({"prompt_tokens": 24, "completion_tokens": 12, "total_tokens": 36})
otel-collector      |   {"kind": "exporter", "data_type": "traces", "name": "debug"}
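
For anyone wanting to repeat the check from pasted collector output, the sketch below pulls the '-> key: Type(value)' lines out of a debug exporter log like the one above and reports which of the two attributes listed as missing are absent. The helper names are hypothetical, and the regular expression is only an assumption based on the log lines shown here. It does not distinguish resource attributes from span attributes.

```python
import re

# Matches lines such as: "otel-collector | -> llm.model: Str(codegemma:2b-code)"
ATTR_LINE = re.compile(r"->\s+(?P<key>[\w.]+): (?P<type>\w+)\((?P<value>.*)\)\s*$")

# The two attributes listed under "missing" in the evaluation.
REQUIRED = ("gen_ai.operation.name", "gen_ai.system")


def parse_attributes(log_text):
    """Collect every '-> key: Type(value)' line into a dict of strings."""
    attributes = {}
    for line in log_text.splitlines():
        match = ATTR_LINE.search(line)
        if match:
            attributes[match.group("key")] = match.group("value")
    return attributes


def missing_required(attributes):
    """Return the required keys that are not present, in a stable order."""
    return [key for key in REQUIRED if key not in attributes]
```

Running parse_attributes over the log above and passing the result to missing_required would list both keys, which matches the "missing" section of the evaluation.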


vuongngo · 2024-07-24T12:27:18Z

Thanks @codefromthecrypt, I really appreciate you taking the time to evaluate agiflow-sdk. It's definitely our goal to keep the telemetry aligned with the standard. Thanks for pointing us in the right direction; we'll get the next few releases aligned with the semantic conventions release.

vuongngo · 2024-08-02T08:49:05Z

Hi @codefromthecrypt, I've created a PR that fixes the incompatibility with the GenAI semconv. Would you mind giving this branch a quick test, or letting me know how to run the check myself to make it easier?
Also, I'm a bit confused about which identifier should be given to gen_ai.system: is it bound to the vendor name or the library name? And should it be added to the API span only?

codefromthecrypt · 2024-08-04T08:01:05Z

sorry about missing this

for the span, this is the logical span representing, say, an openai call. Ack that there is an http call underneath the openai library abstraction. Right now, I didn't notice any subspans. So, basically, it is the span representing the library call. If you recently also added an http child span, that's cool; the spec is just about the application layer one.

for gen_ai.system, the docs currently have this (I'll add ollama at some point soon)

gen_ai.system has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

Value	Description
anthropic	Anthropic
cohere	Cohere
openai	OpenAI
vertex_ai	Vertex AI
--
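
To make the rule quoted above concrete, here is a small sketch that normalizes a vendor name and reports whether it is one of the well-known values. The function name resolve_gen_ai_system and the normalization (lowercase, spaces to underscores) are assumptions for illustration; the set contains only the four values in the table above.

```python
# Well-known values copied from the table above; the list may grow over time.
WELL_KNOWN_SYSTEMS = {"anthropic", "cohere", "openai", "vertex_ai"}


def resolve_gen_ai_system(vendor: str) -> tuple[str, bool]:
    """Return (value, is_well_known) for a vendor hint.

    If the normalized hint is well known, that value must be used as-is;
    otherwise the normalized hint is returned as a custom value.
    """
    value = vendor.strip().lower().replace(" ", "_")
    return value, value in WELL_KNOWN_SYSTEMS
```

With this sketch, "Vertex AI" resolves to the well-known value 'vertex_ai', while 'ollama' is returned as a custom value until it is added to the list.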

For testing I've been using the code above pasted into the description. As I'm not sure how to add a pip dep on a branch, you could either run the code and paste collector output, or tell me how to use your branch. I use this pipfile

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
agiflow-sdk = "*"
openai = "*"

[dev-packages]

[requires]
python_version = "3.12"

vuongngo · 2024-08-04T22:54:22Z

Thanks @codefromthecrypt , that sounds great to me.

Automatic http tracing is currently not supported, as we were getting lots of empty traces on Azure Functions. For now we support automatic traces only on LLM libraries; customers can add http traces via the extra_instrumentations argument when initializing the library.

I've added a sample app to app/agiflow-sdk-samples with the script you provided. Hope that helps!

Also, having noticed the working group messages, I think for now we will leave prompt/completion captured on gen_ai.prompt and gen_ai.completion, and will add events support in another release.

codefromthecrypt · 2024-08-05T00:49:28Z

thanks I'll follow-up more here, but you may want to look at open-telemetry/semantic-conventions#1315 (comment) I don't remember if you had already. Possibly you can comment your experience on this topic even if it doesn't end up being about internal vs client span kind

codefromthecrypt · 2024-08-05T04:39:31Z

Thanks and personally done here until something else. Your PR is very close.

made a comment in that PR, noting your exception on the span events -> attributes part. this isn't different than openllmetry who also don't follow span events at the moment, except choices of how to represent attributes

This mainly would impact backend portability as when key data is done differently, it is hard for folks to make portable visualization or analysis tools. Event api could land soon, but it also could be a very long way away, and even longer for all backends to use it. So, basically this lack of portability will last at least that long.

Knowing this, I would expect that regardless of what the spec says, even when the (log) event api exists, instrumentation might have a config toggle to use span events. I would bet $5 but not the house ;)

Anyway, between now and then, those really wanting to normalize on this point could rewrite the data in a custom exporter or in the collector, maybe transformprocessor, knowing the data layout basically.
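
As a rough sketch of that kind of rewrite, the function below moves prompt and completion content from span attributes onto span events. It works on a plain dict rather than a real SDK span, so the shape of `span` and the name attributes_to_events are assumptions for illustration. The event names 'gen_ai.content.prompt' and 'gen_ai.content.completion' are my reading of the pending semantics and should be checked against the released spec.

```python
# Attribute key -> event name (event names are an assumption, see above).
CONTENT_EVENTS = {
    "gen_ai.prompt": "gen_ai.content.prompt",
    "gen_ai.completion": "gen_ai.content.completion",
}


def attributes_to_events(span):
    """Return a copy of `span` with content attributes moved onto events.

    `span` is a plain dict with an "attributes" dict and an optional
    "events" list; the input is not mutated.
    """
    attributes = dict(span.get("attributes", {}))
    events = list(span.get("events", []))
    for key, event_name in CONTENT_EVENTS.items():
        if key in attributes:
            # The content keeps its attribute key, but now lives on an event.
            events.append({"name": event_name, "attributes": {key: attributes.pop(key)}})
    return {**span, "attributes": attributes, "events": events}
```

The same idea could be applied in a custom exporter or in the collector; this version only shows the data movement, knowing the layout of the attributes.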

vuongngo · 2024-08-05T12:45:15Z

Yes, I've now updated the PR to support span events. I checked our backend code, and it should be simple to support span events there too.

I will check out the GitHub issue on INTERNAL vs CLIENT span kind soon, thanks for sharing that.

vuongngo · 2024-08-06T11:13:24Z

@codefromthecrypt , thanks again for your help! agiflow-sdk v0.0.24 is released with gen_ai semconv fixes.

vuongngo mentioned this issue Aug 2, 2024

Fix/otel genai #27

Merged

vuongngo closed this as completed in #27 Aug 6, 2024
