How to get the "raw" prompt being sent to OpenAI #1239

roalexan · 2023-05-26T14:59:01Z

roalexan
May 26, 2023

Hi! For debugging purposes, I want to be able to get the "raw" prompt that is being sent to OpenAI. I use a prompt template where I set context variables on it. That, in addition to the SK function, is what is used to assemble the actual raw string that gets sent to OpenAI, e.g.:

var response = await this.kernel.RunAsync(contextVariables, sKFuncttion);

What I'm asking is: is there a way to get that fully assembled string that SK is going to send? For now, I'm loading the prompt template and substituting in the context variables myself and assembling the string in the same way that I think OpenAI is under-the-hood, but I'd much prefer to grab it from SK itself.

Answered by dmytrostruk

May 16, 2024

It already is. If there are things that don't work with it and it's important that they do, we need to fix them to work with it.

As far as I can see, the main reason why developers use IChatCompletionService directly is because they want access to ChatHistory object directly, and Kernel doesn't work with this object yet. I believe if we add support for it, then there would be no reason to do kernel.GetRequiredService<IChatCompletionService>() for simple cases. It will be possible to do kernel.InvokeAsync(chatHistory), and then all features, including extended telemetry, will work as expected, without need to duplicate prompt logging in all connectors. And we still can have logging for O…

View full answer

fikriauliya · 2023-06-22T07:04:14Z

fikriauliya
Jun 22, 2023

I'm curious about this as well. Did you manage to find the answer? @roalexan

0 replies

benjacernuda · 2023-07-30T08:12:24Z

benjacernuda
Jul 30, 2023

@roalexan @fikriauliya Do you know if there is any progress with this?
For debugging purposes and understand better what is going on behind the scenes I'm trying see all the raw calls that SK is doing. not only the OpenAI, but the vector search, etc...

I have tried to use Consolelogger and I can see some explanation of the logic that SK is applying in the process but I would be great if we can see the raw calls to the different services

Thanks!

1 reply

roalexan Jul 30, 2023
Author

@fikriauliya @benjacernuda I have nothing new to add at the moment - perhaps someone from the Semantic Kernel Team could weigh in.

craigomatic · 2023-07-31T21:10:55Z

craigomatic
Jul 31, 2023

@markwallace-microsoft may have some thoughts

0 replies

benjacernuda · 2023-08-01T07:51:36Z

benjacernuda
Aug 1, 2023

@roalexan @fikriauliya @craigomatic Researching about this I have found this telemetry module and this video explaning the way to send the traces to App Insights.

It seems that is the way to get more detailed logs, but I haven't had time to try it in my application

Hope it helps

0 replies

lemillermicrosoft · 2023-08-05T01:47:06Z

lemillermicrosoft
Aug 5, 2023
Collaborator

In a personal project, I implemented ITextCompletion that echos the input (the prompt) as the result.

var PRSkill = this.kernel.ImportSemanticSkillFromDirectory(folder, SEMANTIC_FUNCTION_PATH);

this._kernel = Kernel.Builder
                .WithAIService<ITextCompletion>(null, new RedirectTextCompletion(), true)
                .Build();
                
// The kernels should match basically
this._kernel.ImportSemanticSkillFromDirectory(folder, SEMANTIC_FUNCTION_PATH);

So you could then do something like this:

var redirectSkFunction = this._redirectKernel.Skills.GetFunction(skFunction.SkillName, skFunction.Name);
var prompt = (await redirectSkFunction.InvokeAsync(contextVariables)).Result;

var response = await this.kernel.RunAsync(contextVariables, sKFunction);

More details on RedirectTextCompletion:
https://github.com/lemillermicrosoft/skonsole/blob/b69aed74e095a0433a21325747875e770090330a/skills/PRSkill/PullRequestSkill.cs#L183

0 replies

douglasware · 2023-08-28T16:15:56Z

douglasware
Aug 28, 2023

SK is disappointing in this regard at the moment.

1 reply

amadorcarlosa Sep 10, 2023

I dont think there is a native way to do this but I public MyPrompt(MyPlugins myPlugins, MyFunction myFunction)
{
var path = $"wwwroot/{myPlugins}/{myFunction}/";
var json= File.ReadAllText($"{path}config.json");
PromptConfig= PromptTemplateConfig.FromJson(json);
SkPrompt = File.ReadAllText($"{path}skPrompt.txt");
ContextInputs = new Dictionary<string, string>();
var lKeys=GetContextInputs();
foreach(var key in lKeys)
{
ContextInputs.Add(key, "");
}
Console.WriteLine("MyPrompt Created");
Console.WriteLine(ContextInputs.Count);

} eventually I use this to public static string MergeMessage(ChatInput chatInput) {
var stringBuilder = new StringBuilder();

     var prompt = chatInput.Prompt;
     var contextInputs = chatInput.ContextInputs;
     var response = MergePromptWithInputs(prompt, contextInputs);
     stringBuilder.Append(response);
     return stringBuilder.ToString();

}

evchaki · 2024-01-03T23:01:21Z

evchaki
Jan 3, 2024
Collaborator

You can use our telemetry feature to see the prompts - https://devblogs.microsoft.com/semantic-kernel/track-your-token-usage-and-costs-with-semantic-kernel/

0 replies

HuskyDanny · 2024-01-15T05:29:34Z

HuskyDanny
Jan 15, 2024

Bring this back, do we have a way to see the raw prompt? In langchain, you could set set_debug(True) and at least you could see all the intermediate steps, prompts, responses on local.

If not, how do we debug the response accuracy issue?

1 reply

dmytrostruk Apr 25, 2024
Maintainer

@HuskyDanny Please see my comment below:
#1239 (reply in thread)

ameier38 · 2024-01-22T16:31:36Z

ameier38
Jan 22, 2024

If anybody is still having trouble I was able to see the raw requests by hooking up the HttpClientInstrumentation and then enriching with the request and response bodies. By default the request and response bodies are not in the span attributes. You probably shouldn't do this in production but it is nice for working locally and understand what is going on under the hood. Below is my F# script.

#r "nuget: OpenTelemetry"
#r "nuget: OpenTelemetry.Instrumentation.Http"
#r "nuget: OpenTelemetry.Exporter.Console"
#r "nuget: Microsoft.Extensions.Logging"
#r "nuget: Microsoft.Extensions.Logging.Console"
#r "nuget: Microsoft.SemanticKernel"

open Microsoft.Extensions.Logging
open Microsoft.Extensions.DependencyInjection
open Microsoft.SemanticKernel
open Microsoft.SemanticKernel.ChatCompletion
open Microsoft.SemanticKernel.Connectors.OpenAI
open OpenTelemetry
open OpenTelemetry.Resources
open OpenTelemetry.Logs
open OpenTelemetry.Trace
open System.ComponentModel
open System
open System.Diagnostics
open System.IO
open System.Net.Http

module Env =
    let variable key =
        match Environment.GetEnvironmentVariable(key) with
        | s when String.IsNullOrEmpty(s) -> failwith $"Environment variable {key} is not set"
        | value -> value

type MathPlugin(loggerFactory:ILoggerFactory) =
    let logger = loggerFactory.CreateLogger<MathPlugin>()

    [<KernelFunction>]
    [<Description("Add two numbers")>]
    member _.Add
        ([<Description("The first number")>] first:float,
         [<Description("The second number")>] second:float)
         : [<Description("The first number plus the second number")>] float =
         logger.LogDebug("Adding {first} and {second}", first, second)
         first + second

let resourceBuilder = ResourceBuilder.CreateDefault().AddService("Scratch")

let enrichHttpRequest (activity:Activity) (req:HttpRequestMessage) =
    req.Content.LoadIntoBufferAsync().Wait()
    let ms = new MemoryStream()
    req.Content.CopyToAsync(ms).Wait()
    ms.Seek(0L, SeekOrigin.Begin) |> ignore
    use reader = new StreamReader(ms)
    let content = reader.ReadToEnd()
    activity.SetTag("requestBody", content) |> ignore

let enrichHttpResponse (activity:Activity) (res:HttpResponseMessage) =
    res.Content.LoadIntoBufferAsync().Wait()
    let ms = new MemoryStream()
    res.Content.CopyToAsync(ms).Wait()
    ms.Seek(0L, SeekOrigin.Begin) |> ignore
    use reader = new StreamReader(ms)
    let content = reader.ReadToEnd()
    activity.SetTag("responseBody", content) |> ignore

let tracerProvider =
    Sdk.CreateTracerProviderBuilder()
        .SetResourceBuilder(resourceBuilder)
        .AddSource("Scratch") // subscribe to Scratch resource (created above) traces
        .AddSource("Microsoft.SemanticKernel*") // subscribe to semantic kernel traces
        .AddHttpClientInstrumentation(fun opts ->
            opts.EnrichWithHttpRequestMessage <- Action<Activity,HttpRequestMessage>(enrichHttpRequest)
            opts.EnrichWithHttpResponseMessage <- Action<Activity,HttpResponseMessage>(enrichHttpResponse))
        .AddConsoleExporter()
        .Build()
let loggerFactory =
    LoggerFactory.Create(fun builder ->
        builder
            .SetMinimumLevel(LogLevel.Trace)
            .AddOpenTelemetry(fun opts ->
                opts.SetResourceBuilder(resourceBuilder) |> ignore
                opts.AddConsoleExporter() |> ignore // export logs to console
                opts.IncludeFormattedMessage <- true)
            |> ignore)
let apiKey = Env.variable "OPENAI_API_KEY"
let builder = Kernel.CreateBuilder()
builder.Services.AddSingleton(loggerFactory)
builder.Services.AddOpenAIChatCompletion("gpt-3.5-turbo-1106", apiKey)
builder.Plugins.AddFromType<MathPlugin>()
let kernel = builder.Build()

let tracer = tracerProvider.GetTracer("Scratch")
let settings = OpenAIPromptExecutionSettings(ToolCallBehavior=ToolCallBehavior.AutoInvokeKernelFunctions)
let chatService = kernel.Services.GetRequiredService<IChatCompletionService>()
let history = ChatHistory("You are a friendly analyst. You can add numbers.")
history.AddUserMessage("What is 14.4 + 24.6?")

let span = tracer.StartActiveSpan("Solving math problem")
let res = chatService.GetChatMessageContentAsync(history, settings, kernel).Result
span.Dispose()
printfn $"Response: {res}"

0 replies

TaoChenOSU · 2024-04-17T18:05:50Z

TaoChenOSU
Apr 17, 2024
Maintainer

If you'd like to view the raw prompt that's sent to Azure OpenAI, you will need to enable trace level logging as prompts and completions may contain PII. For more information, please see: https://github.com/microsoft/semantic-kernel/blob/main/dotnet/docs/TELEMETRY.md#logging

18 replies

GKrivosheev-rms May 16, 2024

Thanks, Dmitro,
I agree with Stephen. I thought the point of SK was to abstract OpenAI messages to the level of tools/tasks/chats etc. If I need to operate on raw prompts and send them to SK, then why do I need the kernel at all? I might just drop to the OpenAI SDK level and manage my own messages by hand.

dmytrostruk May 16, 2024
Maintainer

The argument you're making is basically don't use any service (eg IChatCompletionService) because the implementations aren't currently logging enough.

I think it's not only about telemetry, using Kernel allows you to use prompt template engine, filters, perform AI service selection in runtime, failover to other LLM in case of failure and so on. And extended telemetry is another aspect of it. If we add support for chat history directly in Kernel, it will work as main entry point to LLM. And if I want to use already existing connector or create my own one, it would be nice to have all of this for my custom connector by default.

The fix is to log enough, not avoid using a lot of what SK provides.

I agree, but it depends on what level. We can duplicate all logging in all connectors, or we can put it in common place, and it will work for all connectors at the same time, including any custom connector that implements IChatCompletionService. In similar way we are working on abstracting function calling feature at the moment.

stephentoub May 16, 2024
Collaborator

If we add support for chat history directly in Kernel, it will work as main entry point to LLM.

It already is. If there are things that don't work with it and it's important that they do, we need to fix them to work with it.

dmytrostruk May 16, 2024
Maintainer

It already is. If there are things that don't work with it and it's important that they do, we need to fix them to work with it.

As far as I can see, the main reason why developers use IChatCompletionService directly is because they want access to ChatHistory object directly, and Kernel doesn't work with this object yet. I believe if we add support for it, then there would be no reason to do kernel.GetRequiredService<IChatCompletionService>() for simple cases. It will be possible to do kernel.InvokeAsync(chatHistory), and then all features, including extended telemetry, will work as expected, without need to duplicate prompt logging in all connectors. And we still can have logging for OpenAI/Google/HuggingFace-specific information in connectors.

Answer selected by sophialagerkranspandey

stephentoub May 16, 2024
Collaborator

for simple cases

Developers with non-simple cases still need all the same support, even more so.

It will be possible to do kernel.InvokeAsync(chatHistory), and then all features, including extended telemetry, will work as expected

What about everything the connector adds to the ChatHistory as part of the IChatCompletionService operation, like all the intermediate messages representing the function calling responses and requests? How is the simple shared InvokeAsync going to communicate that back? Today it's thrown away.

What about when someone wants to do text-to-image... what is InvokeAsync going to do? And image to text? And text to audio?

Etc.

The answer can't be "use kernel.InvokeAsync if you care about logging."

dmytrostruk May 16, 2024
Maintainer

What about everything the connector adds to the ChatHistory as part of the IChatCompletionService operation, like all the intermediate messages representing the function calling responses and requests? How is the simple shared InvokeAsync going to communicate that back?

We are modifying caller's chat history today in-place. It's working when calling IChatCompletionService directly, so I would expect similar behavior for InvokeAsync:

semantic-kernel/dotnet/src/Connectors/Connectors.OpenAI/AzureSdk/ClientCore.cs

Lines 453 to 458 in 9b0dde5

    
           // Add the original assistant message to the chatOptions; this is required for the service 
        
           // to understand the tool call responses. Also add the result message to the caller's chat 
        
           // history: if they don't want it, they can remove it, but this makes the data available, 
        
           // including metadata like usage. 
        
           chatOptions.Messages.Add(GetRequestMessage(resultChoice.Message)); 
        
           chat.Add(result);

We can also use FunctionResult.GetValue<T> as alternative option.

What about when someone wants to do text-to-image... what is InvokeAsync going to do? And image to text? And text to audio?

I think the long-term goal would be to support these as well. Kernel will allow to use one API between different AI providers with minimal/none code changes required to switch between them.

The answer can't be "use kernel.InvokeAsync if you care about logging."

100% agree :) I'm just trying to understand what we can do to make kernel.InvokeAsync just work without additional interaction with services inside. Of course it's not going to be possible to cover all cases, but since the question was about prompt logging, I think that can be achieved with kernel for all connectors.

dmytrostruk May 16, 2024
Maintainer

I'm just trying to understand what we can do to make kernel.InvokeAsync just work without additional interaction with services inside.

Maybe the answer would be to use abstract class ChatCompletionService instead of IChatCompletionService interface, and base behavior will always do the logging for all connectors. Then, it will work both when using service directly or kernel. Would be interesting to explore this idea further.

xckai · 2024-04-25T09:26:32Z

xckai
Apr 25, 2024

this QA or issue is not solved,why closed ???

7 replies

dmytrostruk May 8, 2024
Maintainer

I tried 3rd option but it only print the initial prompt, even if it's using 2 funcion calling during it's execution (I can see the function that are invoked and their result but not the inner used prompt)

@MithrilMan It depends on which type of function is executed during function calling. If it's native/method function (e.g. marked with KernelFunction attribute or created with KernelFunctionFactory.CreateFromMethod), then there will be no rendered prompt available, because this type of function doesn't have any text prompt. Rendered prompt will be available only for semantic/prompt functions (e.g. KernelFunctionFactory.CreateFromPrompt or skprompt.txt/config.json).

For native functions, you can observe function arguments and result. With IAutoFunctionInvocationFilter you can also observe ChatHistory which may contain more information about your function calling process.

MithrilMan May 8, 2024

@dmytrostruk it's a KernelFunction that I add to the kernel with AddFromType passing the type containing it
I would expect to easily see how my function return get passed back to OpenAI, I already tried the approach of IAutoFunctionInvocationFilter but it's output is not easily available to log because it's a complex object that doesn't have a proper json converter.

Maybe at the end the best approach is to inject a custom httpclient and log the request/response data

dmytrostruk May 8, 2024
Maintainer

it's a KernelFunction that I add to the kernel with AddFromType passing the type containing it

@MithrilMan In this case, instead of prompt, you are operating with function arguments and function result.

I would expect to easily see how my function return get passed back to OpenAI, I already tried the approach of IAutoFunctionInvocationFilter but it's output is not easily available to log because it's a complex object that doesn't have a proper json converter.

If your function result is complex object, in all cases it's serialized automatically in order to pass it back to OpenAI. If your return type doesn't have JSON converter, then default serialization logic will be applied.

Maybe at the end the best approach is to inject a custom httpclient and log the request/response data

It depends on what exactly you want to log. IAutoFunctionInvocationFilter allows you to log specific properties of your return type or entire object (e.g. you can use JsonSerializer.Serialize, even if you don't have JSON converter).

Custom HttpClient will allow you to log entire request/response objects, including HTTP-related information like endpoint, headers etc. If it's your goal, then yes, IAutoFunctionInvocationFilter doesn't have this information available. If your goal is to log just function result, which will be passed to OpenAI, then IAutoFunctionInvocationFilter or IFunctionInvocationFilter should help you to achieve this scenario.

It would be really helpful if you can share the example of result object you want to log, and what do you expect to see in your logs. Based on that, I can try to prepare some examples. Thanks!

MithrilMan May 8, 2024

@dmytrostruk the goal is to have a proper way to inspect the real traffic going to/from openai, we use serilog+seq so I tried the structural log approach but serializing with {@ChatHistory} failed (now I can't run the code, maybe i'll test it again tomorrow)

I've pasted some code in this discussion I created yesterday
#6146

dmytrostruk May 8, 2024
Maintainer

the goal is to have a proper way to inspect the real traffic going to/from openai

@MithrilMan If by traffic you mean HTTP-traffic, then the recommendation would be to use custom HttpClient with your logging logic.

we use serilog+seq so I tried the structural log approach but serializing with {@ChatHistory} failed

ChatHistory object is serializable. I tried to log ChatHistory using your example and it works as expected on my side. Maybe you could share error details for better understanding.

GKrivosheev-rms · 2024-07-08T23:33:51Z

GKrivosheev-rms
Jul 8, 2024

Looks like the issue is fixed in latest version (validated in v 1.14.1). Thanks so much: @dmytrostruk & @stephentoub!

0 replies

acmoles · 2024-09-17T10:57:09Z

acmoles
Sep 17, 2024

Running through the examples, I also wanted to inspect the hydrated prompt but found the API lacking.

Here's my use case:

Given this code from the memory example,

const string skPrompt = @"
ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know' if it does not have an answer.

Information about me, from previous conversations:
- {{$fact1}} {{recall $fact1}}
- {{$fact2}} {{recall $fact2}}
- {{$fact3}} {{recall $fact3}}
- {{$fact4}} {{recall $fact4}}
- {{$fact5}} {{recall $fact5}}

Chat:
{{$history}}
User: {{$userInput}}
ChatBot: ";

var chatFunction = kernel.CreateFunctionFromPrompt(skPrompt, new OpenAIPromptExecutionSettings { MaxTokens = 200, Temperature = 0.8 });

var arguments = new KernelArguments();

arguments["fact1"] = "what is my name?";
arguments["fact2"] = "where do I live?";
arguments["fact3"] = "where is my family from?";
arguments["fact4"] = "where have I travelled?";
arguments["fact5"] = "what do I do for work?";

arguments[TextMemoryPlugin.CollectionParam] = MemoryCollectionName;
arguments[TextMemoryPlugin.LimitParam] = "2";
arguments[TextMemoryPlugin.RelevanceParam] = "0.8";

var history = "";
arguments["history"] = history;

I want to do something like

Console.WriteLine(chatFunction.GetPrompt(kernel, arguments));

to inspect how the prompt renders while I iterate.

For example, I might be unhappy with how history is formatted, or the memories returned from the vector store.

I do this kind of inspection routinely when working with my own strings and it seems like an oversight that this should be any harder than above with SK, even more so if I need to make an LLM request to view it from telemetry.

Working with notebooks, C# and SK was a magical experience until this moment.

1 reply

TaoChenOSU Sep 17, 2024
Maintainer

Hi @acmoles,

I see two scenarios your query may imply:

You want to see the prompts after you run your program.
You want to see the prompts before it gets sent to the model. It only continues the execution after your approval.

For the second scenario, you may want to use a filter. @dmytrostruk may have more information on how to achieve that.

For the first scenario, you need to rely on telemetry data. Please refer to our learn site for this option: https://learn.microsoft.com/en-us/semantic-kernel/enterprise-readiness/observability/?pivots=programming-language-csharp

How to get the "raw" prompt being sent to OpenAI #1239

Replies: 13 comments · 29 replies

roalexan Jul 30, 2023 Author

lemillermicrosoft Aug 5, 2023 Collaborator

evchaki Jan 3, 2024 Collaborator

dmytrostruk Apr 25, 2024 Maintainer

TaoChenOSU Apr 17, 2024 Maintainer

dmytrostruk May 16, 2024 Maintainer

stephentoub May 16, 2024 Collaborator

dmytrostruk May 16, 2024 Maintainer

stephentoub May 16, 2024 Collaborator

dmytrostruk May 16, 2024 Maintainer

dmytrostruk May 16, 2024 Maintainer

dmytrostruk May 8, 2024 Maintainer

dmytrostruk May 8, 2024 Maintainer

dmytrostruk May 8, 2024 Maintainer

TaoChenOSU Sep 17, 2024 Maintainer

Replies: 13 comments 29 replies

roalexan Jul 30, 2023
Author

lemillermicrosoft
Aug 5, 2023
Collaborator

evchaki
Jan 3, 2024
Collaborator

dmytrostruk Apr 25, 2024
Maintainer

TaoChenOSU
Apr 17, 2024
Maintainer

dmytrostruk May 16, 2024
Maintainer

stephentoub May 16, 2024
Collaborator

dmytrostruk May 16, 2024
Maintainer

stephentoub May 16, 2024
Collaborator

dmytrostruk May 16, 2024
Maintainer

dmytrostruk May 16, 2024
Maintainer

dmytrostruk May 8, 2024
Maintainer

dmytrostruk May 8, 2024
Maintainer

dmytrostruk May 8, 2024
Maintainer

TaoChenOSU Sep 17, 2024
Maintainer