How to get the "raw" prompt being sent to OpenAI #1239
-
Hi! For debugging purposes, I want to be able to get the "raw" prompt that is being sent to OpenAI. I use a prompt template on which I set context variables; that, together with the SK function, is what assembles the actual raw string that gets sent to OpenAI, e.g.:
What I'm asking is: is there a way to get that fully assembled string that SK is going to send? For now, I'm loading the prompt template, substituting in the context variables myself, and assembling the string the same way I think SK does under the hood, but I'd much prefer to grab it from SK itself.
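For illustration, here is a minimal sketch of the kind of thing I'm after, written against the newer 1.x .NET API (which is newer than what this question was originally asked against; template, kernel, and arguments below are placeholders for your own objects): render the template through SK's own engine instead of re-implementing the substitution by hand.
using Microsoft.SemanticKernel;

// Render the prompt template without invoking the model.
var promptTemplate = new KernelPromptTemplateFactory()
    .Create(new PromptTemplateConfig(template));

// RenderAsync substitutes the arguments (and resolves any nested template functions)
// the same way SK does right before it calls OpenAI.
string renderedPrompt = await promptTemplate.RenderAsync(kernel, arguments);
Console.WriteLine(renderedPrompt);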
-
I'm curious about this as well. Did you manage to find the answer? @roalexan
-
@roalexan @fikriauliya Do you know if there has been any progress on this? I have tried to use ConsoleLogger and I can see some explanation of the logic that SK is applying in the process, but it would be great if we could see the raw calls to the different services. Thanks!
-
@markwallace-microsoft may have some thoughts
-
@roalexan @fikriauliya @craigomatic Researching this, I found this telemetry module and this video explaining how to send the traces to App Insights. It seems that is the way to get more detailed logs, but I haven't had time to try it in my application. Hope it helps!
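For anyone who wants to try it, here is a minimal sketch of exporting the traces to App Insights. It assumes the Azure.Monitor.OpenTelemetry.Exporter NuGet package and an Application Insights connection string; the "Microsoft.SemanticKernel*" source name matches the one used in the F# script further down this thread.
using Azure.Monitor.OpenTelemetry.Exporter;
using OpenTelemetry;
using OpenTelemetry.Trace;

// Export Semantic Kernel traces to Application Insights.
var connectionString = Environment.GetEnvironmentVariable("APPLICATIONINSIGHTS_CONNECTION_STRING");

using var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .AddSource("Microsoft.SemanticKernel*") // subscribe to Semantic Kernel activity sources
    .AddAzureMonitorTraceExporter(options => options.ConnectionString = connectionString)
    .Build();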
-
In a personal project, I implemented RedirectTextCompletion.
So you could then do something like this:
More details on RedirectTextCompletion:
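A rough sketch of the same idea against the current .NET API: a decorator around IChatCompletionService that captures what is about to be sent before forwarding the call to the real connector. The class and member names here are illustrative, not the actual RedirectTextCompletion implementation (it relies on .NET implicit usings for the System.* namespaces).
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

public sealed class PromptCapturingChatService : IChatCompletionService
{
    private readonly IChatCompletionService _inner;

    public PromptCapturingChatService(IChatCompletionService inner) => _inner = inner;

    // The last chat history that was sent, flattened to a readable string.
    public string? LastPrompt { get; private set; }

    public IReadOnlyDictionary<string, object?> Attributes => _inner.Attributes;

    public async Task<IReadOnlyList<ChatMessageContent>> GetChatMessageContentsAsync(
        ChatHistory chatHistory, PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null, CancellationToken cancellationToken = default)
    {
        // Capture what is about to go over the wire, then delegate to the real connector.
        LastPrompt = string.Join(Environment.NewLine, chatHistory.Select(m => $"{m.Role}: {m.Content}"));
        return await _inner.GetChatMessageContentsAsync(chatHistory, executionSettings, kernel, cancellationToken);
    }

    public IAsyncEnumerable<StreamingChatMessageContent> GetStreamingChatMessageContentsAsync(
        ChatHistory chatHistory, PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null, CancellationToken cancellationToken = default)
        => _inner.GetStreamingChatMessageContentsAsync(chatHistory, executionSettings, kernel, cancellationToken);
}

// Usage: var capturing = new PromptCapturingChatService(kernel.GetRequiredService<IChatCompletionService>());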
-
SK is disappointing in this regard at the moment.
-
You can use our telemetry feature to see the prompts - https://devblogs.microsoft.com/semantic-kernel/track-your-token-usage-and-costs-with-semantic-kernel/
-
Bringing this back up: do we have a way to see the raw prompt? In LangChain you could set an option for this. If not, how do we debug response accuracy issues?
-
If anybody is still having trouble, I was able to see the raw requests by hooking up the HttpClientInstrumentation and then enriching the spans with the request and response bodies. By default the request and response bodies are not included in the span attributes. You probably shouldn't do this in production, but it is nice for working locally and understanding what is going on under the hood. Below is my F# script.
#r "nuget: OpenTelemetry"
#r "nuget: OpenTelemetry.Instrumentation.Http"
#r "nuget: OpenTelemetry.Exporter.Console"
#r "nuget: Microsoft.Extensions.Logging"
#r "nuget: Microsoft.Extensions.Logging.Console"
#r "nuget: Microsoft.SemanticKernel"
open Microsoft.Extensions.Logging
open Microsoft.Extensions.DependencyInjection
open Microsoft.SemanticKernel
open Microsoft.SemanticKernel.ChatCompletion
open Microsoft.SemanticKernel.Connectors.OpenAI
open OpenTelemetry
open OpenTelemetry.Resources
open OpenTelemetry.Logs
open OpenTelemetry.Trace
open System.ComponentModel
open System
open System.Diagnostics
open System.IO
open System.Net.Http
module Env =
let variable key =
match Environment.GetEnvironmentVariable(key) with
| s when String.IsNullOrEmpty(s) -> failwith $"Environment variable {key} is not set"
| value -> value
type MathPlugin(loggerFactory:ILoggerFactory) =
let logger = loggerFactory.CreateLogger<MathPlugin>()
[<KernelFunction>]
[<Description("Add two numbers")>]
member _.Add
([<Description("The first number")>] first:float,
[<Description("The second number")>] second:float)
: [<Description("The first number plus the second number")>] float =
logger.LogDebug("Adding {first} and {second}", first, second)
first + second
let resourceBuilder = ResourceBuilder.CreateDefault().AddService("Scratch")
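// The two enrichers below copy the HTTP request/response bodies onto the span tags,
// which is what makes the raw prompt and completion visible in the exported trace.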
let enrichHttpRequest (activity:Activity) (req:HttpRequestMessage) =
req.Content.LoadIntoBufferAsync().Wait()
let ms = new MemoryStream()
req.Content.CopyToAsync(ms).Wait()
ms.Seek(0L, SeekOrigin.Begin) |> ignore
use reader = new StreamReader(ms)
let content = reader.ReadToEnd()
activity.SetTag("requestBody", content) |> ignore
let enrichHttpResponse (activity:Activity) (res:HttpResponseMessage) =
res.Content.LoadIntoBufferAsync().Wait()
let ms = new MemoryStream()
res.Content.CopyToAsync(ms).Wait()
ms.Seek(0L, SeekOrigin.Begin) |> ignore
use reader = new StreamReader(ms)
let content = reader.ReadToEnd()
activity.SetTag("responseBody", content) |> ignore
let tracerProvider =
Sdk.CreateTracerProviderBuilder()
.SetResourceBuilder(resourceBuilder)
.AddSource("Scratch") // subscribe to Scratch resource (created above) traces
.AddSource("Microsoft.SemanticKernel*") // subscribe to semantic kernel traces
.AddHttpClientInstrumentation(fun opts ->
opts.EnrichWithHttpRequestMessage <- Action<Activity,HttpRequestMessage>(enrichHttpRequest)
opts.EnrichWithHttpResponseMessage <- Action<Activity,HttpResponseMessage>(enrichHttpResponse))
.AddConsoleExporter()
.Build()
let loggerFactory =
LoggerFactory.Create(fun builder ->
builder
.SetMinimumLevel(LogLevel.Trace)
.AddOpenTelemetry(fun opts ->
opts.SetResourceBuilder(resourceBuilder) |> ignore
opts.AddConsoleExporter() |> ignore // export logs to console
opts.IncludeFormattedMessage <- true)
|> ignore)
let apiKey = Env.variable "OPENAI_API_KEY"
let builder = Kernel.CreateBuilder()
builder.Services.AddSingleton(loggerFactory) |> ignore
builder.Services.AddOpenAIChatCompletion("gpt-3.5-turbo-1106", apiKey) |> ignore
builder.Plugins.AddFromType<MathPlugin>() |> ignore
let kernel = builder.Build()
let tracer = tracerProvider.GetTracer("Scratch")
let settings = OpenAIPromptExecutionSettings(ToolCallBehavior=ToolCallBehavior.AutoInvokeKernelFunctions)
let chatService = kernel.Services.GetRequiredService<IChatCompletionService>()
let history = ChatHistory("You are a friendly analyst. You can add numbers.")
history.AddUserMessage("What is 14.4 + 24.6?")
let span = tracer.StartActiveSpan("Solving math problem")
let res = chatService.GetChatMessageContentAsync(history, settings, kernel).Result
span.Dispose()
printfn $"Response: {res}"
-
If you'd like to view the raw prompt that's sent to Azure OpenAI, you will need to enable trace-level logging, as prompts and completions may contain PII. For more information, please see: https://github.com/microsoft/semantic-kernel/blob/main/dotnet/docs/TELEMETRY.md#logging
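For reference, a minimal sketch of what that looks like with the .NET kernel builder (assuming the Microsoft.Extensions.Logging.Console package; the deployment values are placeholders). With the minimum level set to Trace, the rendered prompts appear in the log output:
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel;

// Trace-level logging is opt-in because prompts and completions may contain PII.
using var loggerFactory = LoggerFactory.Create(builder =>
    builder.SetMinimumLevel(LogLevel.Trace).AddConsole());

var kernelBuilder = Kernel.CreateBuilder();
kernelBuilder.Services.AddSingleton(loggerFactory); // picked up by the kernel and connectors
kernelBuilder.AddAzureOpenAIChatCompletion(deploymentName: "...", endpoint: "https://...", apiKey: "...");
var kernel = kernelBuilder.Build();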
-
This Q&A/issue is not solved, so why was it closed?
-
Looks like the issue is fixed in the latest version (validated in v1.14.1). Thanks so much, @dmytrostruk & @stephentoub!
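For anyone finding this later, recent 1.x releases also expose the rendered prompt through prompt render filters. A minimal sketch (the filter name is illustrative):
using Microsoft.SemanticKernel;

// Logs the fully rendered prompt for every prompt-based function invocation.
public sealed class PromptLoggingFilter : IPromptRenderFilter
{
    public async Task OnPromptRenderAsync(PromptRenderContext context, Func<PromptRenderContext, Task> next)
    {
        await next(context);                       // let SK render the template first
        Console.WriteLine(context.RenderedPrompt); // the exact prompt about to be sent
    }
}

// Registration: kernel.PromptRenderFilters.Add(new PromptLoggingFilter());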
-
Running through the examples, I also wanted to inspect the hydrated prompt but found the API lacking. Here's my use case: given this code from the memory example,
const string skPrompt = @"
ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know' if it does not have an answer.
Information about me, from previous conversations:
- {{$fact1}} {{recall $fact1}}
- {{$fact2}} {{recall $fact2}}
- {{$fact3}} {{recall $fact3}}
- {{$fact4}} {{recall $fact4}}
- {{$fact5}} {{recall $fact5}}
Chat:
{{$history}}
User: {{$userInput}}
ChatBot: ";
var chatFunction = kernel.CreateFunctionFromPrompt(skPrompt, new OpenAIPromptExecutionSettings { MaxTokens = 200, Temperature = 0.8 });
var arguments = new KernelArguments();
arguments["fact1"] = "what is my name?";
arguments["fact2"] = "where do I live?";
arguments["fact3"] = "where is my family from?";
arguments["fact4"] = "where have I travelled?";
arguments["fact5"] = "what do I do for work?";
arguments[TextMemoryPlugin.CollectionParam] = MemoryCollectionName;
arguments[TextMemoryPlugin.LimitParam] = "2";
arguments[TextMemoryPlugin.RelevanceParam] = "0.8";
var history = "";
arguments["history"] = history; I want to do something like Console.WriteLine(chatFunction.GetPrompt(kernel, arguments)); to inspect how the prompt renders while I iterate. For example, I might be unhappy with how history is formatted, or the memories returned from the vector store. I do this kind of inspection routinely when working with my own strings and it seems like an oversight that this should be any harder than above with SK, even more so if I need to make an LLM request to view it from telemetry. Working with notebooks, C# and SK was a magical experience until this moment. |
As far as I can see, the main reason why developers use IChatCompletionService directly is because they want access to the ChatHistory object directly, and Kernel doesn't work with this object yet. I believe if we add support for it, then there would be no reason to do kernel.GetRequiredService<IChatCompletionService>() for simple cases. It will be possible to do kernel.InvokeAsync(chatHistory), and then all features, including extended telemetry, will work as expected, without the need to duplicate prompt logging in all connectors. And we still can have logging for O…