Unreverse prompts and introduce relevance option (not hardcoded) (#28)
Motivation and Context (Why the change? What's the scenario?)

    The two prompt files had their contents swapped during earlier code moves.
    Customers change the relevancy setting as they experiment with the sandbox, so it should not be hardcoded.

High level description (Approach, Design)

    Unreverse prompts
    Make relevancy a configurable option (not hardcoded): MinSchemaRelevance
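
A minimal sketch of how a configurable threshold with a default fallback could behave, using only the base class library. `ReadMinRelevance` is a hypothetical stand-in for `config.GetValue<double>("MinSchemaRelevance", SqlQueryGenerator.DefaultMinRelevance)`, and `SelectSchemas` assumes the threshold acts as a lower bound on a per-schema relevance score; this commit does not show how the score is actually applied.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class RelevanceSketch
{
    // Same value as SqlQueryGenerator.DefaultMinRelevance in this commit.
    public const double DefaultMinRelevance = 0.7D;

    // Hypothetical helper: returns the configured threshold, or the default
    // when the key is missing. Falling back on an unparsable value is a
    // simplification made for this sketch.
    public static double ReadMinRelevance(IReadOnlyDictionary<string, string> settings)
    {
        return settings.TryGetValue("MinSchemaRelevance", out var raw)
            && double.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out var value)
            ? value
            : DefaultMinRelevance;
    }

    // Hypothetical helper: keeps schema candidates whose relevance score is
    // at or above the threshold, most relevant first.
    public static IReadOnlyList<string> SelectSchemas(
        IEnumerable<(string Name, double Relevance)> candidates,
        double minRelevance)
    {
        return candidates
            .Where(c => c.Relevance >= minRelevance)
            .OrderByDescending(c => c.Relevance)
            .Select(c => c.Name)
            .ToList();
    }
}
```

Under these assumptions, lowering `MinSchemaRelevance` in `appsettings.json` admits more schema candidates, and raising it admits fewer, without a rebuild.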
crickman authored Aug 23, 2023
1 parent 4aa6660 commit b995086
Showing 8 changed files with 73 additions and 62 deletions.
2 changes: 2 additions & 0 deletions .github/_typos.toml
@@ -11,6 +11,8 @@ extend-exclude = [
"package-lock.json",
"*.bicep",
"*.sql",
+"vocab.bpe",
+"encoder.json"
]

[default.extend-words]
2 changes: 1 addition & 1 deletion README.md
@@ -261,7 +261,7 @@ running the service locally with OpenAPI enabled.
3. [Using the Semantic Memory web service](examples/002-dotnet-WebClient)
4. [How to upload files from command line with curl](examples/003-curl-calling-webservice)
5. [Processing files with custom steps](examples/004-dotnet-ServerlessCustomPipeline)
-6. [Using a custom pipeline handler with serveless memory class](examples/005-dotnet-InProcessMemoryWithCustomHandler)
+6. [Using a custom pipeline handler with serverless memory class](examples/005-dotnet-InProcessMemoryWithCustomHandler)
6. [Writing a custom async pipeline handler](examples/006-dotnet-CustomHandlerAsAService)
## Tools
2 changes: 1 addition & 1 deletion dotnet/Service/README.md
@@ -72,7 +72,7 @@ The service depends on three main components:
* **Embedding generator**: all the documents uploaded are automatically
-partioned (aka "chunked") and indexed for vector search, generating
+partitioned (aka "chunked") and indexed for vector search, generating
several embedding vectors for each file. We recommend using
[OpenAI ADA v2](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings)
model, though you can easily plug in any embedding generator if needed.
@@ -1,11 +1,59 @@
-If the requested OBJECTIVE can be answered by querying a database with tables described in SCHEMA, ANSWER: YES.
-Otherwise ANSWER: NO.
+Generate a SQL SELECT query that is compatible with {{$data_platform}} and achieves the OBJECTIVE exclusively using only the tables and views described in "SCHEMA:".

-Do not answer with any other word than YES or NO.
+Only generate SQL if the OBJECTIVE can be answered by querying a database with tables described in SCHEMA.

+Do not include any explanations, only provide valid SQL.
+
+[BEGIN EXAMPLE]
+
+SCHEMA:
+description: historical record of concerts, stadiums and singers
+tables:
+- stadium:
+columns:
+*: all columns
+Stadium_ID:
+Location:
+Name:
+Capacity:
+Highest:
+Lowest:
+Average:
+- singer:
+columns:
+*: all columns
+Singer_ID:
+Name:
+Country:
+Song_Name:
+Song_release_year:
+Age:
+Is_male:
+- concert:
+columns:
+*: all columns
+concert_ID:
+concert_Name:
+Theme:
+Stadium_ID:
+Year:
+- singer_in_concert:
+columns:
+*: all columns
+concert_ID:
+Singer_ID:
+references:
+concert.Stadium_ID: stadium.Stadium_ID
+singer_in_concert.concert_ID: concert.concert_ID
+singer_in_concert.Singer_ID: singer.Singer_ID
+
+OBJECTIVE: "How many heads of the departments are older than 56 ?"
+SQL: select count(*) department_head_count from head where age > 56
+
+[END EXAMPLE]
+
SCHEMA:
{{$data_schema}}

OBJECTIVE: {{$data_objective}}

-ANSWER: Let's think step by step.
+SQL: Let's think step by step.
@@ -1,59 +1,11 @@
-Generate a SQL SELECT query that is compatible with {{$data_platform}} and achieves the OBJECTIVE exclusively using only the tables and views described in "SCHEMA:".
+If the requested OBJECTIVE can be answered by querying a database with tables described in SCHEMA, ANSWER: YES.
+Otherwise ANSWER: NO.

-Only generate SQL if the OBJECTIVE can be answered by querying a database with tables described in SCHEMA.
-
-Do not include any explanations, only provide valid SQL.
-
-[BEGIN EXAMPLE]
-
-SCHEMA:
-description: historical record of concerts, stadiums and singers
-tables:
-- stadium:
-columns:
-*: all columns
-Stadium_ID:
-Location:
-Name:
-Capacity:
-Highest:
-Lowest:
-Average:
-- singer:
-columns:
-*: all columns
-Singer_ID:
-Name:
-Country:
-Song_Name:
-Song_release_year:
-Age:
-Is_male:
-- concert:
-columns:
-*: all columns
-concert_ID:
-concert_Name:
-Theme:
-Stadium_ID:
-Year:
-- singer_in_concert:
-columns:
-*: all columns
-concert_ID:
-Singer_ID:
-references:
-concert.Stadium_ID: stadium.Stadium_ID
-singer_in_concert.concert_ID: concert.concert_ID
-singer_in_concert.Singer_ID: singer.Singer_ID
-
-OBJECTIVE: "How many heads of the departments are older than 56 ?"
-SQL: select count(*) department_head_count from head where age > 56
-
-[END EXAMPLE]
+Do not answer with any other word than YES or NO.

SCHEMA:
{{$data_schema}}

OBJECTIVE: {{$data_objective}}
-SQL: Let's think step by step.
+
+ANSWER: Let's think step by step.
6 changes: 5 additions & 1 deletion examples/200-dotnet-nl2sql/nl2sql.console/Nl2SqlConsole.cs
@@ -7,6 +7,7 @@
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
+using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel;
@@ -38,13 +39,16 @@ internal sealed class Nl2SqlConsole : BackgroundService

public Nl2SqlConsole(
IKernel kernel,
+IConfiguration config,
SqlConnectionProvider sqlProvider,
ILogger<Nl2SqlConsole> logger)
{
+var minRelevance = config.GetValue<double>("MinSchemaRelevance", SqlQueryGenerator.DefaultMinRelevance);
+
this._kernel = kernel;
this._sqlProvider = sqlProvider;
this._logger = logger;
-this._queryGenerator = new SqlQueryGenerator(this._kernel, Repo.RootConfigFolder);
+this._queryGenerator = new SqlQueryGenerator(this._kernel, Repo.RootConfigFolder, minRelevance);
}

protected override async Task ExecuteAsync(CancellationToken stoppingToken)
3 changes: 3 additions & 0 deletions examples/200-dotnet-nl2sql/nl2sql.console/appsettings.json
@@ -1,4 +1,7 @@
{
+// Semantic relevancy threshold for selecting schema
+"MinSchemaRelevance": 0.7,
+// Logging options
"Logging": {
"LogLevel": {
"Default": "Trace"
@@ -18,6 +18,8 @@ namespace SemanticKernel.Data.Nl2Sql.Library;
/// </summary>
public sealed class SqlQueryGenerator
{
+public const double DefaultMinRelevance = 0.7D;
+
public const string ContextParamObjective = "data_objective";
public const string ContextParamSchema = "data_schema";
public const string ContextParamSchemaId = "data_schema_id";
@@ -35,7 +37,7 @@ public sealed class SqlQueryGenerator
private readonly ISKFunction _promptGenerator;
private readonly ISemanticTextMemory _memory;

-public SqlQueryGenerator(IKernel kernel, string rootSkillFolder)
+public SqlQueryGenerator(IKernel kernel, string rootSkillFolder, double minRelevanceScore = DefaultMinRelevance)
{
var functions = kernel.ImportSemanticSkillFromDirectory(rootSkillFolder, SkillName);
this._promptEval = functions["isquery"];