Calibrated prompt generated is completely different from initial prompt. #57
Hi,
Lastly, it's important to note that you are using an LLM ranker, so you need to skip the ranker training process. You should change this line (run_generation_pipeline.py, line 53 at commit 7f373f2):

generation_config_params.eval.function_params.instruction = ranker_config_params.annotator.config.instruction
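To make the suggested change concrete, here is a minimal, runnable sketch of what that assignment does. The config objects below are stand-ins built with `SimpleNamespace` (not the actual AutoPrompt config classes), and the instruction string is a made-up placeholder; only the dotted attribute path follows the line quoted above.

```python
# Hedged sketch: mock config objects mimicking the dotted-attribute
# configs used in run_generation_pipeline.py. The instruction text
# is a placeholder, not AutoPrompt's real ranker instruction.
from types import SimpleNamespace

ranker_config_params = SimpleNamespace(
    annotator=SimpleNamespace(
        config=SimpleNamespace(instruction="Rate the SQL query's relevance.")
    )
)
generation_config_params = SimpleNamespace(
    eval=SimpleNamespace(function_params=SimpleNamespace(instruction=None))
)

# With an LLM ranker there is nothing to fit, so instead of running the
# ranker-training phase, reuse the ranker's own instruction as the
# evaluation instruction of the generation pipeline:
generation_config_params.eval.function_params.instruction = (
    ranker_config_params.annotator.config.instruction
)

print(generation_config_params.eval.function_params.instruction)
```

The effect is simply that the generation pipeline's evaluator is driven by the same instruction the LLM ranker already uses, instead of by a newly fitted ranker.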
Thank you for your reply.
You mentioned above that the ranker training process needs to be skipped when an LLM ranker is used.
Could you please elaborate on why ranker training is not needed? I don't get it.
Regards,
Shobhna
On Thu, 4 Apr 2024 at 1:48 AM, Eladlev ***@***.***> wrote:
Hi,
Currently, AutoPrompt doesn't support such complex prompts.
The way to do it is to decompose this complex prompt into smaller prompts with specific sub-tasks and optimize them.
I recommend you look at the dspy framework (https://github.com/stanfordnlp/dspy/tree/main); it also focuses on the few-shot setting (choosing the best few-shot samples).
In the future, we aim to add support for such complex prompts (by breaking them down into sub-tasks and doing global optimization on all the components, as in dspy).
All the best,
Elad.
On Wed, Apr 3, 2024 at 11:02 PM shobhna ***@***.***> wrote:
> Thank you for your reply.
> Please help me understand a few things:
> 1. I want to optimize the whole initial prompt as given in the example. My understanding is that if I move some part to the task description, then only the text given in the --prompt field as input will be optimized. Please correct me if I am wrong.
>
> This is a very simple prompt with which I am trying to understand how AutoPrompt works. In reality we have a very complex prompt in which we are explaining the schema of various tables, the table data, and the relations between the tables.
> We are also providing few-shot examples to make the LLM understand what output is expected and how the SQL query should be formed.
> We aim to use AutoPrompt to optimize the whole prompt, including the schema and few-shot examples.
> How should I proceed with a complex prompt?
>
> Regards,
> Shobhna
>
> On Thu, 4 Apr 2024 at 12:43 AM, Eladlev ***@***.***> wrote:
>
> > Hi,
> > A few remarks:
> >
> > 1. It seems that your ranker is too 'weak': it only requires that the SQL query be relevant, so it is very hard to generate synthetic data on which GPT-3.5/4 will fail (on any relevant prompt).
> > 2. In order to avoid divergence you should move some of the initial prompt to the task description.
> >
> > For example:
> > --task_description:
> > Assistant is a large language model that is tasked to generate a SQL query based on details and examples provided in the prompt.
> > We have 2 tables:
> > Employee: The Employee table has information regarding all the employees in a company.
> > Below are the attributes of the Employee table:
> > empid: the empid column contains the employee id. empid is the primary key of the Employee table.
> > name: the name column contains the name of the employee.
> > salary: the salary column contains the salary of the employee.
> > department_id: the department_id column contains the employee's department id. It is a foreign key from the Department table.
> > Department: The Department table has information regarding all the departments of a company.
> > Below are the attributes of the Department table:
> > department_id: department_id contains the id of the department. department_id is the primary key of the Department table.
> > department_name: department_name contains the name of the department.
> >
> > --prompt
> > Your task is to generate a SQL query from the natural language input provided by the user.
> > Your task is to understand the natural language input and provide a SQL query to fetch the information asked for in the natural language input from the above tables.
> >
> > Lastly, it's important to note that you are using an LLM ranker, so you need to skip the ranker training process.
> > You should change this line:
> > https://github.com/Eladlev/AutoPrompt/blob/7f373f219aa360cd2de38c6aa700c1dff282d7de/run_generation_pipeline.py#L53
> > To:
> > generation_config_params.eval.function_params.instruction = ranker_config_params.annotator.config.instruction
> >
The purpose of the first phase is to fit an LLM ranker to the user intent (by treating ranking as a classification task). This saves a lot of human effort, since the alternative is for the user to rank the whole dataset at each phase. If you already start with an LLM ranker, then the first stage (fitting an LLM ranker) can only produce a sub-optimal approximation of it, so it's better to skip this step and use the LLM ranker directly.
I have a prompt which is used to generate a SQL query from the input text given by a user.
I am trying to optimize the prompt using run_generation_pipeline.py, but I am getting a completely different calibrated prompt.
Below are the inputs provided:
The output given by AutoPrompt:
The output given is not relevant to the task. Am I providing the wrong inputs, or missing some inputs that need to be provided?