How to Create Generative AI Assistant Distribution
Prompt-based control has become increasingly popular in the development of generative AI models, particularly for dialog systems. This approach adds a lightweight control layer on top of the generative model, making it easier to manage and fine-tune the model's responses.
At DeepPavlov, we have developed a new approach to building Generative AI Assistants using the DeepPavlov Dream platform with prompt-based control. In this How-To guide, we will introduce you to our approach and walk you through the steps to develop your own generative AI assistant.
- Prompts are used as a lightweight mechanism for controlling the behavior of the Generative AI Model. A prompt is automatically added at the beginning of the input for a given model.
- Generative AI Model Service is a new service component designed to run a given Generative AI Model locally.
- Generative Skills are a type of skill in the DeepPavlov Dream platform that encapsulate generative AI models, also known as Large Language Models (LLMs). Currently, these skills provide you the means to control the behavior of the generative AI models using prompt engineering.
- Generative AI Assistant Distributions are a type of AI Assistant distribution in the DeepPavlov Dream platform designed to use generative AI models to provide conversational experiences to end-users through the incorporation of custom Generative Skills.
In this release, two methods for controlling Generative AI Models are provided:
- Each Generative Skill is controlled through a prompt unique to it. The prompt is automatically added at each conversation turn, allowing the developer to limit the model's behavior to the provided prompt.
- Multiple Generative Skills can be used in the same Generative AI Assistant. DeepPavlov Dream Platform automatically calls the most relevant Generative Skill based on the similarity of the user's phrase to the given skill's prompt.
- If the user says something irrelevant to all of the known prompts, the Generative AI Assistant will provide a fallback response using the Dummy Skill.
Note: In upcoming releases, we plan to enhance our skills by incorporating structured representations of episodic and working memory based on the dialog state and conversation history stored within DeepPavlov Dream. This will enable us to have tighter control of Large Language Models (LLMs). Additionally, we will be adding support for calling APIs on behalf of the user. We're excited to share these updates with you, so be sure to stay tuned for more information.
A new subfolder, `prompts`, has been added to the `common` folder of the platform. To add new prompts to your Generative AI Assistant, add your prompts to `common/prompts/` as `json` files.

Note: The name of the prompt is used across the entire platform, so it should contain only English alphabet characters (no symbols). If you want to add a space to the prompt name, like SpaceX Prompt, use an underscore instead, like this: `spacex_prompt.json`. Avoid using hyphens (`-`) in the prompt's name.
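The exact schema of a prompt file is not spelled out in this guide; as a minimal sketch (the `prompt` key name is an assumption, so check an existing file such as `common/prompts/dream_persona.json` in the repository for the authoritative format), a prompt file might look like:

```json
{
    "prompt": "Respond to the user as a friendly guide to SpaceX launches and missions."
}
```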
A new component, Generative AI Model Service, has been added to the platform, to abstract the method of calling Generative AI Models from the Generative Skills. In the current release, you can use any of the models available in the Transformers library.
For each Generative AI Model, you should add a separate entry to the `docker-compose.override.yml` and `dev.yml` files.
Note: Multiple Generative Skills can use the same Generative AI Model, so only add new instances of the Generative AI Model Service when your Generative Skills need to use different generative AI models. E.g., if the only model your Generative Skills use is GPT-J, then you should use the same Generative AI Model Service for all Generative Skills. If you use both GPT-J and BLOOM, then you should create two separate instances of the service, one for GPT-J, and one for BLOOM.
To edit the configuration of each Generative AI Model Service, change the environment variables through the `args` section of the given Generative AI Model Service's component in the `docker-compose.override.yml` file:

- `PRETRAINED_MODEL_NAME_OR_PATH`: used to specify the name of the pre-trained Large Language Model;
- `CONFIG_NAME`: used to specify the name of the config file of the selected pre-trained model in the Transformers library. Currently, you can use the pre-defined name `generative_ai_model_params.json`. You do not have to change the name of the file; however, you can change its values like `temperature` and other params if needed.
Here is the list of models you can use:
| Generative AI Model | `PRETRAINED_MODEL_NAME_OR_PATH` |
| --- | --- |
| GPT-J | EleutherAI/gpt-j-6B |
| GPT-2 | gpt2 |
| GPT-Neo 2.7B | EleutherAI/gpt-neo-2.7B |
| OPT 6.7B | facebook/opt-6.7b |
| BLOOMZ 3B | bigscience/bloomz-3b |
| BLOOM 3B | bigscience/bloom-3b |
| OPT 125M | facebook/opt-125m |
| BLOOM 560M | bigscience/bloom-560m |
| BLOOM 7B1 | bigscience/bloom-7b1 |
Example:
PRETRAINED_MODEL_NAME_OR_PATH: EleutherAI/gpt-j-6B
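Putting this together, a Generative AI Model Service entry in `docker-compose.override.yml` might look roughly like the following sketch. The service name `transformers-lm-gptj` and port `8130` are taken from the skill example later in this guide; the remaining keys are assumptions modeled on the skill entries and may differ in your checkout:

```yaml
transformers-lm-gptj:
  env_file: [ .env ]
  build:
    args:
      SERVICE_PORT: 8130
      PRETRAINED_MODEL_NAME_OR_PATH: EleutherAI/gpt-j-6B
      CONFIG_NAME: generative_ai_model_params.json
    context: .
```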
Prompt Selector is designed to support automatic switching between multiple Generative Skills. It is a lightweight container that utilizes the Sentence Ranker component to automatically match the user's phrase with the known prompts.
To edit the configuration of the Prompt Selector, change the environment variables through the `args` section of the `prompt-selector` component in the `docker-compose.override.yml` file:

- `N_SENTENCES_TO_RETURN`: used to specify the number of sentences returned by Prompt Selector. Practically, this number limits the number of sentences returned by the Sentence Ranker, and is later used to automatically match prompts with the corresponding Generative Skills;
- `PROMPTS_TO_CONSIDER`: used to specify the prompts Prompt Selector should match the user phrase with. You should list the names of the prompts comma-delimited, without spaces, like this: `PROMPTS_TO_CONSIDER: prompt_name_1,prompt_name_2`. Each considered prompt should be placed in the `dream/common/prompts/` folder as `<prompt_name>.json`.
Note: Ensure the consistency of your Generative AI Assistant: the value of the `N_SENTENCES_TO_RETURN` parameter should not exceed either the number of available prompts or the number of Generative Skills. Failing to ensure this consistency will lead to unexpected behavior, like Prompt Selector suggesting prompts for Generative Skills not included in your Generative AI Assistant.
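For instance, a `prompt-selector` entry's `args` might look like the following sketch (the surrounding `env_file`/`build` structure is an assumption modeled on the skill entries shown later in this guide):

```yaml
prompt-selector:
  env_file: [ .env ]
  build:
    args:
      N_SENTENCES_TO_RETURN: 1
      PROMPTS_TO_CONSIDER: dream_persona
```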
The updated Skill Selector is designed to support Generative AI Assistants by automatically picking the most relevant Generative Skill through matching the user's phrase with the known prompts.
It uses the aforementioned `common/prompts/` folder to find the list of known prompts, and automatically creates a list of Generative Skills by replacing `prompt_name` with the name of the selected prompt in the skill's name. For example, if there is a SpaceX Generative Skill and its prompt's name is `spacex`, then Skill Selector will automatically replace `prompt_name` with `spacex` as follows: `dff_{prompt_name}_prompted_skill` --> `dff_spacex_prompted_skill`
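The naming convention above can be sketched in Python (illustrative only; the platform derives these names internally):

```python
def skill_name_for_prompt(prompt_name: str) -> str:
    # Component (skill) names use underscores.
    return f"dff_{prompt_name}_prompted_skill"


def container_name_for_prompt(prompt_name: str) -> str:
    # Container names in the Docker Compose files use hyphens.
    return f"dff-{prompt_name.replace('_', '-')}-prompted-skill"


print(skill_name_for_prompt("spacex"))             # dff_spacex_prompted_skill
print(container_name_for_prompt("dream_persona"))  # dff-dream-persona-prompted-skill
```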
Generative Skill is a new kind of Skill in the DeepPavlov Dream Platform.
Note: In the DeepPavlov Dream platform, there are several layers of abstraction. Folders in the corresponding component folders (`annotators`, `skills`, etc.) represent individual components. Docker Compose files define containers that incorporate files in the aforementioned component folders into the services that are used in the `pipeline_conf.json` by DeepPavlov Agent in the specified order to control the asynchronous pipeline.

While typically each Skill in the DeepPavlov Dream AI Assistant distributions has its own folder, Generative Skills currently differ only by the prompts and the Generative AI models they use to generate responses. Hence, there is currently only one Generative Skill, known as `dff_template_prompted_skill`, stored in the `skills` directory of the repository.
However, while the code folder is the same, each individual Generative Skill has to be defined separately in the `docker-compose.override.yml` and `dev.yml` files.
To edit the configuration of the given Generative Skill, change the environment variables through the `args` section of the Generative Skill's entry in the `docker-compose.override.yml` file:

- `PROMPT_FILE`: used to specify the name of the prompt file stored in the `common/prompts/` folder;
- `GENERATIVE_SERVICE_URL`: used to specify the URL of the Generative AI Model Service used by the given Generative Skill;
- `GENERATIVE_TIMEOUT`: used to specify the timeout after which the skill will stop expecting the response from the target Generative AI Model and gracefully fail;
- `N_UTTERANCES_CONTEXT`: used to specify the length of the dialog context provided to the Generative AI Model in addition to the prompt. It is the number of dialog phrases (utterances or user turns).

Thus, to add a new Generative Skill to your Generative AI Assistant distribution, you do not have to add a new folder. Instead, you should add a new entry to both the `docker-compose.override.yml` and `dev.yml` files of your Generative AI Assistant distribution.
In addition to the limitations common for Generative AI models, this release has the following limitations:
- In this release, you will need a machine with a sufficiently powerful GPU to run Generative AI Models like GPT-J locally. In future releases, we plan to introduce a mechanism for using OpenAI models such as GPT-3, which will eliminate the need for a powerful GPU. However, please note that using OpenAI models will require a separate payment to OpenAI;
- The same prompt is used during the entire dialog for each Generative Skill;
- If the conversation significantly deviates from the original prompt for any of the Generative Skills, the platform may stop invoking the associated Generative Skills and provide fallback responses;
- When using the Prompt Selector, it's important to ensure that the number of prompts provided corresponds to the number of Generative Skills in your Generative AI Assistant. If you provide more prompts than available Generative Skills, the Skill Selector will automatically include the names of the Generative Skills specific for each prompt, even if the corresponding skills are unavailable. This can cause the system to provide fallback responses using its Fallback Skill ("Dummy");
- To prevent issues with the system, it's important to name prompt files using lowercase letters, avoiding spaces and special symbols, and using underscores (`_`) instead of hyphens (`-`).
As part of this release, we are providing a template Generative AI Assistant distribution called the "Dream Persona Prompted Distribution." This example demonstrates how a Generative AI Assistant can be created on top of DeepPavlov Dream platform, featuring two distinct skills: a Generative Skill called "DFF Dream Persona Prompted Skill" and a fallback skill called "Dummy," which is used when the Generative Skill cannot provide a response to the user.
You can use this distribution as a template from which you can build your own Generative AI Assistant Distribution. You can have just one Generative Skill, or you can add several Generative Skills. The DeepPavlov Dream platform has been updated to automatically support switching between multiple Generative Skills by introducing a new component called Prompt Selector and by enhancing the existing Skill Selector to automatically pick the most relevant Generative Skills by matching the user's utterance with the prompts of the included Generative Skills.
Dream Prompted Generative AI Assistant distribution is an example of the prompt-based Generative AI Assistant dialog system which contains one prompt-based Generative Skill.
This distribution contains the following skills:
- Fallback Skill called Dummy Skill (`dummy_skill`);
- Generative Skill called DFF Dream Persona Prompted Skill (`dff_dream_persona_prompted_skill`), a skill created using DeepPavlov DFF (Dialog Flow Framework) which generates a response to the current dialog context taking into account the given prompt, e.g., the bot's persona description.
The DFF Dream Persona Prompted Skill is a light-weight container sending requests to the Generative AI Model Service which hosts a neural network to provide prompt-based response generation. Per design of the Generative Skills, the DFF Dream Persona Prompted Skill accepts these environmental variables:
- `SERVICE_PORT` defines the port used by the service; it should be the same as in `dev.yml`;
- `SERVICE_NAME` contains the name of the service used by the platform, e.g., `dff_dream_persona_prompted_skill`;
- `PROMPT_FILE` contains a path to a JSON file containing a dictionary with the prompt;
- `GENERATIVE_SERVICE_URL` contains a URL of the generative service to be used. The service must utilize the same input-output format as Transformers-LM (`transformers_lm`);
- `GENERATIVE_TIMEOUT` defines the timeout after reaching which the Generative Skill will stop waiting for a response from the Generative AI Model Service specified in the `GENERATIVE_SERVICE_URL` above, and gracefully fail;
- `N_UTTERANCES_CONTEXT` contains the length of the considered context in terms of the number of dialog utterances.
Note: DFF Dream Persona Prompted Skill utilizes a special universal template, `skills/dff_template_prompted_skill`, which does not require creation of a new skill directory. When creating a new skill, you should utilize the same template folder but specify another prompt file, service port, and container name.
This sample distribution includes one Prompt Selector.
Note: In this distribution's `docker-compose.override.yml` we specify a list of 2 prompts to the Prompt Selector: `dream_persona,pizza`. This is done just to demonstrate the input format of `PROMPTS_TO_CONSIDER`. As this distribution contains only one Generative Skill, which utilizes the Dream Persona prompt (`dream_persona`), please remove `,pizza` from the `docker-compose.override.yml` prior to running this distribution.
You do not need to make any changes to the Skill Selector; it works automatically.
docker-compose -f docker-compose.yml -f assistant_dists/dream_persona_prompted/docker-compose.override.yml -f assistant_dists/dream_persona_prompted/dev.yml -f assistant_dists/dream_persona_prompted/proxy.yml up --build
Note: In this release, you will need a machine with the powerful-enough GPU to run Generative AI Models like GPT-J locally.
docker-compose exec agent python -m deeppavlov_agent.run agent.channel=cmd agent.pipeline_config=assistant_dists/dream_persona_prompted/pipeline_conf.json
If you want to create a new Generative AI Assistant distribution (a distribution containing prompt-based Generative Skill(s)), follow these instructions:
- Create a `genassistants` folder in the folder of your choice: `mkdir genassistants`
- Move to it: `cd genassistants`
- Clone dreamtools to your local machine: `git clone https://github.com/deeppavlov/deeppavlov_dreamtools`
- Move to its folder: `cd deeppavlov_dreamtools`
- Install dreamtools using pip3: `pip3 install -e .`
- Move back to the `genassistants` folder: `cd ..`
- Clone dream: `git clone https://github.com/deeppavlov/dream`
- Move to its folder: `cd dream`
- Check that dreamtools works by running `dreamtools`. You should get a message starting with `Usage: dreamtools [OPTIONS] COMMAND [ARGS] ...`.
Note: In this How-To we will create a SpaceX Generative AI Assistant, with the name spacex_ga_assistant
.
- Use dreamtools to create your own Generative AI Assistant distribution from the template provided above, where `spacex_ga_assistant` is the name of your new distribution and `dream_persona_prompted` is the template distribution's name:
dreamtools clone dist spacex_ga_assistant --template dream_persona_prompted --display-name "My SpaceX AI Assistant" --description "You could have placed an ad here"
- Create a new prompt and place it in `common/prompts/<prompt_name>.json`. Note: In this How-To we will use an existing prompt, `spacex.json`.
- In `docker-compose.override.yml` find `prompt-selector`, and in its `args` change the `PROMPTS_TO_CONSIDER` value to `spacex`.
- In `docker-compose.override.yml` find `prompt-selector`, and in its `args` change `N_SENTENCES_TO_RETURN` to the number of prompted Generative Skills, e.g., `1`, as you have just one such skill.
- Create new prompts for each of your skills and place them in the `common/prompts/` subfolder. The names of these prompts should be used to name their corresponding Generative Skills. It is advised to avoid using hyphens (`-`) or underscores (`_`) to make maintenance of the whole system easier and less error-prone. Note: In this How-To we will use the existing prompt files `spacex.json`, `pizza.json`, and `dream_persona.json`.
- In `docker-compose.override.yml` find `prompt-selector`, and in its `args` change the `PROMPTS_TO_CONSIDER` value to `spacex,pizza,dream_persona`.
- In `docker-compose.override.yml` find `prompt-selector`, and in its `args` change `N_SENTENCES_TO_RETURN` to the number of prompted Generative Skills, e.g., `3`, as you have 3 such skills.
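After these steps, the `prompt-selector` `args` in `docker-compose.override.yml` should contain values along these lines (only the two parameters discussed here are shown):

```yaml
args:
  N_SENTENCES_TO_RETURN: 3
  PROMPTS_TO_CONSIDER: spacex,pizza,dream_persona
```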
If you want to have just one Generative Skill, you can re-use the existing skill, and then change the file with the prompt, and use that to run your solution.
To add more Generative Skills, currently you have to make changes to several configuration files. Note: In the future releases, we plan to update our dreamtools to make adding new Generative Skills easier for you. Stay tuned!
Go to your new distribution's subfolder in `assistant_dists` (the one we created above), like this: `cd assistant_dists/spacex_ga_assistant`
Then, for each of your new Generative Skills, follow these instructions:
- Copy & paste these lines to create new Generative Skills:

```yaml
dff-dream-persona-prompted-skill:
  volumes:
    - "./skills/dff_template_prompted_skill:/src"
    - "./common:/src/common"
  ports:
    - 8134:8134
```
- Replace `dff-dream-persona-prompted-skill` with the names of your new Generative Skills based on the names of the prompts you've created above. Note that Skill Selector automatically picks new Generative Skills by replacing `<prompt_name>` in `dff-<prompt_name>-prompted-skill`. Note: In this How-To we use the existing prompt files `spacex.json`, `pizza.json`, and `dream_persona.json`. This means that you need to add just two more blocks describing your Generative Skills, given that the `dream-persona` skill has already been added to your Distribution at its creation. The names of your new Generative Skills should be as follows: `dff-spacex-prompted-skill` and `dff-pizza-prompted-skill`.
- Replace `SERVICE_NAME`
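For example, a resulting `dev.yml` entry for the hypothetical `dff-spacex-prompted-skill` might look like this (port `8135` is an assumed free port; pick any port not taken by other services):

```yaml
dff-spacex-prompted-skill:
  volumes:
    - "./skills/dff_template_prompted_skill:/src"
    - "./common:/src/common"
  ports:
    - 8135:8135
```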
- Copy & paste these lines to create new Generative Skills:

```yaml
dff-dream-persona-prompted-skill:
  env_file: [ .env ]
  build:
    args:
      SERVICE_PORT: 8134
      SERVICE_NAME: dff_dream_persona_prompted_skill
      PROMPT_FILE: common/prompts/dream_persona.json
      GENERATIVE_SERVICE_URL: http://transformers-lm-gptj:8130/respond
      GENERATIVE_TIMEOUT: 8
      N_UTTERANCES_CONTEXT: 3
    context: .
    dockerfile: ./skills/dff_template_prompted_skill/Dockerfile
  command: gunicorn --workers=1 server:app -b 0.0.0.0:8134 --reload
  deploy:
    resources:
      limits:
        memory: 128M
      reservations:
        memory: 128M
```
- For each of the skills you've defined in the `dev.yml`, replace `dff-dream-persona-prompted-skill` with the name of your new Generative Skill. Note: In this How-To we use the existing prompt files `spacex.json`, `pizza.json`, and `dream_persona.json`. This means that you need to add just two more blocks describing your Generative Skills, given that the `dream-persona` skill has already been added to your Distribution at its creation. The names of your new Generative Skills should be as follows: `dff-spacex-prompted-skill` and `dff-pizza-prompted-skill`.
- For each of the skills, change the `SERVICE_NAME` to reflect the service's name; for example, for `dff-spacex-prompted-skill` you should specify `SERVICE_NAME: dff_spacex_prompted_skill`. Note: In `SERVICE_NAME` we use underscores (`_`), while in the container names in the Docker Compose files we use hyphens (`-`). It is imperative to follow this notation.
- For each of the skills, change the `SERVICE_PORT` to reflect the service's port specified in the `dev.yml`.
- For each of the skills, specify their respective `PROMPT_FILE` by setting the param's value to the relative path to the prompt file, e.g., for `dff-spacex-prompted-skill` set `PROMPT_FILE` to `common/prompts/spacex.json`.
- For each of the skills, change the port in the `command` line to reflect the service's port specified in the `SERVICE_PORT` value above.
- If you use different Generative AI Model Services, specify the respective service's URL in the `GENERATIVE_SERVICE_URL`.
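Applying all of these replacements, the `docker-compose.override.yml` entry for the hypothetical `dff-spacex-prompted-skill` might look like the following sketch (port `8135` is an assumed free port and must match the one in `dev.yml`):

```yaml
dff-spacex-prompted-skill:
  env_file: [ .env ]
  build:
    args:
      SERVICE_PORT: 8135
      SERVICE_NAME: dff_spacex_prompted_skill
      PROMPT_FILE: common/prompts/spacex.json
      GENERATIVE_SERVICE_URL: http://transformers-lm-gptj:8130/respond
      GENERATIVE_TIMEOUT: 8
      N_UTTERANCES_CONTEXT: 3
    context: .
    dockerfile: ./skills/dff_template_prompted_skill/Dockerfile
  command: gunicorn --workers=1 server:app -b 0.0.0.0:8135 --reload
```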
- For each of the skills, duplicate this function:

```python
def dff_dream_persona_prompted_skill_formatter(dialog):
    return utils.dff_formatter(
        dialog,
        "dff_dream_persona_prompted_skill",
        types_utterances=["human_utterances", "bot_utterances", "utterances"],
    )
```

- Replace `dream_persona` with the name of that skill in this function (in two places), e.g., for `dff_spacex_prompted_skill`, replace `dream_persona` with `spacex`.
- For each of your Generative Skills, copy & paste the section that describes the original DFF Dream Persona Prompted Skill:

```json
"dff_dream_persona_prompted_skill": {
    "connector": {
        "protocol": "http",
        "timeout": 4.5,
        "url": "http://dff-dream-persona-prompted-skill:8134/respond"
    },
    "dialog_formatter": "state_formatters.dp_formatters:dff_dream_persona_prompted_skill_formatter",
    "response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service",
    "previous_services": [ "skill_selectors" ],
    "state_manager_method": "add_hypothesis"
},
```
- Replace the strings `dream-persona` with `<prompt-name>` (container names use hyphens) and `dream_persona` with `<prompt_name>` (component names use underscores). This will change the container name, skill name, and formatter name.
- Replace the port (`8134` in the example) with the one assigned in `dream/assistant_dists/dream_custom_prompted/docker-compose.override.yml`.
- If you do not want to keep DFF Dream Persona Prompted Skill in your distribution, remove all mentions of the DFF Dream Persona Prompted Skill container from the `yml` configs and the `pipeline_conf.json` file.
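For the hypothetical `dff_spacex_prompted_skill`, the resulting `pipeline_conf.json` section might look like this (port `8135` is an assumed free port, which should match the one used in your Docker Compose files):

```json
"dff_spacex_prompted_skill": {
    "connector": {
        "protocol": "http",
        "timeout": 4.5,
        "url": "http://dff-spacex-prompted-skill:8135/respond"
    },
    "dialog_formatter": "state_formatters.dp_formatters:dff_spacex_prompted_skill_formatter",
    "response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service",
    "previous_services": [ "skill_selectors" ],
    "state_manager_method": "add_hypothesis"
},
```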
Note: Please take into account that naming skills using `<prompt_name>` according to the instructions above is very important: it enables Skill Selector to automatically turn on the prompt-based skills whose prompts are returned among the `N_SENTENCES_TO_RETURN` most relevant prompts.
Note: In this example, we use the `spacex_ga_assistant` name:
docker-compose -f docker-compose.yml -f assistant_dists/spacex_ga_assistant/docker-compose.override.yml -f assistant_dists/spacex_ga_assistant/dev.yml -f assistant_dists/spacex_ga_assistant/proxy.yml up --build
docker-compose exec agent python -m deeppavlov_agent.run agent.channel=cmd agent.pipeline_config=assistant_dists/spacex_ga_assistant/pipeline_conf.json
Note: Make sure to replace `spacex_ga_assistant` with the name of your Generative AI Assistant distribution.