How to Create Generative AI Assistant Distribution

Daniel Kornev edited this page Mar 13, 2023 · 20 revisions

Generative AI Assistant Distributions

Prompt-based control has become increasingly popular in the development of generative AI models, particularly for dialog systems. This approach adds a lightweight control layer on top of the generative model, making it easier to manage and fine-tune the model's responses.

At DeepPavlov, we have developed a new approach to building Generative AI Assistants using the DeepPavlov Dream platform with prompt-based control. In this How-To guide, we will introduce you to our approach and walk you through the steps to develop your own generative AI assistant.

Main Concepts

  • Prompts are used as a lightweight mechanism for controlling the behavior of a Generative AI Model. A prompt is automatically added at the beginning of the input for the given model.
  • Generative AI Model Service is a new service component designed to run a given Generative AI Model locally.
  • Generative Skills are a type of skill in the DeepPavlov Dream platform that encapsulates generative AI models, also known as Large Language Models (LLMs). Currently, these skills provide you with the means to control the behavior of generative AI models using prompt engineering.
  • Generative AI Assistant Distributions are a type of AI Assistant distribution in the DeepPavlov Dream platform designed to use generative AI models to provide conversational experiences to end-users through the incorporation of custom Generative Skills.

Methods for Controlling Generative AI Models

In this release, two methods for controlling Generative AI Models are provided:

  • Each Generative Skill is controlled through a prompt unique to it. The prompt is automatically added at each conversation turn, allowing the developer to limit the model's behavior to the provided prompt.
  • Multiple Generative Skills can be used in the same Generative AI Assistant. The DeepPavlov Dream Platform automatically calls the most relevant Generative Skill based on the similarity of the user's phrase to the given skill's prompt.
  • If the user says something irrelevant to any of the known prompts, the Generative AI Assistant will provide a fallback response using the Dummy Skill.

Note: In upcoming releases, we plan to enhance our skills by incorporating structured representations of episodic and working memory based on the dialog state and conversation history stored within DeepPavlov Dream. This will enable tighter control of Large Language Models (LLMs). Additionally, we will be adding support for calling APIs on behalf of the user. We're excited to share these updates with you, so be sure to stay tuned for more information.

Generative AI Assistant Platform Components

Code

Prompts

A new subfolder, prompts, has been added to the common folder of the platform. To add new prompts to your Generative AI Assistant, add them to common/prompts/ as JSON files.

Note: The name of the prompt is used across the entire platform, so it should contain only English alphabet characters (no symbols). If you want a space in the prompt's name, like SpaceX Prompt, use an underscore instead, like this: spacex_prompt.json. Avoid using hyphens (-) in the prompt's name.
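A prompt file is a small JSON dictionary containing the prompt text. A minimal sketch is shown below; the exact key name is an assumption here, so check an existing file such as common/prompts/dream_persona.json for the format your Dream version expects:

```json
{
    "prompt": "You are an enthusiastic assistant that answers questions about SpaceX rockets and missions."
}
```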

Services

Generative AI Model Service

A new component, the Generative AI Model Service, has been added to the platform to abstract the method of calling Generative AI Models away from the Generative Skills. In the current release, you can use any of the models available in the Transformers library.

For each Generative AI Model, you should add a separate entry to the docker-compose.override.yml and dev.yml files.

Note: Multiple Generative Skills can use the same Generative AI Model, so only add new instances of the Generative AI Model Service when your Generative Skills need to use different generative AI models. E.g., if the only model your Generative Skills use is GPT-J, then you should use the same Generative AI Model Service for all Generative Skills. If you use both GPT-J and BLOOM, then you should create two separate instances of the service, one for GPT-J, and one for BLOOM.

Configurable Parameters:

To edit the configuration of each of the Generative AI Model Services, change the environmental variables through the args area of the given Generative AI Model Service's component in the docker-compose.override.yml file:

  • PRETRAINED_MODEL_NAME_OR_PATH: used to specify the name of the pre-trained Large Language Model;
  • CONFIG_NAME: used to specify the name of the config file of the selected pre-trained model in the Transformers library. Currently, you can use the pre-defined name generative_ai_model_params.json. You do not have to change the name of the file; however, you can change its values like temperature and other params if needed.

Here is the list of models you can use:

| Generative AI Model | PRETRAINED_MODEL_NAME_OR_PATH |
| ------------------- | ----------------------------- |
| GPT-J               | EleutherAI/gpt-j-6B           |
| GPT-2               | gpt2                          |
| GPT-Neo 2.7B        | EleutherAI/gpt-neo-2.7B       |
| OPT 6.7B            | facebook/opt-6.7b             |
| BLOOMZ 3B           | bigscience/bloomz-3b          |
| BLOOM 3B            | bigscience/bloom-3b           |
| OPT 125M            | facebook/opt-125m             |
| BLOOM 560M          | bigscience/bloom-560m         |
| BLOOM 7B1           | bigscience/bloom-7b1          |

Example: PRETRAINED_MODEL_NAME_OR_PATH: EleutherAI/gpt-j-6B
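Putting both parameters together, a Generative AI Model Service entry in docker-compose.override.yml might look like the following sketch. The container name transformers-lm-gptj and port 8130 are taken from the examples later in this guide; adjust them to your setup:

```yaml
transformers-lm-gptj:
  env_file: [ .env ]
  build:
    args:
      SERVICE_PORT: 8130
      PRETRAINED_MODEL_NAME_OR_PATH: EleutherAI/gpt-j-6B
      CONFIG_NAME: generative_ai_model_params.json
```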

Annotators

Prompt Selector

Prompt Selector is designed to support automatic switching between multiple Generative Skills. It is a light-weight container that utilizes the Sentence Ranker component to automatically match the user's phrase with the known prompts.

Configurable Parameters:

To edit the configuration of the Prompt Selector, change the environmental variables through the args area of the prompt-selector component in the docker-compose.override.yml file.

  • N_SENTENCES_TO_RETURN: used to specify the number of sentences returned by the Prompt Selector. In practice, this number limits the number of sentences returned by the Sentence Ranker and is later used to automatically match prompts with the corresponding Generative Skills;
  • PROMPTS_TO_CONSIDER: used to specify the prompts the Prompt Selector should match the user's phrase with. Place the names of the prompts in a comma-delimited fashion, without spaces, like this:

PROMPTS_TO_CONSIDER: prompt_name_1,prompt_name_2

Each considered prompt should be placed in the dream/common/prompts/ folder as <prompt_name>.json.

Note: Ensure the consistency of your Generative AI Assistant: the value of the N_SENTENCES_TO_RETURN parameter should not exceed either the number of available prompts or the number of Generative Skills. Failing to ensure this consistency will lead to unexpected behavior, such as the Prompt Selector suggesting prompts for Generative Skills not included in your Generative AI Assistant.
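Putting both parameters together, the prompt-selector entry's args in docker-compose.override.yml might look like this sketch (the prompt names dream_persona and pizza match the sample distribution described later in this guide):

```yaml
prompt-selector:
  env_file: [ .env ]
  build:
    args:
      N_SENTENCES_TO_RETURN: 2
      PROMPTS_TO_CONSIDER: dream_persona,pizza
```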

Skill Selectors

(Generative) Skill Selector

The updated Skill Selector is designed to support Generative AI Assistants by automatically picking the most relevant Generative Skill through matching the user's phrase with the known prompts.

It uses the aforementioned common/prompts/ folder to find the list of known prompts, and automatically creates a list of Generative Skills by replacing prompt_name in the skill's name with the name of the selected prompt. For example, if there is a SpaceX Generative Skill and its prompt's name is spacex, then the Skill Selector will automatically replace prompt_name with spacex as follows:

dff_{prompt_name}_prompted_skill --> dff_spacex_prompted_skill
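The substitution above can be sketched in a few lines of Python. This is illustrative only, not the platform's actual code:

```python
# Illustrative sketch of the substitution described above: the Skill Selector
# derives each Generative Skill's name from the name of its prompt.
def skill_name_for_prompt(prompt_name: str) -> str:
    return f"dff_{prompt_name}_prompted_skill"

print(skill_name_for_prompt("spacex"))  # dff_spacex_prompted_skill
```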

Skills

Generative Skill

Generative Skill is a new kind of Skill in the DeepPavlov Dream Platform.

Note: In the DeepPavlov Dream platform, there are several layers of abstraction. Folders in the corresponding component folders (annotators, skills etc.) represent individual components. Docker Compose files define containers that incorporate files in the aforementioned component folders into the services that are used in the pipeline_conf.json by DeepPavlov Agent in the specified order to control the asynchronous pipeline.

While typically each Skill in the DeepPavlov Dream AI Assistant distributions has its own folder, Generative Skills currently differ only in the prompts and the Generative AI models they use to generate responses. Hence, there is currently only one Generative Skill, known as dff_template_prompted_skill, stored in the skills directory of the repository.

However, while the code folder is the same, each individual Generative Skill has to be defined separately in the docker-compose.override.yml and dev.yml files.

Configurable Parameters

To edit the configuration of the given Generative Skill, change the environmental variables through the args area of the Generative Skill's entry in the docker-compose.override.yml file.

  • PROMPT_FILE: used to specify the name of the prompt file stored in the common/prompts/ folder;
  • GENERATIVE_SERVICE_URL: used to specify the URL of the Generative AI Model Service used by the given Generative Skill;
  • GENERATIVE_TIMEOUT: used to specify the timeout after which the skill will stop expecting a response from the target Generative AI Model and gracefully fail;
  • N_UTTERANCES_CONTEXT: used to specify the length of the dialog context provided to the Generative AI Model in addition to the prompt, as a number of dialog phrases (utterances or user turns).

Thus, to add a new Generative Skill to your Generative AI Assistant distribution, you do not have to add a new folder. Instead, you should add a new entry to both the docker-compose.override.yml and dev.yml files of your Generative AI Assistant distribution.

Limitations

In addition to the limitations common for Generative AI models, this release has the following limitations:

  • In this release, you will need a machine with a powerful-enough GPU to run Generative AI Models like GPT-J locally. In future releases, we plan to introduce a mechanism for using OpenAI models such as GPT-3, which will eliminate the need for a powerful GPU. However, please note that using OpenAI models will require a separate payment to OpenAI;
  • The same prompt is used during the entire dialog for each Generative Skill;
  • If the conversation significantly deviates from the original prompt for any of the Generative Skills, the platform may stop invoking the associated Generative Skills and provide fallback responses;
  • When using the Prompt Selector, it's important to ensure that the number of prompts provided corresponds to the number of Generative Skills in your Generative AI Assistant. If you provide more prompts than available Generative Skills, the Skill Selector will automatically include the names of the Generative Skills specific to each prompt, even if the corresponding skills are unavailable. This can cause the system to provide fallback responses using its Fallback Skill ("Dummy");
  • To prevent issues with the system, it's important to name prompt files using lowercase letters, avoiding spaces and special symbols, and using underscores _ instead of hyphens -.

Examples

As part of this release, we are providing a template Generative AI Assistant distribution called the "Dream Persona Prompted Distribution." This example demonstrates how a Generative AI Assistant can be created on top of DeepPavlov Dream platform, featuring two distinct skills: a Generative Skill called "DFF Dream Persona Prompted Skill" and a fallback skill called "Dummy," which is used when the Generative Skill cannot provide a response to the user.

You can use this distribution as a template from which to build your own Generative AI Assistant Distribution. You can have just one Generative Skill, or you can add several. The DeepPavlov Dream platform has been updated to automatically support switching between multiple Generative Skills by introducing a new component called Prompt Selector and by enhancing the existing Skill Selector to automatically pick the most relevant Generative Skills by matching the user's utterance with the prompts of the included Generative Skills.

Dream Prompted Distribution

Dream Prompted Generative AI Assistant distribution is an example of the prompt-based Generative AI Assistant dialog system which contains one prompt-based Generative Skill.

This distribution contains the following skills:

  • Dummy Skill (dummy_skill), a fallback skill;
  • DFF Dream Persona Prompted Skill (dff_dream_persona_prompted_skill), a Generative Skill created using DeepPavlov DFF (Dialog Flow Framework) which generates a response to the current dialog context taking into account the given prompt, e.g., the bot's persona description.

DFF Dream Persona Prompted Skill

The DFF Dream Persona Prompted Skill is a light-weight container sending requests to the Generative AI Model Service which hosts a neural network to provide prompt-based response generation. Per design of the Generative Skills, the DFF Dream Persona Prompted Skill accepts these environmental variables:

  • SERVICE_PORT defines the port used by the service; it should be the same as in dev.yml;
  • SERVICE_NAME contains the name of the service used by the platform, e.g.: dff_dream_persona_prompted_skill;
  • PROMPT_FILE contains a path to a JSON file containing dictionary with prompt;
  • GENERATIVE_SERVICE_URL contains a URL of the generative service to be used;
  • GENERATIVE_TIMEOUT defines the timeout after which the Generative Skill will stop waiting for a response from the Generative AI Model Service specified in GENERATIVE_SERVICE_URL above, and gracefully fail. The service must utilize the same input-output format as Transformers-LM (transformers_lm);
  • N_UTTERANCES_CONTEXT defines the length of the considered context in terms of the number of dialog utterances.
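To illustrate the N_UTTERANCES_CONTEXT parameter, here is a hypothetical sketch of how a skill might assemble the model input from the prompt and the last few dialog utterances. This is not the skill's actual code, and the newline-joining format is an assumption:

```python
def build_model_input(prompt: str, utterances: list[str], n_utterances_context: int) -> str:
    # Keep only the last N_UTTERANCES_CONTEXT dialog phrases.
    context = utterances[-n_utterances_context:]
    # Prepend the prompt to the truncated dialog context.
    return "\n".join([prompt] + context)
```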

Note: The DFF Dream Persona Prompted Skill utilizes a special universal template, skills/dff_template_prompted_skill, which does not require the creation of a new skill directory. When creating a new skill, you should utilize the same template folder but specify a different prompt file, service port, and container name.

Prompt Selector

This sample distribution includes one Prompt Selector. Note: In this distribution's docker-compose.override.yml we specify a list of two prompts for the Prompt Selector: dream_persona,pizza. This is done purely to demonstrate the input format of PROMPTS_TO_CONSIDER. As this distribution contains only one Generative Skill, which utilizes the Dream Persona prompt (dream_persona), please remove ,pizza from docker-compose.override.yml prior to running this distribution.

Skill Selector

You do not need to make any changes to the Skill Selector; it works automatically.

How to run Dream Prompted Generative AI Assistant Distribution

docker-compose -f docker-compose.yml -f assistant_dists/dream_persona_prompted/docker-compose.override.yml -f assistant_dists/dream_persona_prompted/dev.yml -f assistant_dists/dream_persona_prompted/proxy.yml up --build

Note: In this release, you will need a machine with the powerful-enough GPU to run Generative AI Models like GPT-J locally.

Let's chat! In a separate terminal tab run:

docker-compose exec agent python -m deeppavlov_agent.run agent.channel=cmd agent.pipeline_config=assistant_dists/dream_persona_prompted/pipeline_conf.json

How to Create a New Generative AI Assistant Distribution

If you want to create a new Generative AI Assistant distribution (distribution containing prompt-based Generative Skill(s)), follow these instructions:

Pre-Requisites

  1. Create genassistants folder in the folder of your choice: mkdir genassistants
  2. Move to it: cd genassistants
  3. Clone dreamtools to your local machine: git clone https://github.com/deeppavlov/deeppavlov_dreamtools
  4. Move to its folder: cd deeppavlov_dreamtools
  5. Install dreamtools using pip3: pip3 install -e .
  6. Move back to the genassistants folder: cd ..
  7. Clone dream: git clone https://github.com/deeppavlov/dream
  8. Move to its folder: cd dream
  9. Check that dreamtools works by running dreamtools. You should get a message starting with Usage: dreamtools [OPTIONS] COMMAND [ARGS] ....

Create Your Generative AI Assistant Distribution

Note: In this How-To we will create a SpaceX Generative AI Assistant, with the name spacex_ga_assistant.

  1. Use dreamtools to create your own Generative AI Assistant distribution from the template, where spacex_ga_assistant is the name of your new distribution and dream_persona_prompted is the template distribution's name:

dreamtools clone dist spacex_ga_assistant --template dream_persona_prompted --display-name "My SpaceX AI Assistant" --description "You could have placed an ad here"

Configure Your Custom Prompt(s)

One Prompt

  1. Create a new prompt and place it in common/prompts/<prompt_name>.json. Note: In this How-To we will use an existing prompt, spacex.json.
  2. In docker-compose.override.yml find prompt-selector, and in its args change PROMPTS_TO_CONSIDER value to spacex.
  3. In docker-compose.override.yml find prompt-selector, and in its args change N_SENTENCES_TO_RETURN to the number of prompted Generative Skills, e.g., 1, as you have just one such skill.

Multiple Prompts

  1. Create new prompts for each of your skills and place them in the common/prompts/ subfolder. The names of these prompts are used to name their corresponding Generative Skills. It is advised to avoid hyphens (-) and to use underscores (_) instead, to make maintenance of the whole system easier and less error-prone. Note: In this How-To we will use the existing prompt files spacex.json, pizza.json, and dream_persona.json.
  2. In docker-compose.override.yml find prompt-selector, and in its args change PROMPTS_TO_CONSIDER value to spacex,pizza,dream_persona.
  3. In docker-compose.override.yml find prompt-selector, and in its args change N_SENTENCES_TO_RETURN to the number of prompted Generative Skills, e.g., 3, as you have 3 such skills.

Create Generative Skill(s)

One Generative Skill

If you want to have just one Generative Skill, you can reuse the existing skill: simply change the prompt file it uses, and run your solution.

Multiple Generative Skills

To add more Generative Skills, currently you have to make changes to several configuration files. Note: In the future releases, we plan to update our dreamtools to make adding new Generative Skills easier for you. Stay tuned!

Go to your new distribution's subfolder in assistant_dists, like this: cd assistant_dists/spacex_ga_assistant (the one we created above).

Then, for each of your new Generative Skills, follow these instructions:

dev.yml

  1. Copy & paste these lines to create a new Generative Skill:

    dff-dream-persona-prompted-skill:
      volumes:
        - "./skills/dff_template_prompted_skill:/src"
        - "./common:/src/common"
      ports:
        - 8134:8134
  2. Replace dff-dream-persona-prompted-skill with the names of your new Generative Skills, based on the names of the prompts you've created above. Note that the Skill Selector automatically picks new Generative Skills by substituting <prompt_name> in dff-<prompt_name>-prompted-skill. Note: In this How-To we use the existing prompt files spacex.json, pizza.json, and dream_persona.json. This means that you need to add just two more blocks describing your Generative Skills, given that the dream-persona skill has already been added to your Distribution at its creation. The names of your new Generative Skills should be as follows: dff-spacex-prompted-skill and dff-pizza-prompted-skill.
  3. Assign each new skill its own unique port (replace 8134 in both places of the ports mapping).
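Applied to the spacex prompt, the resulting dev.yml entry might look like this sketch (port 8135 is an assumption here; choose any free port for each new skill):

```yaml
dff-spacex-prompted-skill:
  volumes:
    - "./skills/dff_template_prompted_skill:/src"
    - "./common:/src/common"
  ports:
    - 8135:8135
```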

docker-compose.override.yml

  1. Copy & paste these lines to create a new Generative Skill:
dff-dream-persona-prompted-skill:
   env_file: [ .env ]
   build:
      args:
         SERVICE_PORT: 8134
         SERVICE_NAME: dff_dream_persona_prompted_skill
         PROMPT_FILE: common/prompts/dream_persona.json
         GENERATIVE_SERVICE_URL: http://transformers-lm-gptj:8130/respond
         GENERATIVE_TIMEOUT: 8
         N_UTTERANCES_CONTEXT: 3
      context: .
      dockerfile: ./skills/dff_template_prompted_skill/Dockerfile
   command: gunicorn --workers=1 server:app -b 0.0.0.0:8134 --reload
   deploy:
      resources:
         limits:
            memory: 128M
         reservations:
            memory: 128M
  2. For each of the skills you've defined in dev.yml, replace dff-dream-persona-prompted-skill with the name of your new Generative Skill. Note: In this How-To we use the existing prompt files spacex.json, pizza.json, and dream_persona.json. This means that you need to add just two more blocks describing your Generative Skills, given that the dream-persona skill has already been added to your Distribution at its creation. The names of your new Generative Skills should be as follows: dff-spacex-prompted-skill and dff-pizza-prompted-skill.
  3. For each of the skills, change SERVICE_NAME to reflect the service's name; for example, for dff-spacex-prompted-skill you should specify SERVICE_NAME: dff_spacex_prompted_skill. Note: In SERVICE_NAME we use underscores (_), while the container names in the Docker Compose files use hyphens (-). It is imperative to follow this notation.
  4. For each of the skills, change SERVICE_PORT to reflect the service's port specified in dev.yml.
  5. For each of the skills, specify their respective PROMPT_FILE by setting the parameter's value to the relative path of the prompt file; e.g., for dff-spacex-prompted-skill set PROMPT_FILE to common/prompts/spacex.json.
  6. For each of the skills, change the port in the command line to reflect the service's port specified in the SERVICE_PORT value above.
  7. If your skills use different Generative AI Model Services, specify the corresponding service's URL in GENERATIVE_SERVICE_URL.
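After applying these steps for the spacex prompt, the resulting docker-compose.override.yml entry might look like this sketch (port 8135 is an assumption; use the port you assigned in dev.yml):

```yaml
dff-spacex-prompted-skill:
  env_file: [ .env ]
  build:
    args:
      SERVICE_PORT: 8135
      SERVICE_NAME: dff_spacex_prompted_skill
      PROMPT_FILE: common/prompts/spacex.json
      GENERATIVE_SERVICE_URL: http://transformers-lm-gptj:8130/respond
      GENERATIVE_TIMEOUT: 8
      N_UTTERANCES_CONTEXT: 3
    context: .
    dockerfile: ./skills/dff_template_prompted_skill/Dockerfile
  command: gunicorn --workers=1 server:app -b 0.0.0.0:8135 --reload
  deploy:
    resources:
      limits:
        memory: 128M
      reservations:
        memory: 128M
```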

dp_formatters/state_formatters.py

  1. For each of the skills, duplicate function:
    def dff_dream_persona_prompted_skill_formatter(dialog):
        return utils.dff_formatter(
            dialog, "dff_dream_persona_prompted_skill",
            types_utterances=["human_utterances", "bot_utterances", "utterances"]
        )
  2. Replace dream_persona with the name of that skill's prompt in this function (in two places); e.g., for dff_spacex_prompted_skill, replace dream_persona with spacex.
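Since the duplicated formatters differ only in the skill name passed to utils.dff_formatter, the renaming can also be expressed as a small factory. This helper is hypothetical (it is not part of the repository) and only illustrates the pattern:

```python
def make_prompted_skill_formatter(prompt_name, dff_formatter):
    # dff_formatter stands in for utils.dff_formatter from state_formatters.py.
    skill_name = f"dff_{prompt_name}_prompted_skill"

    def formatter(dialog):
        return dff_formatter(
            dialog, skill_name,
            types_utterances=["human_utterances", "bot_utterances", "utterances"],
        )

    # Name the function the way pipeline_conf.json expects, e.g.
    # dff_spacex_prompted_skill_formatter.
    formatter.__name__ = f"{skill_name}_formatter"
    return formatter
```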

assistant_dists/dream_custom_prompted/pipeline_conf.json

  1. For each of your Generative Skills, copy & paste the section that describes the original DFF Dream Persona Prompted Skill:
             "dff_dream_persona_prompted_skill": {
                 "connector": {
                     "protocol": "http",
                     "timeout": 4.5,
                     "url": "http://dff-dream-persona-prompted-skill:8134/respond"
                 },
                 "dialog_formatter": "state_formatters.dp_formatters:dff_dream_persona_prompted_skill_formatter",
                 "response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service",
                 "previous_services": [
                     "skill_selectors"
                 ],
                 "state_manager_method": "add_hypothesis"
             },
  2. Replace the strings dream-persona with <prompt-name> (container names use hyphens) and dream_persona with <prompt_name> (component names use underscores). This changes the container name, skill name, and formatter name.
  3. Replace the port (8134 in the example) with the one assigned in dream/assistant_dists/dream_custom_prompted/docker-compose.override.yml.
  4. If you do not want to keep the DFF Dream Persona Prompted Skill in your distribution, remove all mentions of the DFF Dream Persona Prompted Skill container from the yml configs and the pipeline_conf.json file.
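For example, after steps 2 and 3, a new entry for the spacex prompt might look like this sketch (port 8135 is an assumption; use the port assigned in your docker-compose.override.yml):

```json
"dff_spacex_prompted_skill": {
    "connector": {
        "protocol": "http",
        "timeout": 4.5,
        "url": "http://dff-spacex-prompted-skill:8135/respond"
    },
    "dialog_formatter": "state_formatters.dp_formatters:dff_spacex_prompted_skill_formatter",
    "response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service",
    "previous_services": [
        "skill_selectors"
    ],
    "state_manager_method": "add_hypothesis"
},
```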

Note: Please take into account that naming skills using <prompt_name> according to the instructions above is essential: it allows the Skill Selector to automatically turn on the prompt-based skills whose prompts are among the N_SENTENCES_TO_RETURN most relevant prompts.

How to run your custom Generative AI Assistant Distribution

Note: In this example, we use the spacex_ga_assistant name:

docker-compose -f docker-compose.yml -f assistant_dists/spacex_ga_assistant/docker-compose.override.yml -f assistant_dists/spacex_ga_assistant/dev.yml -f assistant_dists/spacex_ga_assistant/proxy.yml up --build

Let's chat! In a separate terminal tab run:

docker-compose exec agent python -m deeppavlov_agent.run agent.channel=cmd agent.pipeline_config=assistant_dists/spacex_ga_assistant/pipeline_conf.json

Note: Make sure to replace spacex_ga_assistant with the name of your Generative AI Assistant distribution.
