Merge pull request #227 from nulib/deploy/staging

Deploy v2.4.1 to Production
nulib · Jun 27, 2024 · 4a78023
2 parents 984643e + 24161e1
commit 4a78023

Showing 25 changed files with 258 additions and 173 deletions.
diff --git a/.github/workflows/test-python.yml b/.github/workflows/test-python.yml
@@ -21,7 +21,7 @@ jobs:
         with:
           python-version: '3.9'
           cache-dependency-path: chat/src/requirements.txt
-      - run: pip install -r requirements.txt
+      - run: pip install -r requirements.txt && pip install -r requirements-dev.txt
         working-directory: ./chat/src
       - name: Check code style
         run: ruff check .

diff --git a/Makefile b/Makefile
@@ -28,8 +28,10 @@ help:
 	echo "make cover-python    | run python tests with coverage"
 .aws-sam/build.toml: ./template.yaml node/package-lock.json node/src/package-lock.json chat/dependencies/requirements.txt chat/src/requirements.txt
 	sed -Ei.orig 's/^(\s+)#\*\s/\1/' template.yaml
+	sed -Ei.orig 's/^(\s+)#\*\s/\1/' chat/template.yaml
 	sam build --cached --parallel
 	mv template.yaml.orig template.yaml
+	mv chat/template.yaml.orig chat/template.yaml
 deps-node:
 	cd node/src ;\
 	npm list >/dev/null 2>&1 ;\
@@ -48,7 +50,7 @@ style-node: deps-node
 test-node: deps-node
 	cd node && npm run test
 deps-python:
-	cd chat/src && pip install -r requirements.txt
+	cd chat/src && pip install -r requirements.txt && pip install -r requirements-dev.txt
 cover-python: deps-python
 	cd chat && export SKIP_WEAVIATE_SETUP=True && coverage run --source=src -m unittest -v && coverage report --skip-empty
 cover-html-python: deps-python

diff --git a/README.md b/README.md
@@ -1,27 +1,10 @@
 # dc-api-v2
 
-![Build Status](https://github.com/nulib/dc-api-v2/actions/workflows/build.yml/badge.svg)
+[![Main API Build Status](https://github.com/nulib/dc-api-v2/actions/workflows/test-node.yml/badge.svg)](https://github.com/nulib/dc-api-v2/actions/workflows/test-node.yml) [![Chat API Build Status](https://github.com/nulib/dc-api-v2/actions/workflows/test-python.yml/badge.svg)](https://github.com/nulib/dc-api-v2/actions/workflows/test-python.yml)
 
-## Directory structure
+## Chat Websocket API development
 
-```
-.
-├── dev/ - example configs for developers
-├── docs/ - mkdocs-based API documentation
-├── events/ - sample HTTP API Lambda events
-├── lambdas/ - deployable functions not directly part of the API
-├── src/
-│   └── api/ - code that directly supports API requests
-│       ├── request/ - code to wrap/transform/modify incoming queries
-│       ├── response/ - code to transform OpenSearch responses into the proper result format
-│       │   ├── iiif/ - iiif formatted response transformers
-│       │   ├── opensearch/ - opensearch formatted response transformers
-│       │   └── oai-pmh/ - oai-pmh formatted response transformers
-│       ├── aws/ - lower-level code to interact with AWS resources and OpenSearch
-│       └── handlers/ - minimal code required to bridge between API Gateway request and core logic
-├── state_machines/ - AWS Step Function definitions
-└── test/ - tests and test helpers
-```
+See the [chat API's README](chat/README.md).
 
 ## Local development setup
 

diff --git a/chat/README.md b/chat/README.md
@@ -0,0 +1,60 @@
+# dc-api-v2 chatbot
+
+[![Build Status](https://github.com/nulib/dc-api-v2/actions/workflows/test-python.yml/badge.svg)](https://github.com/nulib/dc-api-v2/actions/workflows/test-python.yml)
+
+## Local development setup
+
+##### ⚠️ *All commands and instructions in this file assume that the current working directory is the `/chat` subdirectory of the `dc-api-v2` project.*
+
+### Link `samconfig.yaml`
+
+This only needs to be done once.
+
+1. Pull the `miscellany` repo.
+2. Link the development `samconfig.yaml` file
+   ```bash
+   ln -s /path/to/miscellany/dc-api-v2/chat/samconfig.yaml .
+   ```
+
+### Deploy a development stack
+
+1. [Log into AWS](http://docs.rdc.library.northwestern.edu/2._Developer_Guides/Environment_and_Tools/AWS-Authentication/) using your `staging-admin` profile.
+2. Pick a unique stack name, e.g., `dc-api-chat-[YOUR_INITIALS]`
+3. Create or synchronize the development stack
+   ```bash
+   sam sync --watch --config-env dev --stack-name [STACK_NAME]
+   ```
+
+The first time the `sam sync` command is run, it will build the development stack. This takes longer than it will on subsequent runs.
+
+While the `sam sync` remains open, it will keep the development stack synchronized with any code changes you make. Each time you change a file, you'll need to wait for the output of that command to indicate that resource syncing is finished.
+
+The first time the stack is created, it will show you the stack's outputs, including the websocket URL to use for interacting with the chat backend, e.g.:
+```
+-------------------------------------------------------------------------------------------------
+CloudFormation outputs from deployed stack
+-------------------------------------------------------------------------------------------------
+Outputs                                                                                         
+-------------------------------------------------------------------------------------------------
+Key                 WebSocketURI                                                                
+Description         The WSS Protocol URI to connect to                                          
+Value               wss://nmom3hnp3c.execute-api.us-east-1.amazonaws.com/latest                 
+-------------------------------------------------------------------------------------------------
+```
+
+On subsequent sync runs, the outputs will not be displayed. If you need to retrieve the value again, you can run
+```bash
+sam list stack-outputs --stack-name [STACK_NAME]
+```
+
+To stop synchronizing changes, simply terminate the `sam sync` process with `Ctrl+C`.
+
+### Tear down the development stack
+
+The development stack will remain up and active even after `sam sync` exits; it will simply not actively synchronize changes any more. To tear it down completely, you have to delete it yourself.
+
+1. [Log into AWS](http://docs.rdc.library.northwestern.edu/2._Developer_Guides/Environment_and_Tools/AWS-Authentication/) using your `staging-admin` profile.
+2. Delete the development stack
+   ```bash
+   sam delete --stack-name [STACK_NAME]
+   ```
diff --git a/chat/dependencies/requirements.txt b/chat/dependencies/requirements.txt
@@ -1,12 +1,13 @@
-boto3~=1.34.13
+boto3~=1.34
 honeybadger
-langchain
-langchain-community
-openai~=0.27.8
+langchain~=0.2
+langchain-aws~=0.1
+langchain-openai~=0.1
+openai~=1.35
 opensearch-py
 pyjwt~=2.6.0
 python-dotenv~=1.0.0
 requests 
 requests-aws4auth
-tiktoken~=0.4.0
-wheel~=0.40.0
+tiktoken~=0.7
+wheel~=0.40
diff --git a/chat/src/event_config.py b/chat/src/event_config.py
@@ -2,8 +2,8 @@
 import json
 
 from dataclasses import dataclass, field
-from langchain.chains.qa_with_sources import load_qa_with_sources_chain
-from langchain.prompts import PromptTemplate
+
+from langchain_core.prompts import ChatPromptTemplate
 from setup import (
     opensearch_client,
     opensearch_vector_store,
@@ -19,6 +19,7 @@
 DOCUMENT_VARIABLE_NAME = "context"
 K_VALUE = 5
 MAX_K = 100
+MAX_TOKENS = 1000
 TEMPERATURE = 0.2
 TEXT_KEY = "id"
 VERSION = "2024-02-01"
@@ -42,19 +43,21 @@ class EventConfig:
     azure_resource_name: str = field(init=False)
     debug_mode: bool = field(init=False)
     deployment_name: str = field(init=False)
-    document_prompt: PromptTemplate = field(init=False)
+    document_prompt: ChatPromptTemplate = field(init=False)
     event: dict = field(default_factory=dict)
     is_logged_in: bool = field(init=False)
     k: int = field(init=False)
+    max_tokens: int = field(init=False)
     openai_api_version: str = field(init=False)
     payload: dict = field(default_factory=dict)
     prompt_text: str = field(init=False)
-    prompt: PromptTemplate = field(init=False)
+    prompt: ChatPromptTemplate = field(init=False)
     question: str = field(init=False)
     ref: str = field(init=False)
     request_context: dict = field(init=False)
     temperature: float = field(init=False)
     socket: Websocket = field(init=False, default=None)
+    stream_response: bool = field(init=False)
     text_key: str = field(init=False)
 
     def __post_init__(self):
@@ -67,17 +70,17 @@ def __post_init__(self):
         self.deployment_name = self._get_deployment_name()
         self.is_logged_in = self.api_token.is_logged_in()
         self.k = self._get_k()
+        self.max_tokens = min(self.payload.get("max_tokens", MAX_TOKENS), MAX_TOKENS)
         self.openai_api_version = self._get_openai_api_version()
         self.prompt_text = self._get_prompt_text()
         self.request_context = self.event.get("requestContext", {})
         self.question = self.payload.get("question")
         self.ref = self.payload.get("ref")
+        self.stream_response = self.payload.get("stream_response", not self.debug_mode)
         self.temperature = self._get_temperature()
         self.text_key = self._get_text_key()
         self.document_prompt = self._get_document_prompt()
-        self.prompt = PromptTemplate(
-            template=self.prompt_text, input_variables=["question", "context"]
-        )
+        self.prompt = ChatPromptTemplate.from_template(self.prompt_text)
 
     def _get_payload_value_with_superuser_check(self, key, default):
         if self.api_token.is_superuser():
@@ -134,10 +137,7 @@ def _get_text_key(self):
         return self._get_payload_value_with_superuser_check("text_key", TEXT_KEY)
 
     def _get_document_prompt(self):
-        return PromptTemplate(
-            template=document_template(self.attributes),
-            input_variables=["title", "id"] + self.attributes,
-        )
+        return ChatPromptTemplate.from_template(document_template(self.attributes))
 
     def debug_message(self):
         return {
@@ -170,28 +170,18 @@ def setup_websocket(self, socket=None):
     def setup_llm_request(self):
         self._setup_vector_store()
         self._setup_chat_client()
-        self._setup_chain()
 
     def _setup_vector_store(self):
         self.opensearch = opensearch_vector_store()
 
     def _setup_chat_client(self):
         self.client = openai_chat_client(
-            deployment_name=self.deployment_name,
-            openai_api_base=self.azure_endpoint,
+            azure_deployment=self.deployment_name,
+            azure_endpoint=self.azure_endpoint,
             openai_api_version=self.openai_api_version,
-            callbacks=[StreamingSocketCallbackHandler(self.socket, self.debug_mode)],
+            callbacks=[StreamingSocketCallbackHandler(self.socket, stream=self.stream_response)],
             streaming=True,
-        )
-
-    def _setup_chain(self):
-        self.chain = load_qa_with_sources_chain(
-            self.client,
-            chain_type=CHAIN_TYPE,
-            prompt=self.prompt,
-            document_prompt=self.document_prompt,
-            document_variable_name=DOCUMENT_VARIABLE_NAME,
-            verbose=self._to_bool(os.getenv("VERBOSE")),
+            max_tokens=self.max_tokens
         )
 
     def _is_debug_mode_enabled(self):
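
Review note on the two new `EventConfig` fields: the sketch below isolates how `max_tokens` and `stream_response` are derived from the payload, following the expressions added in `__post_init__` above. The helper names `resolve_max_tokens` and `resolve_stream_response` are hypothetical and exist only for illustration; the real code assigns the values inline.

```python
MAX_TOKENS = 1000  # ceiling taken from the constant added in this diff


def resolve_max_tokens(payload: dict) -> int:
    """Return the requested max_tokens, capped at MAX_TOKENS (hypothetical helper)."""
    return min(payload.get("max_tokens", MAX_TOKENS), MAX_TOKENS)


def resolve_stream_response(payload: dict, debug_mode: bool) -> bool:
    """Stream unless debug mode is on; an explicit payload value wins (hypothetical helper)."""
    return payload.get("stream_response", not debug_mode)


if __name__ == "__main__":
    print(resolve_max_tokens({}))                      # default applies
    print(resolve_max_tokens({"max_tokens": 250}))     # smaller request is honored
    print(resolve_max_tokens({"max_tokens": 5000}))    # larger request is capped
    print(resolve_stream_response({}, debug_mode=True))
    print(resolve_stream_response({"stream_response": True}, debug_mode=True))
```

Note that `min(...)` enforces only an upper bound: a zero or negative `max_tokens` in the payload would pass through unchanged, and a non-numeric value would raise a `TypeError`.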

diff --git a/chat/src/handlers/chat.py b/chat/src/handlers/chat.py
@@ -4,7 +4,7 @@
 import os
 from datetime import datetime
 from event_config import EventConfig
-from helpers.response import prepare_response
+from helpers.response import Response
 from honeybadger import honeybadger
 
 honeybadger.configure()
@@ -35,7 +35,8 @@ def handler(event, context):
 
     if not os.getenv("SKIP_WEAVIATE_SETUP"):
         config.setup_llm_request()
-        final_response = prepare_response(config)
+        response = Response(config)
+        final_response = response.prepare_response()
         config.socket.send(reshape_response(final_response, 'debug' if config.debug_mode else 'base'))
 
     log_group = os.getenv('METRICS_LOG_GROUP')

diff --git a/chat/src/handlers/streaming_socket_callback_handler.py b/chat/src/handlers/streaming_socket_callback_handler.py
@@ -1,11 +1,22 @@
 from langchain.callbacks.base import BaseCallbackHandler
 from websocket import Websocket
+from typing import Any
+from langchain_core.outputs.llm_result import LLMResult
 
 class StreamingSocketCallbackHandler(BaseCallbackHandler):
-    def __init__(self, socket: Websocket, debug_mode: bool):
+    def __init__(self, socket: Websocket, stream: bool = True):
         self.socket = socket
-        self.debug_mode = debug_mode
+        self.stream = stream
 
     def on_llm_new_token(self, token: str, **kwargs):
-        if self.socket and not self.debug_mode:
+        if len(token) > 0 and self.socket and self.stream:
             return self.socket.send({"token": token})
+
+    def on_llm_end(self, response: LLMResult, **kwargs: Any):
+        try:
+            finish_reason = response.generations[0][0].generation_info["finish_reason"]
+            if self.socket:
+                return self.socket.send({"end": {"reason": finish_reason}})
+        except Exception as err:
+            finish_reason = f'Unknown ({str(err)})'
+        print(f"Stream ended: {finish_reason}")
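
Review note on the handler change: the sketch below reproduces the two callbacks without LangChain, so the message sequence a client would see can be checked in isolation. `FakeSocket`, `SketchHandler`, and `fake_result` are hypothetical stand-ins and are not part of the project; the real handler subclasses `BaseCallbackHandler` and receives an `LLMResult`.

```python
from types import SimpleNamespace


class FakeSocket:
    """Hypothetical stand-in for the project's Websocket; records sent messages."""

    def __init__(self):
        self.sent = []

    def send(self, message):
        self.sent.append(message)
        return message


class SketchHandler:
    """Mirrors the callbacks in this diff, minus the LangChain base class."""

    def __init__(self, socket, stream=True):
        self.socket = socket
        self.stream = stream

    def on_llm_new_token(self, token, **kwargs):
        # Empty tokens are skipped; nothing is sent when streaming is off.
        if len(token) > 0 and self.socket and self.stream:
            return self.socket.send({"token": token})

    def on_llm_end(self, response, **kwargs):
        try:
            finish_reason = response.generations[0][0].generation_info["finish_reason"]
            if self.socket:
                return self.socket.send({"end": {"reason": finish_reason}})
        except Exception as err:
            finish_reason = f"Unknown ({err})"
        # Reached only when no socket is set or the finish reason is unavailable.
        print(f"Stream ended: {finish_reason}")


def fake_result(reason):
    """Build an object shaped like the part of LLMResult the handler reads."""
    generation = SimpleNamespace(generation_info={"finish_reason": reason})
    return SimpleNamespace(generations=[[generation]])


if __name__ == "__main__":
    socket = FakeSocket()
    handler = SketchHandler(socket, stream=True)
    for token in ["", "Hel", "lo"]:
        handler.on_llm_new_token(token)
    handler.on_llm_end(fake_result("length"))
    print(socket.sent)
```

As written, the `end` message is sent even when `stream` is `False`, so a non-streaming client still learns why generation stopped. With the new `max_tokens` cap, a finish reason of `length` would presumably indicate a truncated answer.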
diff --git a/chat/src/helpers/metrics.py b/chat/src/helpers/metrics.py
@@ -4,7 +4,7 @@
 def token_usage(config, response, original_question):
     data = {
         "question": count_tokens(config.question),
-        "answer": count_tokens(response["output_text"]),
+        "answer": count_tokens(response),
         "prompt": count_tokens(config.prompt_text),
         "source_documents": count_tokens(original_question["source_documents"]),
     }

diff --git a/chat/src/helpers/prompts.py b/chat/src/helpers/prompts.py
@@ -2,16 +2,15 @@
 
 
 def prompt_template() -> str:
-    return """Please provide an answer to the question based on the documents provided. Include specific details from the documents that support your answer. Each document is identified by a 'title' and a unique 'source' UUID:
+    return """Please provide a brief answer to the question based on the documents provided. Include specific details from the documents that support your answer. Keep your answer concise. Each document is identified by a 'title' and a unique 'source' UUID:
 
-Documents:
-{context}
-Answer in raw markdown. When referencing a document by title, link to it using its UUID like this: [title](https://dc.library.northwestern.edu/items/UUID). For example: [Judy Collins, Jackson Hole Folk Festival](https://dc.library.northwestern.edu/items/f1ca513b-7d13-4af6-ad7b-8c7ffd1d3a37). Suggest keyword searches using this format: [keyword](https://dc.library.northwestern.edu/search?q=keyword). Offer a variety of search terms that cover different aspects of the topic. Include as many direct links to Digital Collections searches as necessary for a thorough study. The `collection` field contains information about the collection the document belongs to. In the summary, mention the top 1 or 2 collections, explain why they are relevant and link to them using the collection title and id: [collection['title']](https://dc.library.northwestern.edu/collections/collection['id']), for example [World War II Poster Collection](https://dc.library.northwestern.edu/collections/faf4f60e-78e0-4fbf-96ce-4ca8b4df597a):
-
-Question:
-{question}
-"""
+    Documents:
+    {context}
+    Answer in raw markdown. When referencing a document by title, link to it using its UUID like this: [title](https://dc.library.northwestern.edu/items/UUID). For example: [Judy Collins, Jackson Hole Folk Festival](https://dc.library.northwestern.edu/items/f1ca513b-7d13-4af6-ad7b-8c7ffd1d3a37). Suggest keyword searches using this format: [keyword](https://dc.library.northwestern.edu/search?q=keyword). Offer a variety of search terms that cover different aspects of the topic. Include as many direct links to Digital Collections searches as necessary for a thorough study. The `collection` field contains information about the collection the document belongs to. In the summary, mention the top 1 or 2 collections, explain why they are relevant and link to them using the collection title and id: [collection['title']](https://dc.library.northwestern.edu/collections/collection['id']), for example [World War II Poster Collection](https://dc.library.northwestern.edu/collections/faf4f60e-78e0-4fbf-96ce-4ca8b4df597a):
 
+    Question:
+    {question}
+    """
 
 def document_template(attributes: Optional[List[str]] = None) -> str:
     if attributes is None: