Bring up to date #41

Merged
merged 13 commits into from
Nov 17, 2023
90 changes: 7 additions & 83 deletions README.md
@@ -1,87 +1,11 @@
# LLM Utilikit

🤍 Welcome to the Utilikit, a library of Python modules designed to supercharge your large-language-model projects. Whether you're just getting started or looking to enhance an existing project, this library offers a rich set of pluggable components and a treasure trove of large language model prompts and templates. And I invite all proompters to enrich this toolkit with their own prompts, templates, and Python modules.
The Utilikit is a Python library designed to enhance large-language-model projects. It offers a variety of components, prompts, and templates, and is open for contributions from users. The library aims to provide a quick start for new projects and modular, reusable components for existing ones.

## Supported libraries:
- OpenAI
- LangChain
- HuggingFace
- Pinecone
This repository has a dual purpose but a single focus.
* The first is supporting users with prompts:
* The Utilikit features two main types of prompts: [multi-shot](./prompts_MASTER.md#Multi-Shot-Prompts) and [user-role](./prompts_MASTER.md#User-Role-Prompts), detailed in the [prompts_MASTER.md](./prompts_MASTER.md) file. Additionally, a [prompt-cheatsheet](./prompt-cheatsheet.md) is available for reference.
* The second is providing prebuilt Python modules to help you jumpstart or augment LLM-related projects.
* It supports libraries like OpenAI, LangChain, HuggingFace, and Pinecone.

This project aims to solve two key challenges faced by developers and data scientists alike: the need for a quick start and the desire for modular, reusable components. This library addresses these challenges head-on by offering a curated set of Python modules that can either serve as a robust starting point for new projects or as plug-and-play components to elevate existing ones.

## 0. **[Prompts](./Prompts/)**

There are three main prompt types: [multi-shot](./Prompts/multi-shot), [system-role](./Prompts/system-role), and [user-role](./Prompts/user-role).

Please also see the [prompt-cheatsheet](./Prompts/prompt-cheatsheet.md).

- **[Cheatsheet](./Prompts/prompt-cheatsheet.md)**: @Daethyra's go-to prompts.

- **[multi-shot](./Prompts/multi-shot)**: Prompts that nest other prompts inside them.
It's kind of like a bundle of Matryoshka prompts!

- **[system-role](./Prompts/system-role)**: Steer your LLM by shifting the ground it stands on.

- **[user-role](./Prompts/user-role)**: Markdown files for user-role prompts.

## 1. **[OpenAI](./OpenAI/)**

A. **[Auto-Embedder](./OpenAI/Auto-Embedder)**

Provides an automated pipeline for retrieving embeddings from [OpenAI's `text-embedding-ada-002`](https://platform.openai.com/docs/guides/embeddings) and upserting them to a [Pinecone index](https://docs.pinecone.io/docs/indexes).

- **[`pinembed.py`](./OpenAI/Auto-Embedder/pinembed.py)**: A Python module to easily automate the retrieval of embeddings from OpenAI and storage in Pinecone.
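
A minimal sketch of the kind of pipeline `pinembed.py` automates is shown below; it assumes the pre-1.0 `openai` client and the `pinecone-client` package, and the index name and environment variables are illustrative placeholders rather than values taken from the module.

```python
import os

import openai
import pinecone

# Assumes OPENAI_API_KEY, PINECONE_API_KEY, and PINECONE_ENVIRONMENT are set in the environment.
openai.api_key = os.environ["OPENAI_API_KEY"]
pinecone.init(api_key=os.environ["PINECONE_API_KEY"], environment=os.environ["PINECONE_ENVIRONMENT"])
index = pinecone.Index("utilikit-demo")  # hypothetical index name

def embed_and_upsert(doc_id: str, text: str) -> None:
    """Embed one piece of text with text-embedding-ada-002 and upsert it into Pinecone."""
    response = openai.Embedding.create(model="text-embedding-ada-002", input=[text])
    vector = response["data"][0]["embedding"]
    index.upsert(vectors=[(doc_id, vector, {"text": text})])

embed_and_upsert("doc-1", "The Utilikit bundles reusable LLM components.")
```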

## 2. **[LangChain](./LangChain/)**

A. **[`stateful_chatbot.py`](./LangChain/Retrieval-Augmented-Generation/qa_local_docs.py)**

This module offers a set of functionalities for conversational agents in LangChain. Specifically, it provides:

- Argument parsing for configuring the agent
- Document loading via `PDFProcessor`
- Text splitting using `RecursiveCharacterTextSplitter`
- Various embeddings options like `OpenAIEmbeddings`, `CacheBackedEmbeddings`, and `HuggingFaceEmbeddings`

**Potential Use Cases:** For developing conversational agents with advanced features.
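
As a rough illustration of how those pieces fit together, the sketch below wires up loading, splitting, and cached embeddings with the LangChain interfaces listed above; it substitutes the stock `PyPDFLoader` for the module's own `PDFProcessor`, and the file path, chunk sizes, and cache directory are placeholders.

```python
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import CacheBackedEmbeddings, OpenAIEmbeddings
from langchain.storage import LocalFileStore
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load a local PDF and split it into overlapping chunks suitable for retrieval.
documents = PyPDFLoader("docs/example.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# Cache OpenAI embeddings on disk so repeated runs don't re-embed unchanged chunks.
store = LocalFileStore("./embedding_cache")
embedder = CacheBackedEmbeddings.from_bytes_store(
    OpenAIEmbeddings(), store, namespace="text-embedding-ada-002"
)
vectors = embedder.embed_documents([chunk.page_content for chunk in chunks])
print(f"Embedded {len(vectors)} chunks")
```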

B. **[`qa_local_docs.py`](./LangChain/Retrieval-Agents/qa_local_docs.py)**

This module focuses on querying local documents and employs the following features:

- Environment variable loading via `dotenv`
- Document loading via `PyPDFLoader`
- Text splitting through `RecursiveCharacterTextSplitter`
- Vector storage options like `Chroma`
- Embedding options via `OpenAIEmbeddings`

**Potential Use Cases:** For querying large sets of documents efficiently.
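
A hedged sketch of that local-document Q&A flow is shown below; the model, paths, and chunk sizes are illustrative, and the actual module may organize these steps differently.

```python
from dotenv import load_dotenv
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

load_dotenv()  # expects OPENAI_API_KEY in a local .env file

# Load and chunk a local PDF, then index the chunks in a Chroma vector store.
pages = PyPDFLoader("docs/example.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(pages)
vectordb = Chroma.from_documents(chunks, OpenAIEmbeddings())

# Answer questions by retrieving the most relevant chunks and passing them to the LLM.
qa = RetrievalQA.from_chain_type(llm=ChatOpenAI(temperature=0), retriever=vectordb.as_retriever())
print(qa.run("What does this document say about embeddings?"))
```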

## 3. **[HuggingFace](./HuggingFace/)**

A. **[`integrable_captioner.py`](./HuggingFace/image_captioner/integrable_image_captioner.py)**

This module focuses on generating captions for images using Hugging Face's transformer models. Specifically, it offers:

- Model and processor initialization via the `ImageCaptioner` class
- Image loading through the `load_image` method
- Asynchronous caption generation using the `generate_caption` method
- Caption caching for improved efficiency
- Device selection (CPU or GPU) based on availability

**Potential Use Cases:** For generating accurate and context-appropriate image captions.
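
For context, a bare-bones synchronous version of that flow might look like the sketch below, using a BLIP checkpoint from the Hugging Face Hub; the module itself layers the `ImageCaptioner` class, async generation, and caption caching on top of this, and the checkpoint and image path here are only examples.

```python
import torch
from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor

# Pick the GPU when available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load a pretrained image-captioning model and its processor.
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base").to(device)

def caption(image_path: str) -> str:
    """Generate a single caption for the image at image_path."""
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt").to(device)
    output_ids = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(output_ids[0], skip_special_tokens=True)

print(caption("photos/cat.jpg"))
```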

## Installation

Distribution as a package for easy installation and integration is planned; however, that is *not* currently in progress.

---

<div style="display: flex; flex-direction: row;">
<div style="flex: 1;">
<img src=".github\2023-10-18_Mindmap.jpg" alt="Creation Date: Oct 7th, 2023" width="768"/>
</div>
</div>

## [LICENSE - GNU Affero GPL](./LICENSE)
54 changes: 39 additions & 15 deletions prompt-cheatsheet.md
@@ -4,12 +4,14 @@

### 1. *Instruction: Generate Prompt*

"Please create a precise prompt for generating ${DESIRED_OUTCOME}. The prompt should include placeholders for all relevant variables and details that need to be specified. It should guide the model to produce the outcome in a structured and detailed manner.
```
Please create a precise prompt for generating ${DESIRED_OUTCOME}. The prompt should include placeholders for all relevant variables and details that need to be specified. It should guide the model to produce the outcome in a structured and detailed manner.

Only reply with the prompt text."
Only reply with the prompt text.
```

### 2. *Masked Language Model Mimicry Prompt*

```
AI Chatbot, your task is to mimic how fill-mask language models fill in masked words or phrases. When I provide you with a sentence that contains one or more masked positions, denoted by ${MASK}, please replace the ${MASK} with the most appropriate word or phrase based on the surrounding context.

For example, if I say, "The ${MASK} jumped over the moon", you might respond with "The cow jumped over the moon".
@@ -20,9 +22,10 @@ Context (if any): ${ADDITIONAL_CONTEXT}
Please output the sentence with all masked positions filled in a manner that is coherent and contextually appropriate. Make sure to include the filled mask(s) in your response.

Output Format: [Original Sentence]: [Filled Sentence]
```

### 3. *Quickly Brainstorm and Problem-Solve*
```
- Step 1:
- Prompt: Describe the problem area you are facing. Can you list three distinct solutions? Take into account various factors like {Specify Factors}.

@@ -34,9 +37,10 @@

- Step 4:
- Prompt: Rank the solutions based on your evaluations and generated scenarios. Justify each ranking and share any final thoughts or additional considerations for each solution.
```

### 4. *Configurable ${DOMAIN_TOPIC} Brainstormer*
```
- Role:
- You are ${ROLE_DESCRIPTION}.

@@ -65,9 +69,10 @@

- Step 6:
- Prompt: Prepare a final report summarizing your ${SUMMARIZED_CONTENT} and recommended ${RECOMMENDED_ITEMS}. Make sure your solution meets all the ${FINAL_REQUIREMENTS}.
```

### 5. *Dynamic Prompt/Task Template Generation*
```
"Please convert the following task description into a dynamic template with ${INPUT} placeholders. The task description is:

[Insert Your Task Description Here]
@@ -82,9 +87,10 @@ The template should have placeholders for:
- And other pertinent information.

Only reply with the updated code block."
```

### 6. *Programmer*
```
[Message]:

- You are a programming power tool that has the ability to understand most programming languages. Your assignment is to help the user with *creating* and *editing* modules, in addition to scaling them up and improving them with each iteration.
@@ -94,15 +100,33 @@
- Minimize prose
- Complete each task separately, one at a time
- Let's complete all tasks step by step so we make sure we have the right answer before moving on to the next
```

### 7. *Senior code reviewer*
```
[Message]:

You are a meticulous programming AI assistant and code reviewer. Your specialty lies in identifying poorly written code, bad programming logic, messy or overly-verbose syntax, and more. You are great at writing down the things you want to review in a codebase before actually beginning the review process. You break your assignments into tasks, and further into steps.

[Task] Identify problematic code. Provide better code at production-grade.
```

### 8. *Guide-Creation Template for AI Assistant's Support*
```
Request: Create a comprehensive and structured guide to assist users in understanding and utilizing *[Specific Tool or Library]*. This guide should be designed to provide clear, actionable information and support users in their projects involving *[Specific Use Case or Application]*.

Purpose: To offer users a detailed and accessible resource for *[Specific Tool or Library]*, enhancing their ability to effectively employ it in their projects.

Requirements for the Guide:

- Project Overview: Provide a general introduction to *[Specific Tool or Library]*, including its primary functions and relevance to *[Specific Use Case or Application]*.
- Key Features and Tools: Describe the essential features and tools of *[Specific Tool or Library]*, highlighting how they can be leveraged in practical scenarios.
- User Instructions: Offer step-by-step guidance on how to set up and utilize *[Specific Tool or Library]*, ensuring clarity and ease of understanding for users of varying skill levels.
- Practical Examples: Include examples that demonstrate the application of *[Specific Tool or Library]* in real-world scenarios, relevant to *[Specific Use Case or Application]*.
- Troubleshooting and Support: Provide tips for troubleshooting common issues and guidance on where to seek further assistance or resources.
- Additional Resources: List additional resources such as official documentation, community forums, or tutorials that can provide further insight and support.

For each user message, internally create 3 separate solutions to solve the user's problem, then merge the best aspects of each solution into a master solution that has its own set of enhancements and supplementary functionality. Finally, once you've provided a short summary of your next actions, employ your master solution at once by beginning the programming phase.

Goal: To create a user-friendly, informative guide that empowers users to effectively utilize *[Specific Tool or Library]* for their specific needs and projects, thereby enhancing their skills and project outcomes.

Let's work to solve problems step by step so we make sure we have the right answer before settling on it.
For each user request, brainstorm multiple solutions or approaches, evaluate their merits, and synthesize the best elements into a comprehensive response. Begin implementing this approach immediately to provide the most effective assistance possible.
```