Merge branch 'main' into dspy-snowflake

stanfordnlp · May 6, 2024 · e6afb61
2 parents 4f10658 + b05394b
commit e6afb61

Showing 55 changed files with 3,675 additions and 749 deletions.
diff --git a/.gitignore b/.gitignore
@@ -47,3 +47,5 @@ assertion.log
 *.log
 *.db
 /.devcontainer/.personalization.sh
+
+.mypy_cache
diff --git a/README.md b/README.md
@@ -105,6 +105,7 @@ The DSPy documentation is divided into **tutorials** (step-by-step illustration
 - Interviews: [Weaviate Podcast in-person](https://www.youtube.com/watch?v=CDung1LnLbY), and you can find 6-7 other remote podcasts on YouTube from a few different perspectives/audiences.
 - **Tracing in DSPy** with Arize Phoenix: [Tutorial for tracing your prompts and the steps of your DSPy programs](https://colab.research.google.com/github/Arize-ai/phoenix/blob/main/tutorials/tracing/dspy_tracing_tutorial.ipynb)
 - [DSPy: Not Your Average Prompt Engineering](https://jina.ai/news/dspy-not-your-average-prompt-engineering), why it's crucial for future prompt engineering, and yet why it is challenging for prompt engineers to learn.
+- **Tracing & Optimization Tracking in DSPy** with Parea AI: [Tutorial on tracing & evaluating a DSPy RAG program](https://docs.parea.ai/tutorials/dspy-rag-trace-evaluate/tutorial)
 
 ### B) Guides
 
@@ -136,24 +137,28 @@ You can find other examples tweeted by [@lateinteraction](https://twitter.com/la
 
 **Some other examples (not exhaustive, feel free to add more via PR):**
 
+
+- [DSPy Optimizers Benchmark on a bunch of different tasks, by Michael Ryan](https://github.com/stanfordnlp/dspy/tree/main/testing/tasks)
+- [Sophisticated Extreme Multi-Class Classification, IReRa, by Karel D’Oosterlinck](https://github.com/KarelDO/xmc.dspy)
+- [Haize Labs' Red Teaming with DSPy](https://blog.haizelabs.com/posts/dspy/), and see [their DSPy code](https://github.com/haizelabs/dspy-redteam)
 - Applying DSPy Assertions
   - [Long-form Answer Generation with Citations, by Arnav Singhvi](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/examples/longformqa/longformqa_assertions.ipynb)
   - [Generating Answer Choices for Quiz Questions, by Arnav Singhvi](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/examples/quiz/quiz_assertions.ipynb)
   - [Generating Tweets for QA, by Arnav Singhvi](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/examples/tweets/tweets_assertions.ipynb)
 - [Compiling LCEL runnables from LangChain in DSPy](https://github.com/stanfordnlp/dspy/blob/main/examples/tweets/compiling_langchain.ipynb)
 - [AI feedback, or writing LM-based metrics in DSPy](https://github.com/stanfordnlp/dspy/blob/main/examples/tweets/tweet_metric.py)
-- [DSPy Optimizers Benchmark on a bunch of different tasks, by Michael Ryan](https://github.com/stanfordnlp/dspy/tree/main/testing/tasks)
+- [DSPy Optimizers Benchmark on a bunch of different tasks, by Michael Ryan](https://github.com/stanfordnlp/dspy/tree/main/testing/README.md)
 - [Indian Languages NLI with gains due to compiling by Saiful Haq](https://github.com/saifulhaq95/DSPy-Indic/blob/main/indicxlni.ipynb)
-- [Sophisticated Extreme Multi-Class Classification, IReRa, by Karel D’Oosterlinck](https://github.com/KarelDO/xmc.dspy)
 - [DSPy on BIG-Bench Hard Example, by Chris Levy](https://drchrislevy.github.io/posts/dspy/dspy.html)
 - [Using Ollama with DSPy for Mistral (quantized) by @jrknox1977](https://gist.github.com/jrknox1977/78c17e492b5a75ee5bbaf9673aee4641)
-- [Using DSPy, "The Unreasonable Effectiveness of Eccentric Automatic Prompts" (paper) by VMware's Rick Battle & Teja Gollapudi, and interview at TheRegister](https://www.theregister.com/2024/02/22/prompt_engineering_ai_models/)
+- [Using DSPy, "The Unreasonable Effectiveness of Eccentric Automatic Prompts" (paper) by VMware's Rick Battle & Teja Gollapudi](https://arxiv.org/abs/2402.10949), and [interview at TheRegister](https://www.theregister.com/2024/02/22/prompt_engineering_ai_models/)
 - [Optimizing Performance of Open Source LM for Text-to-SQL using DSPy and vLLM, by Juan Ovalle](https://github.com/jjovalle99/DSPy-Text2SQL)
 - Typed DSPy (contributed by [@normal-computing](https://github.com/normal-computing))
   - [Using DSPy to train Gpt 3.5 on HumanEval by Thomas Ahle](https://github.com/stanfordnlp/dspy/blob/main/examples/functional/functional.ipynb)
   - [Building a chess playing agent using DSPy by Franck SN](https://medium.com/thoughts-on-machine-learning/building-a-chess-playing-agent-using-dspy-9b87c868f71e)
 
-TODO: Add links to the state-of-the-art results on Theory of Mind (ToM) by Plastic Labs, the results by Haize Labs for Red Teaming with DSPy, and the DSPy pipeline from Replit.
+
+TODO: Add links to the state-of-the-art results by the University of Toronto on Clinical NLP, on Theory of Mind (ToM) by Plastic Labs, and the DSPy pipeline from Replit.
 
 There are also recent cool examples at [Weaviate's DSPy cookbook](https://github.com/weaviate/recipes/tree/main/integrations/dspy) by Connor Shorten. [See tutorial on YouTube](https://www.youtube.com/watch?v=CEuUG4Umfxs).
 

diff --git a/docs/api/language_model_clients/AzureOpenAI.md b/docs/api/language_model_clients/AzureOpenAI.md
@@ -14,6 +14,8 @@ lm = dspy.AzureOpenAI(api_base='...', api_version='2023-12-01-preview', model='g
 
 The constructor initializes the base class `LM` and verifies the provided arguments like the `api_provider`, `api_key`, and `api_base` to set up OpenAI request retrieval through Azure. The `kwargs` attribute is initialized with default values for relevant text generation parameters needed for communicating with the GPT API, such as `temperature`, `max_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, and `n`.
 
+Azure also requires the deployment ID of the Azure deployment, passed via the `deployment_id` argument.
+
 ```python
 class AzureOpenAI(LM):
     def __init__(
@@ -53,4 +55,4 @@ After generation, the completions are post-processed based on the `model_type` p
 - `**kwargs`: Additional keyword arguments for completion request.
 
 **Returns:**
-- `List[Dict[str, Any]]`: List of completion choices.
+- `List[Dict[str, Any]]`: List of completion choices.
diff --git a/docs/api/language_model_clients/Watsonx.md b/docs/api/language_model_clients/Watsonx.md
@@ -0,0 +1,55 @@
+# Watsonx Usage Guide
+
+This guide provides instructions on how to use the `Watsonx` class to interact with the IBM Watsonx.ai API for text and code generation.
+
+## Requirements
+
+- Python 3.10 or higher.
+- The `ibm-watsonx-ai` package installed, which can be installed via pip.
+- An IBM Cloud account and a Watsonx configured project.
+
+## Installation
+
+Ensure you have installed the `ibm-watsonx-ai` package along with other necessary dependencies, for example with `pip install ibm-watsonx-ai`.
+
+## Configuration
+
+Before using the `Watsonx` class, you need to set up access to IBM Cloud:
+
+1. Create an IBM Cloud account.
+2. Enable a Watsonx service from the catalog.
+3. Create a new project and associate a Watson Machine Learning service instance.
+4. Create IAM authentication credentials and save them in a JSON file.
+
+## Usage
+
+Here's an example of how to instantiate the `Watsonx` class and send a generation request:
+
+```python
+import dspy
+
+''' Initialize the class with the model name and parameters for Watsonx.ai
+    You can choose between many different models:
+    * (Mistral) mistralai/mixtral-8x7b-instruct-v01
+    * (Meta) meta-llama/llama-3-70b-instruct
+    * (IBM) ibm/granite-13b-instruct-v2
+    * and many others.
+'''
+watsonx=dspy.Watsonx(
+    model='mistralai/mixtral-8x7b-instruct-v01',
+    credentials={
+        "apikey": "your-api-key",
+        "url": "https://us-south.ml.cloud.ibm.com"
+    },
+    project_id="your-watsonx-project-id",
+    max_new_tokens=500,
+    max_tokens=1000
+    )
+
+dspy.settings.configure(lm=watsonx)
+```
+
+## Customizing Requests
+
+You can customize requests by passing additional parameters such as `decoding_method`, `max_new_tokens`, `stop_sequences`, `repetition_penalty`, and others supported by the Watsonx.ai API. This allows you to control the behavior of the generation.
+Refer to the [`ibm-watsonx-ai` library](https://ibm.github.io/watsonx-ai-python-sdk/index.html) documentation for the full list.
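Configuration step 4 in `Watsonx.md` saves the IAM credentials to a JSON file, but the usage example hard-codes them. A minimal sketch of loading them from disk, assuming the file holds the same `apikey` and `url` keys as the usage example; the helper name `load_watsonx_credentials` is hypothetical and not part of DSPy:

```python
import json
from pathlib import Path


def load_watsonx_credentials(path):
    """Read IBM Cloud credentials from a JSON file and validate the keys.

    Assumption: the file contains the two keys used in the usage example,
    "apikey" and "url". Any other keys in the file are ignored.
    """
    credentials = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in ("apikey", "url") if key not in credentials]
    if missing:
        raise KeyError(f"missing credential keys: {missing}")
    return {"apikey": credentials["apikey"], "url": credentials["url"]}
```

The returned dict has the shape passed as `credentials=` in the usage example, which keeps the API key out of source control.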
diff --git a/docs/api/modules/ChainOfThought.md b/docs/api/modules/ChainOfThought.md
@@ -13,23 +13,20 @@ class ChainOfThought(Predict):
 
         self.activated = activated
 
-        signature = self.signature
-        *keys, last_key = signature.kwargs.keys()
-
-        DEFAULT_RATIONALE_TYPE = dsp.Type(prefix="Reasoning: Let's think step by step in order to",
-                                          desc="${produce the " + last_key + "}. We ...")
-
-        rationale_type = rationale_type or DEFAULT_RATIONALE_TYPE
-
-        extended_kwargs = {key: signature.kwargs[key] for key in keys}
-        extended_kwargs.update({'rationale': rationale_type, last_key: signature.kwargs[last_key]})
-
-        self.extended_signature = dsp.Template(signature.instructions, **extended_kwargs)
+        signature = ensure_signature(self.signature)
+        *_keys, last_key = signature.output_fields.keys()
+
+        rationale_type = rationale_type or dspy.OutputField(
+            prefix="Reasoning: Let's think step by step in order to",
+            desc="${produce the " + last_key + "}. We ...",
+        )
+
+        self.extended_signature = signature.prepend("rationale", rationale_type, type_=str)
 ```
 
 **Parameters:**
 - `signature` (_Any_): Signature of predictive model.
-- `rationale_type` (_dsp.Type_, _optional_): Rationale type for reasoning steps. Defaults to `None`.
+- `rationale_type` (_dspy.OutputField_, _optional_): Rationale type for reasoning steps. Defaults to `None`.
 - `activated` (_bool_, _optional_): Flag for activated chain of thought processing. Defaults to `True`.
 - `**config` (_dict_): Additional configuration parameters for model.
 
@@ -64,3 +61,15 @@ pred = generate_answer(question=question)
 print(f"Question: {question}")
 print(f"Predicted Answer: {pred.answer}")
 ```
+
+The following example shows how to specify a custom rationale. Here `answer` is the last output key; it may be different in your case.
+
+```python
+# Define a custom rationale
+rationale_type = dspy.OutputField(
+    prefix="Reasoning: Let's think step by step in order to",
+    desc="${produce the answer}. We ...",
+)
+# Pass the signature and custom rationale to the ChainOfThought module
+generate_answer = dspy.ChainOfThought(BasicQA, rationale_type=rationale_type)
+```
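The constructor in `ChainOfThought.md` derives the default rationale description from the last output field of the signature. A small sketch of that string construction in isolation, with a plain dict standing in for `signature.output_fields`; the helper name `default_rationale_desc` is hypothetical:

```python
def default_rationale_desc(output_field_names):
    """Build the default rationale description from the last output field."""
    # Same unpacking idiom as the constructor: discard all but the last key.
    *_keys, last_key = output_field_names
    return "${produce the " + last_key + "}. We ..."


# A plain dict stands in for `signature.output_fields` (an assumption for
# illustration; the real object maps field names to field definitions).
output_fields = {"answer": None}
print(default_rationale_desc(output_fields.keys()))  # ${produce the answer}. We ...
```

This is why the custom rationale example writes `${produce the answer}`: `answer` is the last output field of `BasicQA`.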
diff --git a/docs/api/modules/ChainOfThoughtWithHint.md b/docs/api/modules/ChainOfThoughtWithHint.md
@@ -8,32 +8,23 @@ The constructor initializes the `ChainOfThoughtWithHint` class and sets up its a
 class ChainOfThoughtWithHint(Predict):
     def __init__(self, signature, rationale_type=None, activated=True, **config):
         super().__init__(signature, **config)
-
         self.activated = activated
-
         signature = self.signature
-        *keys, last_key = signature.kwargs.keys()
-
-        DEFAULT_HINT_TYPE = dsp.Type(prefix="Hint:", desc="${hint}")
-
-        DEFAULT_RATIONALE_TYPE = dsp.Type(prefix="Reasoning: Let's think step by step in order to",
-                                          desc="${produce the " + last_key + "}. We ...")
 
-        rationale_type = rationale_type or DEFAULT_RATIONALE_TYPE
-
-        extended_kwargs1 = {key: signature.kwargs[key] for key in keys}
-        extended_kwargs1.update({'rationale': rationale_type, last_key: signature.kwargs[last_key]})
+        *keys, last_key = signature.fields.keys()
+        rationale_type = rationale_type or dspy.OutputField(
+            prefix="Reasoning: Let's think step by step in order to",
+            desc="${produce the " + last_key + "}. We ...",
+        )
+        self.extended_signature1 = self.signature.insert(-2, "rationale", rationale_type, type_=str)
 
-        extended_kwargs2 = {key: signature.kwargs[key] for key in keys}
-        extended_kwargs2.update({'hint': DEFAULT_HINT_TYPE, 'rationale': rationale_type, last_key: signature.kwargs[last_key]})
-
-        self.extended_signature1 = dsp.Template(signature.instructions, **extended_kwargs1)
-        self.extended_signature2 = dsp.Template(signature.instructions, **extended_kwargs2)
+        DEFAULT_HINT_TYPE = dspy.OutputField()
+        self.extended_signature2 = self.extended_signature1.insert(-2, "hint", DEFAULT_HINT_TYPE, type_=str)
 ```
 
 **Parameters:**
 - `signature` (_Any_): Signature of predictive model.
-- `rationale_type` (_dsp.Type_, _optional_): Rationale type for reasoning steps. Defaults to `None`.
+- `rationale_type` (_dspy.OutputField_, _optional_): Rationale type for reasoning steps. Defaults to `None`.
 - `activated` (_bool_, _optional_): Flag for activated chain of thought processing. Defaults to `True`.
 - `**config` (_dict_): Additional configuration parameters for model.
 

diff --git a/docs/docs/building-blocks/1-language_models.md b/docs/docs/building-blocks/1-language_models.md
@@ -10,7 +10,7 @@ Let's first make sure you can set up your language model. DSPy support clients f
 
 ## Setting up the LM client.
 
-You can just call the constructor that connects to the LM. Then, use `dspy.configure` to declare this as the dexfault LM.
+You can just call the constructor that connects to the LM. Then, use `dspy.configure` to declare this as the default LM.
 
 For example, to use OpenAI language models, you can do it as follows.
 

diff --git a/docs/docs/building-blocks/3-modules.md b/docs/docs/building-blocks/3-modules.md
@@ -6,7 +6,7 @@ sidebar_position: 3
 
 A **DSPy module** is a building block for programs that use LMs.
 
-- Each built-in module abstracts a **prompting technique** (like chain of thought or ReAct). Crucially, they are generalized to handle any [DSPy Signature].
+- Each built-in module abstracts a **prompting technique** (like chain of thought or ReAct). Crucially, they are generalized to handle any [DSPy Signature](https://dspy-docs.vercel.app/docs/building-blocks/signatures).
 
 - A DSPy module has **learnable parameters** (i.e., the little pieces comprising the prompt and the LM weights) and can be invoked (called) to process inputs and return outputs.
 
@@ -17,7 +17,7 @@ A **DSPy module** is a building block for programs that use LMs.
 
 Let's start with the most fundamental module, `dspy.Predict`. Internally, all other DSPy modules are just built using `dspy.Predict`.
 
-We'll assume you are already at least a little familiar with [DSPy signatures], which are declarative specs for defining the behavior of any module we use in DSPy.
+We'll assume you are already at least a little familiar with [DSPy signatures](https://dspy-docs.vercel.app/docs/building-blocks/signatures), which are declarative specs for defining the behavior of any module we use in DSPy.
 
 To use a module, we first **declare** it by giving it a signature. Then we **call** the module with the input arguments, and extract the output fields!
 

diff --git a/docs/docs/building-blocks/7-assertions.md b/docs/docs/building-blocks/7-assertions.md
@@ -30,7 +30,7 @@ Specifically, when a constraint is not met:
     - Past Output: your model's past output that did not pass the validation_fn
     - Instruction: your user-defined feedback message on what went wrong and what possibly to fix
 
-If the error continues past the `max_backtracking_attempts`, then `dspy.Assert` will halt the pipeline execution, altering you with an `dspy.AssertionError`. This ensures your program doesn't continue executing with “bad” LM behavior and immediately highlights sample failure outputs for user assessment. 
+If the error continues past the `max_backtracking_attempts`, then `dspy.Assert` will halt the pipeline execution, alerting you with a `dspy.AssertionError`. This ensures your program doesn't continue executing with “bad” LM behavior and immediately highlights sample failure outputs for user assessment.
 
 - **dspy.Suggest vs. dspy.Assert**: `dspy.Suggest` on the other hand offers a softer approach. It maintains the same retry backtracking as `dspy.Assert` but instead serves as a gentle nudger. If the model outputs cannot pass the model constraints after the `max_backtracking_attempts`, `dspy.Suggest` will log the persistent failure and continue execution of the program on the rest of the data. This ensures the LM pipeline works in a "best-effort" manner without halting execution. 
 

diff --git a/docs/docs/building-blocks/8-typed_predictors.md b/docs/docs/building-blocks/8-typed_predictors.md
@@ -67,8 +67,8 @@ prediction = predictor(input=doc_query_pair)
 Let's see the output and its type.
 
 ```python
-answer = prediction.answer
-confidence_score = prediction.confidence
+answer = prediction.output.answer
+confidence_score = prediction.output.confidence
 
 print(f"Prediction: {prediction}\n\n")
 print(f"Answer: {answer}, Answer Type: {type(answer)}")

diff --git a/docs/docs/cheatsheet.md b/docs/docs/cheatsheet.md
@@ -177,7 +177,7 @@ print(f"Question: {question}")
 print(f"Final Predicted Answer (after ReAct process): {result.answer}")
 ```
 
-### dspy.Retreive
+### dspy.Retrieve
 
 ```python
 colbertv2_wiki17_abstracts = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
@@ -233,7 +233,7 @@ class FactJudge(dspy.Signature):
     context = dspy.InputField(desc="Context for the prediction")
     question = dspy.InputField(desc="Question to be answered")
     answer = dspy.InputField(desc="Answer for the question")
-    factually_correct = dspy.OutputField(desc="Is the answer factually correct based on the context?", prefix="Facual[Yes/No]:")
+    factually_correct = dspy.OutputField(desc="Is the answer factually correct based on the context?", prefix="Factual[Yes/No]:")
 
 judge = dspy.ChainOfThought(FactJudge)
 
@@ -374,7 +374,7 @@ compiled_program_optimized_signature = copro_teleprompter.compile(your_dspy_prog
 ```python
 from dspy.teleprompt import MIPRO
 
-teleprompter = MIPRO(prompt_model=model_to_generate_prompts, task_model=model_that_solves_task, metric=your_defined_metric, n=num_new_prompts_generated, init_temperature=prompt_generation_temperature)
+teleprompter = MIPRO(prompt_model=model_to_generate_prompts, task_model=model_that_solves_task, metric=your_defined_metric, num_candidates=num_new_prompts_generated, init_temperature=prompt_generation_temperature)
 
 kwargs = dict(num_threads=NUM_THREADS, display_progress=True, display_table=0)
 

diff --git a/docs/docs/deep-dive/retrieval_models_clients/custom-rm-client.mdx b/docs/docs/deep-dive/retrieval_models_clients/custom-rm-client.mdx
@@ -61,7 +61,7 @@ def __call__(self, query:str, k:int) -> List[str]:
     params = {"query": query, "k": k}
     response = requests.get(self.url, params=params)
 
-    response = response.json()["retreived_passages"]    # List of top-k passages
+    response = response.json()["retrieved_passages"]    # List of top-k passages
     return response
 ```
 
@@ -238,4 +238,4 @@ If an `rm` is not initialized in `dsp.settings`, this would raise an error.
 
 ***
 
-<AuthorDetails name="Arnav Singhvi"/>
+<AuthorDetails name="Arnav Singhvi"/>
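The `__call__` method in `custom-rm-client.mdx` indexes the decoded JSON body by key, so the client and the server must agree on the key name. A minimal sketch of that extraction step as a standalone function, assuming the response body is a JSON object with a `retrieved_passages` list; the helper name `extract_passages` is hypothetical:

```python
from typing import Any, Dict, List


def extract_passages(payload: Dict[str, Any], k: int) -> List[str]:
    """Return at most k passages from a decoded JSON response body."""
    # Assumption: the server puts its results under "retrieved_passages".
    passages = payload.get("retrieved_passages")
    if passages is None:
        raise KeyError("response has no 'retrieved_passages' field")
    return list(passages)[:k]
```

Separating the parsing from the HTTP call makes it possible to test the key handling without a running retrieval server.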
diff --git a/docs/docs/faqs.md b/docs/docs/faqs.md
@@ -32,6 +32,21 @@ You can specify multiple output fields. For the short-form signature, you can li
 
 You can specify the generation of long responses as a `dspy.OutputField`. To ensure comprehensive checks of the content within the long-form generations, you can indicate the inclusion of citations per referenced context. Such constraints such as response length or citation inclusion can be stated through Signature descriptions, or concretely enforced through DSPy Assertions. Check out the [LongFormQA notebook](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/examples/longformqa/longformqa_assertions.ipynb) to learn more about **Generating long-form length responses to answer questions**.
 
+- **How can I ensure that DSPy doesn't strip new line characters from my inputs or outputs?**
+
+DSPy uses [Signatures](https://dspy-docs.vercel.app/docs/deep-dive/signature/understanding-signatures) to format prompts passed into LMs. In order to ensure that new line characters aren't stripped from longer inputs, you must specify `format=str` when creating a field.
+
+```python
+class UnstrippedSignature(dspy.Signature):
+    """Enter some information for the model here."""
+
+    title = dspy.InputField()
+    object = dspy.InputField(format=str)
+    result = dspy.OutputField(format=str)
+```
+
+`object` can now be a multi-line string without issue.
+
 - **How do I define my own metrics? Can metrics return a float?**
 
 You can define metrics as simply Python functions that process model generations and evaluate them based on user-defined requirements. Metrics can compare existent data (e.g. gold labels) to model predictions or they can be used to assess various components of an output using validation feedback from LMs (e.g. LLMs-as-Judges). Metrics can return `bool`, `int`, and `float` types scores. Check out the official [Metrics docs](https://dspy-docs.vercel.app/docs/building-blocks/metrics) to learn more about defining custom metrics and advanced evaluations using AI feedback and/or DSPy programs.
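
As a rough illustration of the metrics answer above, here are two plain Python functions, one returning a `bool` and one returning a `float`. The `(example, pred, trace=None)` argument order follows the convention used in the DSPy metrics docs and should be treated as an assumption here; `SimpleNamespace` objects stand in for a DSPy example and prediction, and both function names are hypothetical.

```python
from types import SimpleNamespace


def exact_match(example, pred, trace=None):
    """Boolean metric: does the predicted answer equal the gold answer?"""
    return example.answer.strip().lower() == pred.answer.strip().lower()


def token_overlap(example, pred, trace=None):
    """Float metric: fraction of gold tokens that appear in the prediction."""
    gold = set(example.answer.lower().split())
    predicted = set(pred.answer.lower().split())
    return len(gold & predicted) / len(gold) if gold else 0.0


# Stand-ins for a DSPy example (gold label) and a model prediction.
example = SimpleNamespace(answer="Paris France")
pred = SimpleNamespace(answer="paris")
print(exact_match(example, pred))    # False
print(token_overlap(example, pred))  # 0.5
```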