Update the documentation

dottxt-ai · Jan 26, 2024 · ead42b3
1 parent 6c93534

Showing 2 changed files with 80 additions and 160 deletions.
diff --git a/docs/reference/models/openai.md b/docs/reference/models/openai.md
@@ -27,49 +27,6 @@ model = models.openai("gpt-4", system_prompt="You are a useful assistant")
 
 This message will be used for every subsequent use of the model:
 
-## Usage
-
-### Call the model
-
-OpenAI models can be directly called with a prompt:
-
-```python
-from outlines import models
-
-model = models.openai("gpt-3.5-turbo")
-result = model("Say something", temperature=0, samples=2)
-```
-
-!!! warning
-
-    This syntax will soon be deprecated and one will be able to generate text with OpenAI models with the same syntax used to generate text with Open Source models.
-
-### Stop when a sequence is found
-
-The OpenAI API tends to be chatty and it can be useful to stop the generation once a given sequence has been found, instead of paying for the extra tokens and needing to post-process the output. For instance, if you only want to generate a single sentence:
-
-```python
-from outlines import models
-
-model = models.openai("gpt-4")
-response = model("Write a sentence", stop_at=['.'])
-```
-
-### Choose between multiple choices
-
-It can be difficult to deal with a classification problem with the OpenAI API. However well you prompt the model, chances are you are going to have to post-process the output anyway. Sometimes the model will even make up choices. Outlines allows you to *guarantee* that the output of the model will be within a set of choices:
-
-```python
-from outlines import models
-
-model = models.openai("gpt-3.5-turbo")
-result = model.generate_choice("Red or blue?", ["red", "blue"])
-```
-
-!!! warning
-
-    This syntax will soon be deprecated and one will be able to generate text with OpenAI models with the same syntax used to generate text with Open Source models.
-
 ## Monitoring API use
 
 It is important to be able to track your API usage when working with OpenAI's API. The number of prompt tokens and completion tokens is directly accessible via the model instance:
@@ -89,123 +46,6 @@ print(model.completion_tokens)
 These numbers are updated every time you call the model.
 
 
-## Vectorized calls
-
-A unique feature of Outlines is that calls to the OpenAI API are *vectorized* (in the [NumPy sense](https://numpy.org/doc/stable/reference/generated/numpy.vectorize.html) of the word). In plain English, this means that you can call an OpenAI model with an array of prompts of arbitrary shape and it will return an array of answers. All calls are executed concurrently, which means this takes roughly the same time as calling the model with a single prompt:
-
-```python
-from outlines import models
-from outlines import text
-
-@text.prompt
-def template(input_numbers):
-    """Use these numbers and basic arithmetic to get 24 as a result:
-
-    Input: {{ input_numbers }}
-    Steps: """
-
-prompts = [
-    template([1, 2, 3]),
-    template([5, 9, 7]),
-    template([10, 12])
-]
-
-model = models.openai("text-davinci-003")
-results = model(prompts)
-print(results.shape)
-# (3,)
-
-print(type(results))
-# <class 'numpy.ndarray'>
-
-print(results)
-# [
-#     "\n1. 1 + 2 x 3 = 7\n2. 7 + 3 x 4 = 19\n3. 19 + 5 = 24",
-#     "\n1. Add the three numbers together: 5 + 9 + 7 = 21\n2. Subtract 21 from 24: 24 - 21 = 3\n3. Multiply the remaining number by itself: 3 x 3 = 9\n4. Add the number with the multiplication result: 21 + 9 = 24",
-#    "\n\n1. Add the two numbers together: 10 + 12 = 22 \n2. Subtract one of the numbers: 22 - 10 = 12 \n3. Multiply the two numbers together: 12 x 12 = 144 \n4. Divide the first number by the result: 144 / 10 = 14.4 \n5. Add the initial two numbers together again: 14.4 + 12 = 26.4 \n6. Subtract 2: 26.4 - 2 = 24",
-# ]
-```
-
-Beware that in this case the output of the model is a NumPy array. So if you want to concatenate the prompt to the result you have to use `numpy.char.add`:
-
-```python
-import numpy as np
-
-new_prompts = np.char.add(prompts, results)
-print(new_prompts)
-
-# [
-#     "Use these numbers and basic arithmetic to get 24 as a result:\n\nInput: [1, 2, 3]\nSteps:\n1. 1 + 2 x 3 = 7\n2. 7 + 3 x 4 = 19\n3. 19 + 5 = 24",
-#     "Use these numbers and basic arithmetic to get 24 as a result:\n\nInput: [5, 9, 7]\nSteps:\n1. Add the three numbers together: 5 + 9 + 7 = 21\n2. Subtract 21 from 24: 24 - 21 = 3\n3. Multiply the remaining number by itself: 3 x 3 = 9\n4. Add the number with the multiplication result: 21 + 9 = 24",
-#    "'Use these numbers and basic arithmetic to get 24 as a result:\n\nInput: [10, 12]\nSteps:\n\n1. Add the two numbers together: 10 + 12 = 22 \n2. Subtract one of the numbers: 22 - 10 = 12 \n3. Multiply the two numbers together: 12 x 12 = 144 \n4. Divide the first number by the result: 144 / 10 = 14.4 \n5. Add the initial two numbers together again: 14.4 + 12 = 26.4 \n6. Subtract 2: 26.4 - 2 = 24",
-# ]
-```
-
-You can also ask for several samples for a single prompt:
-
-```python
-from outlines import models
-from outlines import text
-
-
-@text.prompt
-def template(input_numbers):
-    """Use these numbers and basic arithmetic to get 24 as a result:
-
-    Input: {{ input_numbers }}
-    Steps:"""
-
-
-model = models.openai("text-davinci-003")
-results = model(template([1, 2, 3]), samples=3, stop_at=["\n2"])
-print(results.shape)
-# (3,)
-
-print(results)
-# [
-#     ' \n1. Subtract 1 from 3',
-#     '\n1. Add the three numbers: 1 + 2 + 3 = 6',
-#     ' (1 + 3) x (2 + 2) = 24'
-# ]
-```
-
-Or ask for several samples for an array of prompts. In this case *the last dimension is the sample dimension*:
-
-```python
-from outlines import models
-from outlines import text
-
-
-@text.prompt
-def template(input_numbers):
-    """Use these numbers and basic arithmetic to get 24 as a result:
-
-    Input: {{ input_numbers }}
-    Steps:"""
-
-
-prompts = [template([1, 2, 3]), template([5, 9, 7]), template([10, 12])]
-
-model = models.openai("text-davinci-003")
-results = model(prompts, samples=2, stop_at=["\n2"])
-print(results.shape)
-# (3, 2)
-
-print(results)
-# [
-#     ['\n1. Add the numbers: 1 + 2 + 3 = 6', ' (3 * 2) - 1 = 5\n        5 * 4 = 20\n        20 + 4 = 24'],
-#     ['\n\n1. (5 + 9) x 7 =  56', '\n1. 5 x 9 = 45'],
-#     [' \n1. Add the two numbers together: 10 + 12 = 22', '\n1. Add 10 + 12']
-# ]
-```
-
-You may find this useful, e.g., to implement [Tree of Thoughts](https://arxiv.org/abs/2305.10601).
-
-!!! note
-
-    Outlines provides an `@outlines.vectorize` decorator that you can use on any `async` python function. This can be useful for instance when you call a remote API within your workflow.
-
-
 ## Advanced usage
 
 It is possible to specify the values for `seed`, `presence_penalty`, `frequency_penalty`, `top_p` by passing an instance of `OpenAIConfig` when initializing the model:

diff --git a/docs/reference/text.md b/docs/reference/text.md
@@ -1 +1,81 @@
 # Text generation
+
+Outlines provides a unified interface to generate text with many language models, API-based and local:
+
+```python
+from outlines import models, generate
+
+model = models.openai("gpt-4")
+generator = generate.text(model)
+answer = generator("What is 2+2?")
+
+model = models.transformers("mistralai/Mistral-7B-v0.1")
+generator = generate.text(model)
+answer = generator("What is 2+2?")
+```
+
+We generate text in two steps:
+
+1. Instantiate a generator with the model you want to use
+2. Call the generator with the prompt
+
+
+## Limit the number of tokens generated
+
+To limit the number of tokens generated, pass `max_tokens` to the generator, either as the second positional argument or as a keyword argument:
+
+```python
+from outlines import models, generate
+
+model = models.transformers("mistralai/Mistral-7B-v0.1")
+generator = generate.text(model)
+
+answer = generator("What is 2+2?", 5)
+answer = generator("What is 2+2?", max_tokens=5)
+```
+
+## Stop when a given string is found
+
+You can also ask the model to stop generating text after a given string has been generated, for instance a period or a line break. You can pass a string or a list of strings for the `stop_at` argument:
+
+
+```python
+from outlines import models, generate
+
+model = models.transformers("mistralai/Mistral-7B-v0.1")
+generator = generate.text(model)
+
+answer = generator("What is 2+2?", stop_at=".")
+answer = generator("What is 2+2?", stop_at=[".", "\n"])
+```
+
+## Streaming
+
+Outlines allows you to stream the model's response by calling the `.stream` method of the generator with the prompt:
+
+
+```python
+from outlines import models, generate
+
+model = models.transformers("mistralai/Mistral-7B-v0.1")
+generator = generate.text(model)
+
+tokens = generator.stream("What is 2+2?")
+for token in tokens:
+    print(token)
+```
+
+## Use a different sampler
+
+Outlines uses the multinomial sampler by default. To use another sampler, for instance the greedy sampler, specify it when instantiating the generator:
+
+```python
+from outlines import models, generate
+from outlines.generate.samplers import greedy
+
+
+model = models.transformers("mistralai/Mistral-7B-v0.1")
+generator = generate.text(model, sampler=greedy)
+
+answer = generator("What is 2+2?")
+```
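
The last hunk contrasts the default multinomial sampler with the greedy sampler. As a rough illustration of the difference, here is a standalone toy sketch using only the Python standard library. It is not Outlines' implementation: `greedy_pick`, `multinomial_pick`, the vocabulary, and the probabilities are all hypothetical names and values chosen for the example.

```python
import random


def greedy_pick(probs):
    """Return the index of the highest-probability token (deterministic)."""
    return max(range(len(probs)), key=lambda i: probs[i])


def multinomial_pick(probs, rng):
    """Draw one index at random, weighted by the token probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical next-token distribution over a 3-token vocabulary.
vocab = ["4", "four", "5"]
probs = [0.6, 0.3, 0.1]

rng = random.Random(0)  # seeded so the run is reproducible

# Greedy decoding always selects the most likely token.
greedy_token = vocab[greedy_pick(probs)]

# Multinomial sampling can return any token, in proportion to its probability.
sampled_tokens = [vocab[multinomial_pick(probs, rng)] for _ in range(1000)]

print(greedy_token)
print({token: sampled_tokens.count(token) for token in vocab})
```

Greedy selection returns the same token on every call, while repeated multinomial draws spread across the vocabulary roughly in proportion to `probs`. This is why a greedy sampler gives repeatable answers and a multinomial sampler gives varied ones.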