Commit

chore: add README and polish docstrings (#10)
* Update poetry.lock

* update docstring

* v1 of readme

* formatting

* explain namespaces

* add publish workflow

* grant write permission to format

* add install apache jena

* Update README.md
shihanwan authored Oct 1, 2024
1 parent 3d7989a commit 8c707ae
Showing 5 changed files with 579 additions and 305 deletions.
3 changes: 3 additions & 0 deletions .github/workflows/black.yml
```diff
@@ -5,6 +5,9 @@ on:
     branches:
       - main
 
+permissions:
+  contents: write
+
 jobs:
   format:
     runs-on: ubuntu-latest
```
34 changes: 34 additions & 0 deletions .github/workflows/publish.yml
```diff
@@ -0,0 +1,34 @@
+name: Publish to PyPI
+
+on:
+  release:
+    types: [published]
+
+jobs:
+  publish:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Check out the code
+        uses: actions/checkout@v3
+
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.x'
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install build twine
+      - name: Build the package
+        run: python -m build
+
+      - name: Publish to PyPI
+        env:
+          TWINE_USERNAME: __token__
+          TWINE_PASSWORD: ${{ secrets.PYPI_PUBLISH_TOKEN }}
+        run: |
+          python -m twine upload --skip-existing dist/*
```
238 changes: 237 additions & 1 deletion README.md
```diff
@@ -1 +1,237 @@
-# MemOnto
+# MemOnto 🧠
```

`memonto` (memory + ontology) adds memory to AI agents with an emphasis on user-defined ontologies. Define your own [RDF](https://www.w3.org/RDF/) ontology, then have `memonto` automatically extract information that maps onto it.

```
┌───────────────────────────────┐ ┌──────────────────────┐ ┌───────────────────────────────────┐
│ Message │ │ LLM │ │ Memory Graph │
│ │ │ │ │ │
│ {Otto von Bismarck was a │ │ │ │ │
│ Prussian statesman and │ │ │ │ ... │
│ diplomat who oversaw the │ │ [Otto von Bismarck] │ │ │ │
│ unification of Germany...} ┼──────► │ │ │ │
│ │ │ is a [Person] who │ │ ┌───────────────▼───────────────┐ │
└───────────────────────────────┘ │ │ │ │ Otto von Bismarck │ │
┌───────────────────────────────┐ │ lives in a [Place] │ │ └────────┬──────┬───────────────┘ │
│ Ontology │ │ │ │ │ │ │
│ ┼──────► called [Prussia] ┼──────► livesAt│ │partOf │
│ ┌─────────────┐ │ │ │ │ │ │ │
│ │ Person │ │ │ and participated in │ │ ┌────────▼┐┌────▼───────────────┐ │
│ └───┬─────┬───┘ │ │ │ │ │ Prussia ││ German Unification │ │
│ │ │ │ │ an [Event] called │ │ └─┬─────┬─┘└──────┬─────┬───────┘ │
│ livesAt│ │partOf │ │ │ │ │ │ │ │ │
│ │ │ │ │ [German Unification] │ │ ▼ ▼ ▼ ▼ │
│ ┌─────────▼─┐ ┌─▼─────────┐ │ │ │ │ ... ... ... ... │
│ │ Place │ │ Event │ │ │ │ │ │
│ └───────────┘ └───────────┘ │ │ │ │ │
│ │ │ │ │ │
└───────────────────────────────┘ └──────────────────────┘ └───────────────────────────────────┘
```

## 🚀 Install
```sh
pip install memonto
```

## ⚙️ Configure
**Ephemeral Mode**

Use `memonto` entirely in memory, without any data stores.

> [!IMPORTANT]
> When in ephemeral mode, performance can degrade if the memory data grows too large. This mode is recommended for smaller use cases.
```python
from memonto import Memonto
from rdflib import Graph, Namespace, RDF, RDFS

g = Graph()

HIST = Namespace("history:")

g.bind("hist", HIST)

g.add((HIST.Person, RDF.type, RDFS.Class))
g.add((HIST.Event, RDF.type, RDFS.Class))
g.add((HIST.Place, RDF.type, RDFS.Class))

memonto = Memonto(
ontology=g,
namespaces={"hist": HIST},
ephemeral=True,
)
```

**With Triple Store**

A triple store enables persistent storage of memory data. Apache Jena Fuseki is currently the only supported triple store.
```python
config = {
"triple_store": {
"provider": "apache_jena",
"config": {
"connection_url": "http://localhost:8080/",
},
},
"model": {
"provider": "openai",
"config": {
"model": "gpt-4o",
"api_key": "api-key",
},
}
}

memonto = Memonto(
ontology=g,
namespaces={"hist": HIST},
)
memonto.configure(config)
```

**With Triple + Vector Stores**

A vector store enables contextual retrieval of memory data; it must be used in conjunction with a triple store. Chroma is currently the only supported vector store.
```python
config = {
"triple_store": {
"provider": "apache_jena",
"config": {
"connection_url": "http://localhost:8080/dataset_name",
},
},
"vector_store": {
"provider": "chroma",
"config": {
"mode": "local",
"path": ".local",
},
},
"model": {
"provider": "openai",
"config": {
"model": "gpt-4o",
"api_key": "api-key",
},
}
}

memonto = Memonto(
ontology=g,
namespaces={"hist": HIST},
)
memonto.configure(config)
```

## 🧰 Usage
**RDF Namespaces**

`memonto` also supports RDF namespaces. Just pass in a dictionary mapping each namespace's name to its `rdflib.Namespace` object.
```python
memonto = Memonto(
ontology=g,
namespaces={"hist": HIST},
)
```

**Memory ID**

For when you want to associate an ontology and memories with a unique `id`.
```python
memonto = Memonto(
id="some_id_123",
ontology=g,
namespaces={"hist": HIST},
)
```

**Retain**

Extract the information from a message that is relevant to your ontology. Only data that maps onto an entity in your ontology will be extracted.
```python
memonto.retain("Otto von Bismarck was a Prussian statesman who oversaw the unification of Germany.")
```
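With the history ontology above, the retained memory might look roughly like the following Turtle (illustrative only — the exact triples and identifiers depend on what the LLM extracts):

```turtle
@prefix hist: <history:> .

hist:otto_von_bismarck a hist:Person ;
    hist:livesAt hist:prussia ;
    hist:partOf hist:german_unification .

hist:prussia a hist:Place .
hist:german_unification a hist:Event .
```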

**Recall**

Get a summary of the currently stored memories. You can provide a `context` so that `memonto` only summarizes the memories relevant to that `context`.

> [!IMPORTANT]
> When in ephemeral mode, all memories will be returned even if a `context` is provided.
```python
# retrieve summary of memory relevant to a context
memonto.recall("Germany could unify under Prussia or Austria.")

# retrieve summary of all stored memory
memonto.recall()
```

**Retrieve**

Get raw memory data that can be programmatically accessed. Instead of a summary, this returns the actual stored data as a `list[dict]` that can then be manipulated in code.
> [!IMPORTANT]
> When in ephemeral mode, raw queries are not supported.
```python
# retrieve raw memory data by schema
memonto.retrieve(uri=HIST.Person)

# retrieve raw memory data by SPARQL query
memonto.retrieve(query="SELECT ?s ?p ?o WHERE {GRAPH ?g {?s ?p ?o .}}")
```
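SPARQL queries can be narrowed in the usual ways. For example, to fetch only `livesAt` relations (assuming the `hist` namespace from the earlier examples):

```sparql
PREFIX hist: <history:>

SELECT ?person ?place
WHERE {
  GRAPH ?g {
    ?person hist:livesAt ?place .
  }
}
```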

**Forget**

Forget about it.
```python
memonto.forget()
```

**Auto Expand Ontology**

Enable `memonto` to automatically expand your ontology to cover new information. If `memonto` sees new information that **does not** fit onto your ontology, it will automatically extend your ontology to cover it.
```python
memonto = Memonto(
id="some_id_123",
ontology=g,
namespaces={"hist": HIST},
auto_expand=True,
)
```

## 🔀 Async Usage

All main functionalities have an async version following the naming pattern: **a{func_name}**
```python
async def main():
await memonto.aretain("Some user query or message")
await memonto.arecall()
await memonto.aretrieve(uri=HIST.Person)
await memonto.aforget()
```

## 🔧 Additional Setup

**Apache Jena**
1. Download Apache Jena Fuseki [here](https://jena.apache.org/download/index.cgi#apache-jena-fuseki).
2. Extract it to the desired folder.
```sh
tar -xzf apache-jena-fuseki-X.Y.Z.tar.gz
```
3. Run a local server.
```sh
./fuseki-server --port=8080
```


## 🔮 Current and Upcoming

| LLM | | Vector Store | |Triple Store | |
|-----------|-----|--------------|-----|-------------|-----|
|OpenAI ||Chroma ||Apache Jena ||
|Anthropic ||Pinecone |🔜 | | |
|Meta llama |🔜 |Weaviate |🔜 | | |

Feedback on what to support next is always welcome!

## 💯 Requirements
Python 3.7 or higher.
54 changes: 14 additions & 40 deletions memonto/memonto.py
```diff
@@ -26,7 +26,6 @@ class Memonto(BaseModel):
     triple_store: Optional[TripleStoreModel] = None
     vector_store: Optional[VectorStoreModel] = None
     auto_expand: Optional[bool] = False
-    auto_forget: Optional[bool] = False
     ephemeral: Optional[bool] = False
     debug: Optional[bool] = False
     model_config = ConfigDict(arbitrary_types_allowed=True)
@@ -38,33 +37,9 @@ def init(self) -> "Memonto":
 
     def configure(self, config: dict) -> None:
         """
-        Configure memonto with the desired llm model and datastore.
-        :param config: A dictionary containing the configuration for the LLM model and the store.
-        configs = {
-            "triple_store": {
-                "provider": "apache_jena",
-                "config": {
-                    "connection_url": "http://localhost:3030/ds/update",
-                    "username": "",
-                    "password": "",
-                },
-            },
-            "vector_store": {
-                "provider": "chroma",
-                "config": {
-                    "mode": "local",
-                    "path": ".local/",
-                },
-            },
-            "model": {
-                "provider": "openai",
-                "config": {
-                    "model": "gpt-4o",
-                    "api_key": "",
-                },
-            }
-        }
+        Configure memonto with the desired llm model and data stores.
+        :param config: A dictionary containing the configuration for the LLM model and the data stores.
 
         :return: None
         """
@@ -73,10 +48,9 @@ def configure(self, config: dict) -> None:
     @require_config("llm", "triple_store")
     def retain(self, message: str) -> None:
         """
-        Analyze a text for relevant information that maps onto an RDF ontology then commit them to the memory store.
+        Analyze a text for relevant information that maps onto an RDF ontology then add them to the memory store.
 
-        :param query: The user query that is broken down into a graph then committed to memory.
-        :param id[Optional]: Unique identifier for a memory. Often associated with a unique transaction or user.
+        :param message: The user message that is broken down into a graph then committed to memory.
         :return: None
         """
@@ -110,18 +84,20 @@ async def aretain(self, message: str) -> None:
         )
 
     @require_config("llm", "triple_store", "vector_store")
-    def recall(self, message: str = None) -> str:
+    def recall(self, context: str = None) -> str:
         """
-        Return a text summary of all memories currently stored in context.
+        Return a text summary of either all or only relevant memories currently in the memory store. In ephemeral mode, a summary of all memories will be returned.
 
-        :return: A text summary of the entire current memory.
+        :param context[Optional]: Context to query the memory store for relevant memories only.
+        :return: A text summary of the memory.
         """
         return _recall(
             data=self.data,
             llm=self.llm,
             triple_store=self.triple_store,
             vector_store=self.vector_store,
-            message=message,
+            message=context,
             id=self.id,
             ephemeral=self.ephemeral,
         )
@@ -142,11 +118,10 @@ async def arecall(self, message: str = None) -> str:
     @require_config("triple_store")
     def retrieve(self, uri: URIRef = None, query: str = None) -> list:
         """
-        Perform query against the memory store to retrieve raw memory data rather than a summary.
+        Query against the memory store to retrieve raw memory data rather than a text summary. Raw queries are not supported in ephemeral mode since there are no data stores.
 
-        :param id[Optional]: Unique identifier for a memory. Often associated with a unique transaction or user.
         :param uri[Optional]: URI of the entity to query for.
-        :param query[Optional]: Raw query that will be performed against the datastore. If you pass in a raw query then the id and uri parameters will be ignored.
+        :param query[Optional]: Raw query that will be performed against the memory store. If you pass in a raw query then uri will be ignored.
         :return: A list of triples (subject, predicate, object).
         """
@@ -201,8 +176,6 @@ def remember(self) -> None:
         """
         Load existing memories from the memory store to a memonto instance.
 
-        :param id[Optional]: Unique identifier for a memory. Often associated with a unique transaction or user.
-
        :return: None.
         """
         self.ontology, self.data = _remember(
@@ -224,6 +197,7 @@ def _render(
             - "json": Return the graph in JSON-LD format.
             - "text": Return the graph in text format.
             - "image": Return the graph as a png image.
+        :param path: The path to save the image if format is "image".
         :return: A text representation of the memory.
             - "turtle" format returns a string in Turtle format.
```