Commit

chore: add README and polish docstrings (#10)
* Update poetry.lock

* update docstring

* v1 of readme

* formatting

* explain namespaces

* add publish workflow

* grant write permission to format

* add install apache jena

* Update README.md
shihanwan authored Oct 1, 2024
1 parent 3d7989a commit 8c707ae
Showing 5 changed files with 579 additions and 305 deletions.
3 changes: 3 additions & 0 deletions .github/workflows/black.yml
```diff
@@ -5,6 +5,9 @@ on:
     branches:
       - main
 
+permissions:
+  contents: write
+
 jobs:
   format:
     runs-on: ubuntu-latest
```
34 changes: 34 additions & 0 deletions .github/workflows/publish.yml
```diff
@@ -0,0 +1,34 @@
+name: Publish to PyPI
+
+on:
+  release:
+    types: [published]
+
+jobs:
+  publish:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Check out the code
+        uses: actions/checkout@v3
+
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.x'
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install build twine
+      - name: Build the package
+        run: python -m build
+
+      - name: Publish to PyPI
+        env:
+          TWINE_USERNAME: __token__
+          TWINE_PASSWORD: ${{ secrets.PYPI_PUBLISH_TOKEN }}
+        run: |
+          python -m twine upload --skip-existing dist/*
```
238 changes: 237 additions & 1 deletion README.md
```diff
@@ -1 +1,237 @@
-# MemOnto
+# MemOnto 🧠
```

`memonto` (memory + ontology) adds memory to AI agents with an emphasis on user-defined ontologies. Define your own [RDF](https://www.w3.org/RDF/) ontology, then have `memonto` automatically extract information that maps onto it.

```
┌───────────────────────────────┐ ┌──────────────────────┐ ┌───────────────────────────────────┐
│ Message │ │ LLM │ │ Memory Graph │
│ │ │ │ │ │
│ {Otto von Bismarck was a │ │ │ │ │
│ Prussian statesman and │ │ │ │ ... │
│ diplomat who oversaw the │ │ [Otto von Bismarck] │ │ │ │
│ unification of Germany...} ┼──────► │ │ │ │
│ │ │ is a [Person] who │ │ ┌───────────────▼───────────────┐ │
└───────────────────────────────┘ │ │ │ │ Otto von Bismarck │ │
┌───────────────────────────────┐ │ lives in a [Place] │ │ └────────┬──────┬───────────────┘ │
│ Ontology │ │ │ │ │ │ │
│ ┼──────► called [Prussia] ┼──────► livesAt│ │partOf │
│ ┌─────────────┐ │ │ │ │ │ │ │
│ │ Person │ │ │ and participated in │ │ ┌────────▼┐┌────▼───────────────┐ │
│ └───┬─────┬───┘ │ │ │ │ │ Prussia ││ German Unification │ │
│ │ │ │ │ an [Event] called │ │ └─┬─────┬─┘└──────┬─────┬───────┘ │
│ livesAt│ │partOf │ │ │ │ │ │ │ │ │
│ │ │ │ │ [German Unification] │ │ ▼ ▼ ▼ ▼ │
│ ┌─────────▼─┐ ┌─▼─────────┐ │ │ │ │ ... ... ... ... │
│ │ Place │ │ Event │ │ │ │ │ │
│ └───────────┘ └───────────┘ │ │ │ │ │
│ │ │ │ │ │
└───────────────────────────────┘ └──────────────────────┘ └───────────────────────────────────┘
```

## 🚀 Install
```sh
pip install memonto
```

## ⚙️ Configure
**Ephemeral Mode**

Use `memonto` entirely in memory, without any data stores.

> [!IMPORTANT]
> When in ephemeral mode, performance can degrade if the memory data grows too large. This mode is recommended for smaller use cases.
```python
from memonto import Memonto
from rdflib import Graph, Namespace, RDF, RDFS

g = Graph()

HIST = Namespace("history:")

g.bind("hist", HIST)

g.add((HIST.Person, RDF.type, RDFS.Class))
g.add((HIST.Event, RDF.type, RDFS.Class))
g.add((HIST.Place, RDF.type, RDFS.Class))

memonto = Memonto(
ontology=g,
namespaces={"hist": HIST},
ephemeral=True,
)
```

**With Triple Store**

A triple store enables persistent storage of memory data. Apache Jena Fuseki is currently the only supported triple store.
```python
config = {
"triple_store": {
"provider": "apache_jena",
"config": {
"connection_url": "http://localhost:8080/",
},
},
"model": {
"provider": "openai",
"config": {
"model": "gpt-4o",
"api_key": "api-key",
},
}
}

memonto = Memonto(
ontology=g,
namespaces={"hist": HIST},
)
memonto.configure(config)
```

**With Triple + Vector Stores**

A vector store enables contextual retrieval of memory data; it must be used in conjunction with a triple store. Chroma is currently the only supported vector store.
```python
config = {
"triple_store": {
"provider": "apache_jena",
"config": {
"connection_url": "http://localhost:8080/dataset_name",
},
},
"vector_store": {
"provider": "chroma",
"config": {
"mode": "local",
"path": ".local",
},
},
"model": {
"provider": "openai",
"config": {
"model": "gpt-4o",
"api_key": "api-key",
},
}
}

memonto = Memonto(
ontology=g,
namespaces={"hist": HIST},
)
memonto.configure(config)
```

## 🧰 Usage
**RDF Namespaces**

`memonto` also supports RDF namespaces. Just pass in a dictionary mapping each namespace's name to its `rdflib.Namespace` object.
```python
memonto = Memonto(
ontology=g,
namespaces={"hist": HIST},
)
```

**Memory ID**

For when you want to associate an ontology and memories with a unique `id`.
```python
memonto = Memonto(
id="some_id_123",
ontology=g,
namespaces={"hist": HIST},
)
```

**Retain**

Extract the information from a message that is relevant to your ontology. Only data that maps onto an entity in your ontology will be extracted.
```python
memonto.retain("Otto von Bismarck was a Prussian statesman who oversaw the unification of Germany.")
```
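With the history ontology above, the retained memory might look roughly like the following Turtle (illustrative only — the exact triples and identifiers depend on what the LLM extracts):

```turtle
@prefix hist: <history:> .

hist:otto_von_bismarck a hist:Person ;
    hist:livesAt hist:prussia ;
    hist:partOf hist:german_unification .

hist:prussia a hist:Place .
hist:german_unification a hist:Event .
```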

**Recall**

Get a summary of the currently stored memories. You can provide a `context` so that `memonto` only summarizes the memories relevant to that `context`.

> [!IMPORTANT]
> When in ephemeral mode, all memories will be returned even if a `context` is provided.
```python
# retrieve summary of memory relevant to a context
memonto.recall("Germany could unify under Prussia or Austria.")

# retrieve summary of all stored memory
memonto.recall()
```

**Retrieve**

Get raw memory data that can be programmatically accessed. Instead of a summary, this returns the actual stored data as a `list[dict]` that can then be manipulated in code.
> [!IMPORTANT]
> When in ephemeral mode, raw queries are not supported.
```python
# retrieve raw memory data by schema
memonto.retrieve(uri=HIST.Person)

# retrieve raw memory data by SPARQL query
memonto.retrieve(query="SELECT ?s ?p ?o WHERE {GRAPH ?g {?s ?p ?o .}}")
```
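SPARQL queries can be narrowed in the usual ways. For example, to fetch only `livesAt` relations (assuming the `hist` namespace from the earlier examples):

```sparql
PREFIX hist: <history:>

SELECT ?person ?place
WHERE {
  GRAPH ?g {
    ?person hist:livesAt ?place .
  }
}
```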

**Forget**

Forget about it.
```python
memonto.forget()
```

**Auto Expand Ontology**

Enable `memonto` to automatically expand your ontology to cover new information. If `memonto` sees new information that **does not** fit onto your ontology, it will automatically extend your ontology to cover it.
```python
memonto = Memonto(
id="some_id_123",
ontology=g,
namespaces={"hist": HIST},
auto_expand=True,
)
```

## 🔀 Async Usage

All main functionalities have an async version following the naming pattern: **a{func_name}**
```python
async def main():
await memonto.aretain("Some user query or message")
await memonto.arecall()
await memonto.aretrieve(uri=HIST.Person)
await memonto.aforget()
```

## 🔧 Additional Setup

**Apache Jena**
1. Download Apache Jena Fuseki [here](https://jena.apache.org/download/index.cgi#apache-jena-fuseki).
2. Extract it to the desired folder.
```sh
tar -xzf apache-jena-fuseki-X.Y.Z.tar.gz
```
3. Run a local server.
```sh
./fuseki-server --port=8080
```


## 🔮 Current and Upcoming

| LLM | | Vector Store | |Triple Store | |
|-----------|-----|--------------|-----|-------------|-----|
|OpenAI ||Chroma ||Apache Jena ||
|Anthropic ||Pinecone |🔜 | | |
|Meta llama |🔜 |Weaviate |🔜 | | |

Feedback on what to support next is always welcome!

## 💯 Requirements
Python 3.7 or higher.
54 changes: 14 additions & 40 deletions memonto/memonto.py
```diff
@@ -26,7 +26,6 @@ class Memonto(BaseModel):
     triple_store: Optional[TripleStoreModel] = None
     vector_store: Optional[VectorStoreModel] = None
     auto_expand: Optional[bool] = False
-    auto_forget: Optional[bool] = False
     ephemeral: Optional[bool] = False
     debug: Optional[bool] = False
     model_config = ConfigDict(arbitrary_types_allowed=True)
@@ -38,33 +37,9 @@ def init(self) -> "Memonto":
 
     def configure(self, config: dict) -> None:
         """
-        Configure memonto with the desired llm model and datastore.
-        :param config: A dictionary containing the configuration for the LLM model and the store.
-        configs = {
-            "triple_store": {
-                "provider": "apache_jena",
-                "config": {
-                    "connection_url": "http://localhost:3030/ds/update",
-                    "username": "",
-                    "password": "",
-                },
-            },
-            "vector_store": {
-                "provider": "chroma",
-                "config": {
-                    "mode": "local",
-                    "path": ".local/",
-                },
-            },
-            "model": {
-                "provider": "openai",
-                "config": {
-                    "model": "gpt-4o",
-                    "api_key": "",
-                },
-            }
-        }
+        Configure memonto with the desired llm model and data stores.
+        :param config: A dictionary containing the configuration for the LLM model and the data stores.
 
         :return: None
         """
@@ -73,10 +48,9 @@ def configure(self, config: dict) -> None:
     @require_config("llm", "triple_store")
     def retain(self, message: str) -> None:
         """
-        Analyze a text for relevant information that maps onto an RDF ontology then commit them to the memory store.
+        Analyze a text for relevant information that maps onto an RDF ontology then add them to the memory store.
 
-        :param query: The user query that is broken down into a graph then committed to memory.
-        :param id[Optional]: Unique identifier for a memory. Often associated with a unique transaction or user.
+        :param message: The user message that is broken down into a graph then committed to memory.
         :return: None
         """
@@ -110,18 +84,20 @@ async def aretain(self, message: str) -> None:
         )
 
     @require_config("llm", "triple_store", "vector_store")
-    def recall(self, message: str = None) -> str:
+    def recall(self, context: str = None) -> str:
         """
-        Return a text summary of all memories currently stored in context.
+        Return a text summary of either all or only relevant memories currently in the memory store. In ephemeral mode, a summary of all memories will be returned.
 
-        :return: A text summary of the entire current memory.
+        :param context[Optional]: Context to query the memory store for relevant memories only.
+        :return: A text summary of the memory.
         """
         return _recall(
             data=self.data,
             llm=self.llm,
             triple_store=self.triple_store,
             vector_store=self.vector_store,
-            message=message,
+            message=context,
             id=self.id,
             ephemeral=self.ephemeral,
         )
@@ -142,11 +118,10 @@ async def arecall(self, message: str = None) -> str:
     @require_config("triple_store")
     def retrieve(self, uri: URIRef = None, query: str = None) -> list:
         """
-        Perform query against the memory store to retrieve raw memory data rather than a summary.
+        Query against the memory store to retrieve raw memory data rather than a text summary. Raw queries are not supported in ephemeral mode since there are no data stores.
 
-        :param id[Optional]: Unique identifier for a memory. Often associated with a unique transaction or user.
         :param uri[Optional]: URI of the entity to query for.
-        :param query[Optional]: Raw query that will be performed against the datastore. If you pass in a raw query then the id and uri parameters will be ignored.
+        :param query[Optional]: Raw query that will be performed against the memory store. If you pass in a raw query then uri will be ignored.
         :return: A list of triples (subject, predicate, object).
         """
@@ -201,8 +176,6 @@ def remember(self) -> None:
         """
         Load existing memories from the memory store to a memonto instance.
 
-        :param id[Optional]: Unique identifier for a memory. Often associated with a unique transaction or user.
-
        :return: None.
         """
         self.ontology, self.data = _remember(
@@ -224,6 +197,7 @@ def _render(
             - "json": Return the graph in JSON-LD format.
             - "text": Return the graph in text format.
             - "image": Return the graph as a png image.
+        :param path: The path to save the image if format is "image".
         :return: A text representation of the memory.
             - "turtle" format returns a string in Turtle format.
```