aadd_documents() over milvus fails with RecursionError #28727

Open
5 tasks done
MichaelSkralivetsky opened this issue Dec 15, 2024 · 2 comments
Labels
Ɑ: vector store Related to vector store module

Comments

@MichaelSkralivetsky

### Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

### Example Code

from uuid import uuid4

from langchain_core.documents import Document
from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_milvus import Milvus

# Fake embeddings keep the reproduction independent of any embedding provider.
embeddings = DeterministicFakeEmbedding(size=4096)

# 'http://ip:port' is a placeholder for the Milvus server address.
vectorstore_dvbrby = Milvus(
    collection_name="kozizbzo",
    embedding_function=embeddings,
    connection_args={"uri": "http://ip:port"},
)

document_1 = Document(
    page_content="I had chocolate chip pancakes and scrambled eggs for breakfast this morning.",
    metadata={"source": "tweet"},
)

documents = [document_1]

# One explicit ID per document.
uuids = [str(uuid4()) for _ in documents]

import asyncio
async def add_documents_async(documents):
    await vectorstore_dvbrby.aadd_documents(documents, ids=uuids)

task = asyncio.create_task(add_documents_async(documents))
await task

### Error Message and Stack Trace (if applicable)

RecursionError Traceback (most recent call last)
Cell In[8], line 6
3 await vectorstore_dvbrby.aadd_documents(documents, ids=uuids)
5 task = asyncio.create_task(add_documents_async(documents))
----> 6 await task

Cell In[8], line 3, in add_documents_async(documents)
2 async def add_documents_async(documents):
----> 3 await vectorstore_dvbrby.aadd_documents(documents, ids=uuids)

File ~/.pythonlibs/mlrun-base/lib/python3.9/site-packages/langchain_milvus/vectorstores/milvus.py:1557, in Milvus.aadd_documents(self, documents, **kwargs)
1555 texts = [doc.page_content for doc in documents]
1556 metadatas = [doc.metadata for doc in documents]
-> 1557 return await self.aadd_texts(texts, metadatas, **kwargs)

File ~/.pythonlibs/mlrun-base/lib/python3.9/site-packages/langchain_core/vectorstores/base.py:258, in VectorStore.aadd_texts(self, texts, metadatas, ids, **kwargs)
252 ids_: Iterator[Optional[str]] = iter(ids) if ids else cycle([None])
254 docs = [
255 Document(id=id_, page_content=text, metadata=metadata_)
256 for text, metadata_, id_ in zip(texts, metadatas_, ids_)
257 ]
--> 258 return await self.aadd_documents(docs, **kwargs)
259 return await run_in_executor(None, self.add_texts, texts, metadatas, **kwargs)

File ~/.pythonlibs/mlrun-base/lib/python3.9/site-packages/langchain_milvus/vectorstores/milvus.py:1557, in Milvus.aadd_documents(self, documents, **kwargs)
1555 texts = [doc.page_content for doc in documents]
1556 metadatas = [doc.metadata for doc in documents]
-> 1557 return await self.aadd_texts(texts, metadatas, **kwargs)

File ~/.pythonlibs/mlrun-base/lib/python3.9/site-packages/langchain_core/vectorstores/base.py:258, in VectorStore.aadd_texts(self, texts, metadatas, ids, **kwargs)
252 ids_: Iterator[Optional[str]] = iter(ids) if ids else cycle([None])
254 docs = [
255 Document(id=id_, page_content=text, metadata=metadata_)
256 for text, metadata_, id_ in zip(texts, metadatas_, ids_)
257 ]
--> 258 return await self.aadd_documents(docs, **kwargs)
259 return await run_in_executor(None, self.add_texts, texts, metadatas, **kwargs)

[... skipping similar frames: Milvus.aadd_documents at line 1557 (1487 times), VectorStore.aadd_texts at line 258 (1486 times)]

File ~/.pythonlibs/mlrun-base/lib/python3.9/site-packages/langchain_core/vectorstores/base.py:258, in VectorStore.aadd_texts(self, texts, metadatas, ids, **kwargs)
252 ids_: Iterator[Optional[str]] = iter(ids) if ids else cycle([None])
254 docs = [
255 Document(id=id_, page_content=text, metadata=metadata_)
256 for text, metadata_, id_ in zip(texts, metadatas_, ids_)
257 ]
--> 258 return await self.aadd_documents(docs, **kwargs)
259 return await run_in_executor(None, self.add_texts, texts, metadatas, **kwargs)

File ~/.pythonlibs/mlrun-base/lib/python3.9/site-packages/langchain_milvus/vectorstores/milvus.py:1557, in Milvus.aadd_documents(self, documents, **kwargs)
1555 texts = [doc.page_content for doc in documents]
1556 metadatas = [doc.metadata for doc in documents]
-> 1557 return await self.aadd_texts(texts, metadatas, **kwargs)

File ~/.pythonlibs/mlrun-base/lib/python3.9/site-packages/langchain_core/vectorstores/base.py:254, in VectorStore.aadd_texts(self, texts, metadatas, ids, **kwargs)
251 metadatas_ = iter(metadatas) if metadatas else cycle([{}])
252 ids_: Iterator[Optional[str]] = iter(ids) if ids else cycle([None])
--> 254 docs = [
255 Document(id=id_, page_content=text, metadata=metadata_)
256 for text, metadata_, id_ in zip(texts, metadatas_, ids_)
257 ]
258 return await self.aadd_documents(docs, **kwargs)
259 return await run_in_executor(None, self.add_texts, texts, metadatas, **kwargs)

File ~/.pythonlibs/mlrun-base/lib/python3.9/site-packages/langchain_core/vectorstores/base.py:255, in <listcomp>(.0)
251 metadatas_ = iter(metadatas) if metadatas else cycle([{}])
252 ids_: Iterator[Optional[str]] = iter(ids) if ids else cycle([None])
254 docs = [
--> 255 Document(id=id_, page_content=text, metadata=metadata_)
256 for text, metadata_, id_ in zip(texts, metadatas_, ids_)
257 ]
258 return await self.aadd_documents(docs, **kwargs)
259 return await run_in_executor(None, self.add_texts, texts, metadatas, **kwargs)

File ~/.pythonlibs/mlrun-base/lib/python3.9/site-packages/langchain_core/documents/base.py:285, in Document.__init__(self, page_content, **kwargs)
282 """Pass page_content in as positional or named arg."""
283 # my-py is complaining that page_content is not defined on the base class.
284 # Here, we're relying on pydantic base class to handle the validation.
--> 285 super().__init__(page_content=page_content, **kwargs)

File ~/.pythonlibs/mlrun-base/lib/python3.9/site-packages/langchain_core/load/serializable.py:125, in Serializable.__init__(self, *args, **kwargs)
123 def __init__(self, *args: Any, **kwargs: Any) -> None:
124 """"""
--> 125 super().__init__(*args, **kwargs)

[... skipping hidden 1 frame]

File ~/.pythonlibs/mlrun-base/lib/python3.9/site-packages/langchain_core/documents/base.py:47, in BaseMedia.cast_id_to_str(cls, id_value)
44 @field_validator("id", mode="before")
45 def cast_id_to_str(cls, id_value: Any) -> Optional[str]:
46 if id_value is not None:
---> 47 return str(id_value)
48 else:
49 return id_value

RecursionError: maximum recursion depth exceeded while calling a Python object
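
The alternating frames above (`Milvus.aadd_documents` at line 1557, `VectorStore.aadd_texts` at line 258) point to two methods calling each other. The pattern can be reproduced without a Milvus server. The sketch below is only an illustration, not LangChain's code: `Doc`, `BaseStore`, `CyclicStore`, and `reproduces_cycle` are hypothetical stand-ins, and the override check inside `BaseStore.aadd_texts` is an assumption about a branch that the traceback shows only in part.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a document."""

    page_content: str


class BaseStore:
    """Hypothetical stand-in for the base vector store class."""

    def add_texts(self, texts, **kwargs):
        # Pretend to insert and return one fake ID per text.
        return [f"id-{i}" for i, _ in enumerate(texts)]

    async def aadd_texts(self, texts, **kwargs):
        # Assumed dispatch: if a subclass overrides aadd_documents,
        # route the texts through that override.
        if type(self).aadd_documents is not BaseStore.aadd_documents:
            docs = [Doc(page_content=text) for text in texts]
            return await self.aadd_documents(docs, **kwargs)
        return self.add_texts(texts, **kwargs)

    async def aadd_documents(self, documents, **kwargs):
        texts = [doc.page_content for doc in documents]
        return self.add_texts(texts, **kwargs)


class CyclicStore(BaseStore):
    """Overrides aadd_documents by delegating back to aadd_texts."""

    async def aadd_documents(self, documents, **kwargs):
        texts = [doc.page_content for doc in documents]
        # aadd_texts sees the override and calls aadd_documents again.
        return await self.aadd_texts(texts, **kwargs)


def reproduces_cycle() -> bool:
    """Return True if the override recurses until Python raises RecursionError."""
    try:
        asyncio.run(CyclicStore().aadd_documents([Doc("pancakes")]))
    except RecursionError:
        return True
    return False
```

Under these assumptions, a single document is enough to trigger the error, which matches the one-document reproduction in the report: the depth of the recursion does not depend on the input size.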


### Description

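Calling `aadd_documents()` on a `Milvus` vector store raises a `RecursionError`. The expected behavior is that the document is inserted without error.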

### System Info

System Information

OS: Linux
OS Version: #1 SMP Wed Sep 11 18:02:00 EDT 2024
Python Version: 3.9.18 | packaged by conda-forge | (main, Dec 23 2023, 16:33:10)
[GCC 12.3.0]

Package Information

langchain_core: 0.3.25
langchain: 0.3.12
langchain_community: 0.3.12
langsmith: 0.1.129
langchain_chroma: 0.2.0
langchain_milvus: 0.1.7
langchain_text_splitters: 0.3.3

Optional packages not installed

langserve

Other Dependencies

aiohttp: 3.10.6
async-timeout: 4.0.3
chromadb: 0.5.23
dataclasses-json: 0.6.7
fastapi: 0.115.5
httpx: 0.27.2
httpx-sse: 0.4.0
jsonpatch: 1.33
numpy: 1.26.4
orjson: 3.10.7
packaging: 24.0
pydantic: 2.10.3
pydantic-settings: 2.6.1
pymilvus: 2.5.0
PyYAML: 6.0.2
requests: 2.32.3
SQLAlchemy: 1.4.54
tenacity: 9.0.0
typing-extensions: 4.12.2

@keenborder786
Contributor

@MichaelSkralivetsky, this is taken care of in the following PR: langchain-ai/langchain-milvus#29

zc277584121 pushed a commit to langchain-ai/langchain-milvus that referenced this issue Dec 19, 2024
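A `RecursionError` was being thrown because the `aadd_documents`
override in `Milvus` called `aadd_texts`, which in the base
`VectorStore` calls `aadd_documents` again, so the two methods invoked
each other until the recursion limit was reached. The override mirrored
the `add_documents` implementation without being adapted for the async
path. To resolve this, the `aadd_documents` override was removed, and
the base method from `VectorStore` is now used instead.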

Takes care of the following issue:
langchain-ai/langchain#28727
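
The effect of removing the override can be sketched with stand-in classes. This is a rough illustration under stated assumptions, not the merged change: `Doc`, `BaseStore`, `FixedStore`, and `main` are hypothetical names, and the base async method is assumed to run the synchronous path in an executor, as the `run_in_executor(None, self.add_texts, ...)` line in the traceback suggests.

```python
import asyncio
from dataclasses import dataclass, field
from functools import partial


@dataclass
class Doc:
    """Hypothetical stand-in for a document."""

    page_content: str
    metadata: dict = field(default_factory=dict)


class BaseStore:
    """Hypothetical stand-in for the base vector store class."""

    def add_texts(self, texts, metadatas=None, **kwargs):
        raise NotImplementedError

    def add_documents(self, documents, **kwargs):
        texts = [doc.page_content for doc in documents]
        metadatas = [doc.metadata for doc in documents]
        return self.add_texts(texts, metadatas, **kwargs)

    async def aadd_documents(self, documents, **kwargs):
        # Assumed default async path: run the sync method in a worker thread.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            None, partial(self.add_documents, documents, **kwargs)
        )


class FixedStore(BaseStore):
    """Implements only the sync insert and inherits aadd_documents unchanged."""

    def __init__(self):
        self.rows = {}

    def add_texts(self, texts, metadatas=None, ids=None, **kwargs):
        # Fall back to positional IDs when the caller supplies none.
        ids = ids or [str(i) for i in range(len(texts))]
        for id_, text in zip(ids, texts):
            self.rows[id_] = text
        return list(ids)


async def main():
    store = FixedStore()
    return await store.aadd_documents([Doc("pancakes")], ids=["doc-1"])


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With no `aadd_documents` defined on the subclass, there is nothing for the base class to call back into, so the call chain ends at the synchronous `add_texts`.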
@zc277584121
Contributor

Merged the PR.
