How to improve the search accuracy of MongoDBAtlasVectorSearch #26612

GzRichChen opened this issue Sep 18, 2024 · 0 comments
Labels
Ɑ: vector store Related to vector store module

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

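# Import paths are assumed from the packages listed under System Info.
from langchain_community.embeddings import ZhipuAIEmbeddings
from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_openai import ChatOpenAI

# EMBEDDING_MODEL, MODEL_API_KEY, MODEL_NAME, OPEN_API_BASE and `collection`
# (the MongoDB collection holding the documents) are defined elsewhere.

embed_model = ZhipuAIEmbeddings(
    model=EMBEDDING_MODEL,
    api_key=MODEL_API_KEY,
)

vector_search = MongoDBAtlasVectorSearch(
    collection=collection,
    embedding=embed_model,
    index_name="vector_index",
    embedding_key="embedding",
    text_key="text",
)

llm = ChatOpenAI(
    temperature=0.95,
    model=MODEL_NAME,
    openai_api_key=MODEL_API_KEY,
    openai_api_base=OPEN_API_BASE,
)

qa_retriever = vector_search.as_retriever(
    search_type="similarity",
)

def get_db_chat():
    # The query means "Labor Law, Article 29".
    # answer = vector_search.similarity_search("劳动法第29条")
    answer = vector_search.similarity_search_with_score(
        query="劳动法第29条", k=10
    )
    print(answer)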

Description

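The database contains the text of “中华人民共和国劳动法第二十九条” (Article 29 of the Labor Law of the People's Republic of China), but a similarity search for “劳动法第29条” (“Labor Law, Article 29”) does not return it. How can I make the search more accurate?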
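
One thing that may be worth ruling out (an assumption, not a confirmed cause): the query writes the article number in Arabic digits (第29条), while the stored text uses Chinese numerals (第二十九条). The sketch below rewrites such references in the query before it is embedded. `arabic_to_chinese` and `normalize_article_refs` are hypothetical helper names, not part of LangChain or MongoDB, and the conversion only covers 1 to 999.

```python
import re

_DIGITS = "零一二三四五六七八九"
_ARTICLE = re.compile(r"第(\d+)条")


def arabic_to_chinese(n: int) -> str:
    """Convert an integer in 1..999 to Chinese numerals (hypothetical helper)."""
    if not 0 < n < 1000:
        raise ValueError("only 1..999 is supported in this sketch")
    hundreds, rest = divmod(n, 100)
    tens, ones = divmod(rest, 10)
    out = ""
    if hundreds:
        out += _DIGITS[hundreds] + "百"
    if tens:
        # 10..19 with no hundreds part is written with a bare 十, e.g. 12 -> 十二
        out += ("" if tens == 1 and not hundreds else _DIGITS[tens]) + "十"
    elif hundreds and ones:
        out += "零"  # placeholder for the empty tens position, e.g. 105 -> 一百零五
    if ones:
        out += _DIGITS[ones]
    return out


def normalize_article_refs(query: str) -> str:
    """Rewrite 第<digits>条 as 第<Chinese numeral>条; other text is left unchanged."""

    def _replace(match: re.Match) -> str:
        n = int(match.group(1))
        if 0 < n < 1000:
            return f"第{arabic_to_chinese(n)}条"
        return match.group(0)  # out of range for this sketch: keep as written

    return _ARTICLE.sub(_replace, query)


normalized_query = normalize_article_refs("劳动法第29条")
```

The normalized string could then be passed as `query` to `similarity_search_with_score` in place of the original one. Whether this changes the ranking depends on the embedding model and on how the documents were chunked, so it is only a starting point for investigation.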

System Info

aenum==3.1.15
aiofiles==24.1.0
aiohappyeyeballs==2.3.5
aiohttp==3.10.3
aiolimiter==1.1.0
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.4.0
anytree==2.12.1
appnope==0.1.4
APScheduler==3.10.4
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asgiref==3.8.1
asteval==1.0.2
asttokens==2.4.1
async-lru==2.0.4
attrs==24.2.0
autograd==1.7.0
azure-common==1.1.28
azure-core==1.30.2
azure-identity==1.17.1
azure-search-documents==11.5.1
azure-storage-blob==12.22.0
babel==2.16.0
backoff==2.2.1
bcrypt==4.2.0
beartype==0.18.5
beautifulsoup4==4.12.3
bleach==6.1.0
build==1.2.1
cachetools==5.4.0
certifi==2024.7.4
cffi==1.17.0
chardet==5.2.0
charset-normalizer==3.3.2
chroma-hnswlib==0.7.6
chromadb==0.5.5
click==8.1.7
cloudpickle==3.0.0
coloredlogs==15.0.1
comm==0.2.2
contourpy==1.2.1
cramjam==2.8.3
cryptography==43.0.0
cycler==0.12.1
dask==2024.8.1
dask-expr==1.1.11
dataclasses-json==0.6.7
datashaper==0.0.49
debugpy==1.8.5
decorator==5.1.1
defusedxml==0.7.1
Deprecated==1.2.14
deprecation==2.1.0
devtools==0.12.2
dill==0.3.8
diskcache==5.6.3
distro==1.9.0
dnspython==2.6.1
environs==11.0.0
executing==2.0.1
faiss-cpu==1.8.0.post1
fastapi==0.112.2
fastjsonschema==2.20.0
fastparquet==2024.5.0
filelock==3.15.4
flatbuffers==24.3.25
fonttools==4.53.1
fqdn==1.5.1
frozenlist==1.4.1
fsspec==2024.6.1
funcsigs==1.0.2
future==1.0.0
gensim==4.3.3
google-auth==2.34.0
googleapis-common-protos==1.65.0
graphrag==0.3.1
graspologic==3.4.1
graspologic-native==1.2.1
grpcio==1.66.1
gs-quant==1.0.108
h11==0.14.0
hs-config==0.1.2
html5tagger==1.3.0
httpcore==1.0.5
httptools==0.6.1
httpx==0.27.0
huggingface-hub==0.24.6
humanfriendly==10.0
hyppo==0.4.0
idna==3.7
importlib_metadata==8.4.0
importlib_resources==6.4.4
inflection==0.5.1
ipykernel==6.29.5
ipython==8.26.0
ipywidgets==8.1.3
isodate==0.6.1
isoduration==20.11.0
jedi==0.19.1
Jinja2==3.1.4
jiter==0.5.0
joblib==1.4.2
json5==0.9.25
json_repair==0.26.0
jsonpatch==1.33
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
jupyter==1.0.0
jupyter-console==6.6.3
jupyter-events==0.10.0
jupyter-lsp==2.2.5
jupyter_client==8.6.2
jupyter_core==5.7.2
jupyter_server==2.14.2
jupyter_server_terminals==0.5.3
jupyterlab==4.2.4
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.3
jupyterlab_widgets==3.0.11
kiwisolver==1.4.5
kubernetes==30.1.0
lancedb==0.11.0
langchain==0.2.13
langchain-community==0.2.12
langchain-core==0.2.30
langchain-mongodb==0.1.8
langchain-openai==0.1.21
langchain-text-splitters==0.2.2
langsmith==0.1.99
linkify-it-py==2.0.3
llvmlite==0.43.0
lmfit==1.3.2
locket==1.0.0
loguru==0.7.2
lxml==5.3.0
markdown-it-py==3.0.0
MarkupSafe==2.1.5
marshmallow==3.21.3
matplotlib==3.9.2
matplotlib-inline==0.1.7
mdit-py-plugins==0.4.1
mdurl==0.1.2
mistune==3.0.2
mmh3==4.1.0
monotonic==1.6
more-itertools==10.4.0
motor==3.1.2
mpmath==1.3.0
msal==1.30.0
msal-extensions==1.2.0
msgpack==1.0.8
multidict==6.0.5
mypy-extensions==1.0.0
nbclient==0.10.0
nbconvert==7.16.4
nbformat==5.10.4
nest-asyncio==1.6.0
networkx==3.3
nltk==3.9.1
notebook==7.2.1
notebook_shim==0.2.4
numba==0.60.0
numpy==1.26.4
oauthlib==3.2.2
onnxruntime==1.19.0
openai==1.40.6
opentelemetry-api==1.27.0
opentelemetry-exporter-otlp-proto-common==1.27.0
opentelemetry-exporter-otlp-proto-grpc==1.27.0
opentelemetry-instrumentation==0.48b0
opentelemetry-instrumentation-asgi==0.48b0
opentelemetry-instrumentation-fastapi==0.48b0
opentelemetry-proto==1.27.0
opentelemetry-sdk==1.27.0
opentelemetry-semantic-conventions==0.48b0
opentelemetry-util-http==0.48b0
opentracing==2.4.0
orjson==3.10.6
overrides==7.7.0
packaging==24.1
pandas==2.2.2
pandocfilters==1.5.1
parso==0.8.4
partd==1.4.2
patsy==0.5.6
pexpect==4.9.0
pillow==10.4.0
platformdirs==4.2.2
portalocker==2.10.1
posthog==3.6.0
POT==0.9.4
prometheus_client==0.20.0
prompt_toolkit==3.0.47
protobuf==4.25.4
psutil==6.0.0
ptyprocess==0.7.0
pure_eval==0.2.3
py==1.11.0
pyaml-env==1.2.1
pyarrow==15.0.2
pyasn1==0.6.0
pyasn1_modules==0.4.0
pycparser==2.22
pydantic==2.8.2
pydantic-settings==2.3.4
pydantic_core==2.20.1
pydash==6.0.2
Pygments==2.18.0
PyJWT==2.8.0
pylance==0.15.0
pymongo==4.8.0
pynndescent==0.5.13
pyparsing==3.1.4
pypdf==4.3.1
PyPDF2==3.0.1
PyPika==0.48.9
pyproject_hooks==1.1.0
python-dateutil==2.9.0.post0
python-docx==1.1.2
python-dotenv==1.0.1
python-json-logger==2.0.7
pytz==2024.1
PyYAML==6.0.2
pyzmq==26.1.0
qtconsole==5.5.2
QtPy==2.4.1
ratelimiter==1.2.0.post0
referencing==0.35.1
regex==2024.7.24
requests==2.32.3
requests-oauthlib==2.0.0
retry==0.9.2
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.7.1
rpds-py==0.20.0
rsa==4.9
sanic==24.6.0
sanic-api==0.2.9
sanic-base-extension==0.2.0
sanic-ext==23.12.0
sanic-motor==0.7.0
sanic-routing==23.12.0
scikit-learn==1.5.1
scipy==1.12.0
seaborn==0.13.2
Send2Trash==1.8.3
shellingham==1.5.4
six==1.16.0
smart-open==7.0.4
sniffio==1.3.1
soupsieve==2.5
SQLAlchemy==2.0.32
stack-data==0.6.3
starlette==0.38.2
statsmodels==0.14.2
swifter==1.4.0
sympy==1.13.2
tenacity==8.5.0
terminado==0.18.1
textual==0.76.0
threadpoolctl==3.5.0
tiktoken==0.7.0
tinycss2==1.3.0
tokenizers==0.20.0
toolz==0.12.1
tornado==6.4.1
tqdm==4.66.5
tracerite==1.1.1
traitlets==5.14.3
twython==3.9.1
typer==0.12.5
types-python-dateutil==2.9.0.20240316
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.1
tzlocal==5.2
uc-micro-py==1.0.3
ujson==5.10.0
umap-learn==0.5.6
umongo==3.1.0
uncertainties==3.2.2
uri-template==1.3.0
urllib3==2.2.2
uvicorn==0.30.6
uvloop==0.20.0
watchfiles==0.24.0
wcwidth==0.2.13
webcolors==24.6.0
webencodings==0.5.1
websocket-client==1.8.0
websockets==12.0
widgetsnbextension==4.0.11
wrapt==1.16.0
yarl==1.9.4
zhipuai==2.1.4.20230812
zipp==3.20.0

dosubot (bot) added the “Ɑ: vector store” (Related to vector store module) label on Sep 18, 2024