[Bug]: CypherSyntaxError when storing Property Graph in Neo4j #16880

Open
gideon772 opened this issue Nov 8, 2024 · 7 comments
Labels
bug Something isn't working triage Issue needs to be triaged/prioritized

Comments

@gideon772

Bug Description

Hi,

When attempting to store a Property Graph in a Neo4j database, I encountered a CypherSyntaxError with the message: Invalid input '(': expected "{" (line 10, column 26 (offset: 402)).

Llama Index Versions

llama-index-core=0.11.14
llama-index-embeddings-huggingface=0.3.1 
llama-index-llms-groq=0.2.0

Still, the graph was generated successfully.

Generation Status

2024-11-08 19:00:09,974 - sentence_transformers.SentenceTransformer - INFO - Load pretrained SentenceTransformer: BAAI/bge-small-en
2024-11-08 19:00:12,065 - sentence_transformers.SentenceTransformer - INFO - 2 prompts are loaded, with the keys: ['query', 'text']
Parsing nodes: 100%|██████████| 12/12 [00:00<00:00, 627.15it/s]
Extracting paths from text:   0%|          | 0/22 [00:00<?, ?it/s]2024-11-08 19:00:16,008 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:   5%|▍         | 1/22 [00:03<01:22,  3.92s/it]2024-11-08 19:00:16,930 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:   9%|▉         | 2/22 [00:04<00:43,  2.16s/it]2024-11-08 19:00:18,275 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  14%|█▎        | 3/22 [00:06<00:33,  1.79s/it]2024-11-08 19:00:21,148 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  18%|█▊        | 4/22 [00:09<00:39,  2.22s/it]2024-11-08 19:00:28,231 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  23%|██▎       | 5/22 [00:16<01:07,  3.97s/it]2024-11-08 19:00:31,314 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  27%|██▋       | 6/22 [00:19<00:58,  3.67s/it]2024-11-08 19:00:31,428 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  32%|███▏      | 7/22 [00:19<00:37,  2.51s/it]2024-11-08 19:00:34,031 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  36%|███▋      | 8/22 [00:21<00:35,  2.54s/it]2024-11-08 19:00:36,386 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  41%|████      | 9/22 [00:24<00:32,  2.48s/it]2024-11-08 19:00:39,086 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  45%|████▌     | 10/22 [00:26<00:30,  2.55s/it]2024-11-08 19:00:39,318 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  50%|█████     | 11/22 [00:27<00:20,  1.84s/it]2024-11-08 19:00:41,096 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  55%|█████▍    | 12/22 [00:29<00:18,  1.82s/it]2024-11-08 19:00:41,463 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  59%|█████▉    | 13/22 [00:29<00:12,  1.38s/it]2024-11-08 19:00:43,611 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  64%|██████▎   | 14/22 [00:31<00:12,  1.61s/it]2024-11-08 19:00:43,825 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  68%|██████▊   | 15/22 [00:31<00:08,  1.19s/it]2024-11-08 19:00:46,692 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  73%|███████▎  | 16/22 [00:34<00:10,  1.70s/it]2024-11-08 19:00:46,707 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-11-08 19:00:47,240 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  82%|████████▏ | 18/22 [00:35<00:04,  1.04s/it]2024-11-08 19:00:54,408 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  86%|████████▋ | 19/22 [00:42<00:07,  2.56s/it]2024-11-08 19:00:55,097 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  91%|█████████ | 20/22 [00:43<00:04,  2.07s/it]2024-11-08 19:00:57,381 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text:  95%|█████████▌| 21/22 [00:45<00:02,  2.13s/it]2024-11-08 19:00:58,432 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Extracting paths from text: 100%|██████████| 22/22 [00:46<00:00,  2.11s/it]
Extracting implicit paths: 100%|██████████| 22/22 [00:00<00:00, 116655.74it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 38.36it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 220.53it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 294.56it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 173.23it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 332.78it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 180.70it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 178.85it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 316.50it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 182.04it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 181.12it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 294.63it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 270.13it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 183.17it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 206.90it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 128.12it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 186.37it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 179.32it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 178.03it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 179.48it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 208.82it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 198.95it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 206.97it/s]
Generating embeddings: 100%|██████████| 3/3 [00:00<00:00, 15.29it/s]

When I ran the generation without a graph store, it worked, and I saved the graph using graph_index.property_graph_store.save_networkx_graph(name=f"./KB_TEST_1.html")

Please help me solve this issue. Thanks in advance 💯

Version

llama-index-core=0.11.14

Steps to Reproduce

  • The documents are passed to the PropertyGraphIndex.from_documents() method as Document(text="""[{json data}]""") instances.
  • During the process of storing the generated Property Graph in a Neo4j database, the code encounters a CypherSyntaxError.
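
To make the first step concrete, here is a minimal sketch of how report dictionaries could be serialized into the JSON strings that end up as document text. The helper name to_document_texts is hypothetical, and the data is a two-compound excerpt of the sample report in this issue; only the standard library is used, and wrapping each string in a Document is left as a comment.

```python
import json

# Two-compound excerpt of the sample report; the full report has the same shape.
report = {
    "report": {"internal_code": "23G18-GRI01", "plant_material": "Cannabis sativa"},
    "compounds": [
        {"name": "Myrcene", "type": "Monoterpene", "percentage": 1.77},
        {"name": "α-Phellandrene", "type": "Monoterpene", "percentage": 0.02},
    ],
}


def to_document_texts(reports):
    """Serialize each report dict to an indented JSON string (hypothetical helper)."""
    # ensure_ascii=False keeps characters such as "α" readable in the text.
    return [json.dumps(r, indent=2, ensure_ascii=False) for r in reports]


texts = to_document_texts([report])
# Each string would then be wrapped as Document(text=...) before indexing.
print(texts[0].splitlines()[0])
```

Keeping the serialization in one helper makes it easy to confirm that the text handed to the index is still valid JSON before any extraction runs.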

Sample Document Structure

JSON : ['{\n  "report": {\n    "internal_code": "23G18-GRI01",\n    "date": "JUL2023",\n    "company": "Gripon420 Inc",\n    "sample": "R&D-1 Milled",\n    "plant_material": "Cannabis sativa"\n  },\n  "compounds": [\n    {\n      "name": "6-Methyl-5-hepten-2-one",\n      "type": "Aliphatic ketone",\n      "percentage": 0.01\n    },\n    {\n      "name": "Myrcene",\n      "type": "Monoterpene",\n      "percentage": 1.77\n    },\n    {\n      "name": "α-Phellandrene",\n      "type": "Monoterpene",\n      "percentage": 0.02\n    },\n    {\n      "name": "Δ3-Carene",\n      "type": "Monoterpene",\n      "percentage": 0.02\n    },\n    {\n      "name": "α-Terpinene",\n      "type": "Monoterpene",\n      "percentage": 0.02\n    },\n    {\n      "name": "Hexyl acetate",\n      "type": "Aliphatic ester",\n      "percentage": 0.01\n    },\n    {\n      "name": "para-Cymene",\n      "type": "Monoterpene",\n      "percentage": 0.01\n    },\n    {\n      "name": "β-Phellandrene",\n      "type": "Monoterpene",\n      "percentage": 0.04\n    },\n    {\n      "name": "Limonene",\n      "type": "Monoterpene",\n      "percentage": 4.34\n    },\n    {\n      "name": "1,8-Cineole",\n      "type": "Monoterpenic ether",\n      "percentage": 0.01\n    },\n    {\n      "name": "Isopropyl hexanoate",\n      "type": "Aliphatic ester",\n      "percentage": "tr"\n    },\n    {\n      "name": "Benzyl alcohol",\n      "type": "Simple phenolic",\n      "percentage": "tr"\n    },\n    {\n      "name": "(Z)-β-Ocimene",\n      "type": "Monoterpene",\n      "percentage": 0.03\n    },\n    {\n      "name": "(E)-β-Ocimene",\n      "type": "Monoterpene",\n      "percentage": 0.32\n    },\n    {\n      "name": "Unknown",\n      "type": "Monoterpene",\n      "percentage": 0.01\n    },\n    {\n      "name": "γ-Terpinene",\n      "type": "Monoterpene",\n      "percentage": 0.02\n    },\n    {\n      "name": "Unknown",\n      "type": "Oxygenated monoterpene",\n      "percentage": 0.01\n    },\n    {\n      
"name": "cis-Sabinene hydrate",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.03\n    },\n    {\n      "name": "cis-Linalool oxide (fur.)",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.01\n    },\n    {\n      "name": "Octanol",\n      "type": "Aliphatic alcohol",\n      "percentage": 0.01\n    },\n    {\n      "name": "Fenchone",\n      "type": "Monoterpenic ketone",\n      "percentage": 0.06\n    },\n    {\n      "name": "Terpinolene",\n      "type": "Monoterpene",\n      "percentage": 0.37\n    },\n    {\n      "name": "6,7-Epoxymyrcene",\n      "type": "Monoterpenic ether",\n      "percentage": "tr"\n    },\n    {\n      "name": "trans-Sabinene hydrate",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.02\n    },\n    {\n      "name": "Linalool",\n      "type": "Monoterpenic alcohol",\n      "percentage": 1.59\n    },\n    {\n      "name": "Unknown",\n      "type": "Unknown",\n      "percentage": 0.02\n    },\n    {\n      "name": "endo-Fenchol",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.72\n    },\n    {\n      "name": "trans-Pinene hydrate",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.51\n    },\n    {\n      "name": "cis-para-Mentha-2,8-dien-1-ol",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.01\n    },\n    {\n      "name": "cis-Pinene hydrate",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.08\n    },\n    {\n      "name": "Camphene hydrate",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.04\n    },\n    {\n      "name": "Ipsdienol",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.04\n    },\n    {\n      "name": "Hexyl isobutyrate",\n      "type": "Aliphatic ester",\n      "percentage": 0.01\n    },\n    {\n      "name": "(E)-2,6-Dimethyl-1,5,7-octatrien-3-ol",\n      "type": "Monoterpenic alcohol",\n      "percentage": "tr"\n    },\n    {\n      "name": "Borneol",\n      "type": "Monoterpenic alcohol",\n      
"percentage": 0.17\n    },\n    {\n      "name": "Terpinen-4-ol",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.05\n    },\n    {\n      "name": "para-Cymen-8-ol",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.02\n    },\n    {\n      "name": "Myrtenal",\n      "type": "Monoterpenic aldehyde",\n      "percentage": 0.01\n    },\n    {\n      "name": "α-Terpineol",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.69\n    },\n    {\n      "name": "Myrtenol",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.01\n    },\n    {\n      "name": "Hexyl butyrate",\n      "type": "Aliphatic ester",\n      "percentage": 0.01\n    },\n    {\n      "name": "Unknown",\n      "type": "Unknown",\n      "percentage": "tr"\n    },\n    {\n      "name": "trans-Piperitol",\n      "type": "Monoterpenic alcohol",\n      "percentage": "tr"\n    },\n    {\n      "name": "Octyl acetate",\n      "type": "Aliphatic ester",\n      "percentage": "tr"\n    },\n    {\n      "name": "trans-Carveol",\n      "type": "Monoterpenic alcohol",\n      "percentage": 0.01\n    }\n  ]\n}']

Config

  • Database: Neo4j Cloud and Local
  • LLM: OpenAI gpt-4o and Groq llama-3.2-90b-vision-preview
  • Embedding Model: HuggingFaceEmbedding

Graph Store Initialization

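import logging
import os

from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore

logger = logging.getLogger(__name__)

try:
    logger.info("Connecting to Neo4j graph store")
    # Credentials and the connection URI are read from environment variables.
    graph_store = Neo4jPropertyGraphStore(
        username=os.getenv("NEO4J_USERNAME"),
        password=os.getenv("NEO4J_PASSWORD"),
        url=os.getenv("NEO4J_URI"),
    )
except Exception as e:
    logger.error(f"Failed to connect to Neo4j graph store: {e}")
    raise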

Graph Generation Code

from llama_index.core.indices.property_graph import PropertyGraphIndex
from llama_index.core.indices.property_graph.transformations import SimpleLLMPathExtractor, ImplicitPathExtractor
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# `documents` and `llm` are defined earlier (not shown).
kg_extractor = [
    SimpleLLMPathExtractor(max_paths_per_chunk=100),
    ImplicitPathExtractor(),
]

graph_index = PropertyGraphIndex.from_documents(
    documents,
    show_progress=True,
    embed_model=HuggingFaceEmbedding(),
    kg_extractors=kg_extractor,
    llm=llm,
    property_graph_store=graph_store,
)

Relevant Logs/Tracebacks

{
	"name": "CypherSyntaxError",
	"message": "{code: Neo.ClientError.Statement.SyntaxError} {message: Invalid input '(': expected \"{\" (line 10, column 26 (offset: 402))
\"                    CALL (e, row) {\"
                          ^}",
	"stack": "---------------------------------------------------------------------------
CypherSyntaxError                         Traceback (most recent call last)
Cell In[14], line 10
      2 from llama_index.core.indices.property_graph.transformations import SimpleLLMPathExtractor, ImplicitPathExtractor
      5 kg_extractor = [
      6     SimpleLLMPathExtractor(max_paths_per_chunk=100),
      7     ImplicitPathExtractor()
      8 ]
---> 10 graph_index = PropertyGraphIndex.from_documents(
     11     documents, 
     12     show_progress=True,
     13     embed_model=HuggingFaceEmbedding(),
     14     kg_extractors=kg_extractor,
     15     llm=llm,
     16     property_graph_store=graph_store)
     18 # graph_index.property_graph_store.save_networkx_graph(name=f\"./KB_TEST_1.html\")

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/llama_index/core/indices/base.py:119, in BaseIndex.from_documents(cls, documents, storage_context, show_progress, callback_manager, transformations, **kwargs)
    110     docstore.set_document_hash(doc.get_doc_id(), doc.hash)
    112 nodes = run_transformations(
    113     documents,  # type: ignore
    114     transformations,
    115     show_progress=show_progress,
    116     **kwargs,
    117 )
--> 119 return cls(
    120     nodes=nodes,
    121     storage_context=storage_context,
    122     callback_manager=callback_manager,
    123     show_progress=show_progress,
    124     transformations=transformations,
    125     **kwargs,
    126 )

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/llama_index/core/indices/property_graph/base.py:134, in PropertyGraphIndex.__init__(self, nodes, llm, kg_extractors, property_graph_store, vector_store, use_async, embed_model, embed_kg_nodes, callback_manager, transformations, storage_context, show_progress, **kwargs)
    128 self._embed_kg_nodes = embed_kg_nodes
    129 self._override_vector_store = (
    130     vector_store is not None
    131     or not storage_context.property_graph_store.supports_vector_queries
    132 )
--> 134 super().__init__(
    135     nodes=nodes,
    136     callback_manager=callback_manager,
    137     storage_context=storage_context,
    138     transformations=transformations,
    139     show_progress=show_progress,
    140     **kwargs,
    141 )

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/llama_index/core/indices/base.py:77, in BaseIndex.__init__(self, nodes, objects, index_struct, storage_context, callback_manager, transformations, show_progress, **kwargs)
     75 if index_struct is None:
     76     nodes = nodes or []
---> 77     index_struct = self.build_index_from_nodes(
     78         nodes + objects,  # type: ignore
     79         **kwargs,  # type: ignore
     80     )
     81 self._index_struct = index_struct
     82 self._storage_context.index_store.add_index_struct(self._index_struct)

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/llama_index/core/indices/base.py:185, in BaseIndex.build_index_from_nodes(self, nodes, **build_kwargs)
    183 \"\"\"Build the index from nodes.\"\"\"
    184 self._docstore.add_documents(nodes, allow_update=True)
--> 185 return self._build_index_from_nodes(nodes, **build_kwargs)

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/llama_index/core/indices/property_graph/base.py:334, in PropertyGraphIndex._build_index_from_nodes(self, nodes, **build_kwargs)
    330 def _build_index_from_nodes(
    331     self, nodes: Optional[Sequence[BaseNode]], **build_kwargs: Any
    332 ) -> IndexLPG:
    333     \"\"\"Build index from nodes.\"\"\"
--> 334     nodes = self._insert_nodes(nodes or [])
    336     # this isn't really used or needed
    337     return IndexLPG()

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/llama_index/core/indices/property_graph/base.py:296, in PropertyGraphIndex._insert_nodes(self, nodes)
    293     self.property_graph_store.upsert_llama_nodes(nodes)
    295 if len(kg_nodes_to_insert) > 0:
--> 296     self.property_graph_store.upsert_nodes(kg_nodes_to_insert)
    298 # important: upsert relations after nodes
    299 if len(kg_rels_to_insert) > 0:

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/llama_index/graph_stores/neo4j/neo4j_property_graph.py:334, in Neo4jPropertyGraphStore.upsert_nodes(self, nodes)
    332 for index in range(0, len(entity_dicts), CHUNK_SIZE):
    333     chunked_params = entity_dicts[index : index + CHUNK_SIZE]
--> 334     self.structured_query(
    335         f\"\"\"
    336         UNWIND $data AS row
    337         MERGE (e:{BASE_NODE_LABEL} {{id: row.id}})
    338         SET e += apoc.map.clean(row.properties, [], [])
    339         SET e.name = row.name, e:`{BASE_ENTITY_LABEL}`
    340         WITH e, row
    341         CALL apoc.create.addLabels(e, [row.label])
    342         YIELD node
    343         WITH e, row
    344         CALL (e, row) {{
    345             WITH e, row
    346             WHERE row.embedding IS NOT NULL
    347             CALL db.create.setNodeVectorProperty(e, 'embedding', row.embedding)
    348             RETURN count(*) AS count
    349         }}
    350         WITH e, row WHERE row.properties.triplet_source_id IS NOT NULL
    351         MERGE (c:{BASE_NODE_LABEL} {{id: row.properties.triplet_source_id}})
    352         MERGE (e)<-[:MENTIONS]-(c)
    353         \"\"\",
    354         param_map={\"data\": chunked_params},
    355     )

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/llama_index/graph_stores/neo4j/neo4j_property_graph.py:587, in Neo4jPropertyGraphStore.structured_query(self, query, param_map)
    585 param_map = param_map or {}
    586 try:
--> 587     data, _, _ = self._driver.execute_query(
    588         query, database_=self._database, parameters_=param_map
    589     )
    590     full_result = [d.data() for d in data]
    592     if self.sanitize_query_output:

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/neo4j/_sync/driver.py:971, in Driver.execute_query(self, query_, parameters_, routing_, database_, impersonated_user_, bookmark_manager_, auth_, result_transformer_, **kwargs)
    967     raise ValueError(
    968         f\"Invalid routing control value: {routing_!r}\"
    969     )
    970 with session._pipelined_begin:
--> 971     return session._run_transaction(
    972         access_mode,
    973         TelemetryAPI.DRIVER,
    974         work,
    975         (query_str, parameters, result_transformer_),
    976         {},
    977     )

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/neo4j/_sync/work/session.py:574, in Session._run_transaction(self, access_mode, api, transaction_function, args, kwargs)
    572 tx = self._transaction
    573 try:
--> 574     result = transaction_function(tx, *args, **kwargs)
    575 except asyncio.CancelledError:
    576     # if cancellation callback has not been called yet:
    577     if self._transaction is not None:

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/neo4j/_sync/driver.py:1307, in _work(tx, query, parameters, transformer)
   1301 def _work(
   1302     tx: ManagedTransaction,
   1303     query: te.LiteralString,
   1304     parameters: dict[str, t.Any],
   1305     transformer: t.Callable[[Result], t.Union[_T]],
   1306 ) -> _T:
-> 1307     res = tx.run(query, parameters)
   1308     return transformer(res)

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/neo4j/_sync/work/transaction.py:195, in TransactionBase.run(self, query, parameters, **kwparameters)
    192 self._results.append(result)
    194 parameters = dict(parameters or {}, **kwparameters)
--> 195 result._tx_ready_run(query, parameters)
    197 return result

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/neo4j/_sync/work/result.py:175, in Result._tx_ready_run(self, query, parameters)
    171 def _tx_ready_run(self, query, parameters):
    172     # BEGIN+RUN does not carry any extra on the RUN message.
    173     # BEGIN {extra}
    174     # RUN \"query\" {parameters} {extra}
--> 175     self._run(query, parameters, None, None, None, None, None, None)

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/neo4j/_sync/work/result.py:231, in Result._run(self, query, parameters, db, imp_user, access_mode, bookmarks, notifications_min_severity, notifications_disabled_classifications)
    229 self._pull()
    230 self._connection.send_all()
--> 231 self._attach()

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/neo4j/_sync/work/result.py:425, in Result._attach(self)
    423 if self._exhausted is False:
    424     while self._attached is False:
--> 425         self._connection.fetch_message()

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/neo4j/_sync/io/_common.py:181, in ConnectionErrorHandler.__getattr__.<locals>.outer.<locals>.inner(*args, **kwargs)
    179 def inner(*args, **kwargs):
    180     try:
--> 181         func(*args, **kwargs)
    182     except (Neo4jError, ServiceUnavailable, SessionExpired) as exc:
    183         assert not asyncio.iscoroutinefunction(self.__on_error)

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/neo4j/_sync/io/_bolt.py:977, in Bolt.fetch_message(self)
    973 # Receive exactly one message
    974 tag, fields = self.inbox.pop(
    975     hydration_hooks=self.responses[0].hydration_hooks
    976 )
--> 977 res = self._process_message(tag, fields)
    978 self.idle_since = monotonic()
    979 return res

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/neo4j/_sync/io/_bolt5.py:466, in Bolt5x0._process_message(self, tag, fields)
    464 self._server_state_manager.state = self.bolt_states.FAILED
    465 try:
--> 466     response.on_failure(summary_metadata or {})
    467 except (ServiceUnavailable, DatabaseUnavailable):
    468     if self.pool:

File ~/.virtualenvs/ameya-service-database/lib/python3.11/site-packages/neo4j/_sync/io/_common.py:251, in Response.on_failure(self, metadata)
    249 handler = self.handlers.get(\"on_summary\")
    250 Util.callback(handler)
--> 251 raise Neo4jError.hydrate(**metadata)

CypherSyntaxError: {code: Neo.ClientError.Statement.SyntaxError} {message: Invalid input '(': expected \"{\" (line 10, column 26 (offset: 402))
\"                    CALL (e, row) {\"
                          ^}"
}
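
The error message names a position (line 10, column 26) inside the query text rather than inside the Python file. As a rough debugging aid, the sketch below maps that position back to the offending clause. The function name locate_cypher_error is hypothetical, and the labels Node and Entity are placeholders standing in for BASE_NODE_LABEL and BASE_ENTITY_LABEL, whose values are not shown in the traceback.

```python
import re


def locate_cypher_error(message, query):
    """Return (line_no, column, line_text) named by a Neo4j syntax error, or None."""
    match = re.search(r"line (\d+), column (\d+)", message)
    if match is None:
        return None
    line_no, column = int(match.group(1)), int(match.group(2))
    lines = query.split("\n")
    if not 1 <= line_no <= len(lines):
        return None
    return line_no, column, lines[line_no - 1]


# Query text up to the failing clause, indented as in the traceback.
# The f-string starts with a newline, so the first query line is empty.
INDENT = " " * 20
query = "\n" + "\n".join(INDENT + clause for clause in [
    "UNWIND $data AS row",
    "MERGE (e:Node {id: row.id})",
    "SET e += apoc.map.clean(row.properties, [], [])",
    "SET e.name = row.name, e:`Entity`",
    "WITH e, row",
    "CALL apoc.create.addLabels(e, [row.label])",
    "YIELD node",
    "WITH e, row",
    "CALL (e, row) {",
])
message = "Invalid input '(': expected \"{\" (line 10, column 26 (offset: 402))"

line_no, column, text = locate_cypher_error(message, query)
print(line_no, column, repr(text[column - 1]))  # 10 26 '('
```

The position lands on the opening parenthesis of CALL (e, row) {, which points at the subquery header rather than at the data being inserted.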
gideon772 added the bug and triage labels on Nov 8, 2024

dosubot bot commented Nov 8, 2024

Hey @gideon772! I'm here to help you with any bugs, questions, or contributions you have. Let me know how I can assist you!

A similar issue was reported in the LlamaIndex repository, where a CypherSyntaxError occurred due to outdated Cypher syntax when using the latest Neo4j Docker container. The solution involved updating the Cypher query syntax to be compatible with the latest Neo4j version [1].

In your case, the error message indicates a syntax issue with the CALL clause in your Cypher query. The error suggests that the syntax used is not supported by the version of Neo4j you are using. To resolve this, you should ensure that your Neo4j version is 5.11.0 or greater, as this version supports the required features for vector indexing [2].

Here's a general approach to address the issue:

  1. Update Neo4j Version: Ensure that your Neo4j database is running version 5.11.0 or later. This version supports the necessary syntax and features.

  2. Modify Cypher Query: Review the Cypher query in your Neo4jPropertyGraphStore class, specifically the part where the CALL clause is used. Ensure that it follows the correct syntax for your Neo4j version. You might need to replace or adjust the CALL clause to match the supported syntax.

  3. Check for APOC Procedures: If you are using APOC procedures, ensure that they are correctly installed and configured in your Neo4j instance. This can be done by enabling the APOC plugin in your Neo4j configuration.

By following these steps, you should be able to resolve the CypherSyntaxError and successfully store your Property Graph in the Neo4j database.

To continue talking to Dosu, mention @dosu.

@gideon772
Author

@dosu I'm using Neo4j 5.20.0, so what is the issue? Do I need to update the LlamaIndex version?


dosubot bot commented Nov 8, 2024

Updating the LlamaIndex version might help resolve the issue, especially if there have been recent updates or bug fixes related to Neo4j integration. For instance, there have been updates in versions after 0.11.14 that address Neo4j integration issues, such as removing warnings from the dbms server in Neo4j queries and making Neo4j schema refresh configurable [1].

Consider updating to a newer version of LlamaIndex to see if it resolves the CypherSyntaxError. Additionally, ensure that the Cypher query syntax used in your application is compatible with Neo4j 5.20.0, as syntax changes between versions can sometimes cause issues.
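
One way to check syntax compatibility is to compare the server version against the release that introduced the construct in question. The sketch below assumes that the CALL (e, row) { ... } header (the variable scope clause) is accepted from Neo4j 5.23 onward; treat that threshold as an assumption to verify against the Cypher manual. The function names are hypothetical. The version string itself can be read from the server with CALL dbms.components() YIELD versions.

```python
import re

# Assumption: the variable scope clause `CALL (x) { ... }` is accepted from Neo4j 5.23.
SCOPE_CLAUSE_MIN_VERSION = (5, 23)


def parse_major_minor(version):
    """Extract (major, minor) from strings such as '5.20.0' or '5.26-aura'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_scope_clause(version):
    """Return True if the server version is at or above the assumed threshold."""
    return parse_major_minor(version) >= SCOPE_CLAUSE_MIN_VERSION


print(supports_scope_clause("5.20.0"))  # False
```

Under that assumption, a 5.20.0 server would reject the query in the traceback even though it satisfies the 5.11.0 requirement for vector indexes.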


@gideon772
Author

@dosu I think this query is the issue. Can you check and change it?

Query

self.structured_query(
                    f"""
                    UNWIND $data AS row
                    MERGE (e:{BASE_NODE_LABEL} {{id: row.id}})
                    SET e += apoc.map.clean(row.properties, [], [])
                    SET e.name = row.name, e:`{BASE_ENTITY_LABEL}`
                    WITH e, row
                    CALL apoc.create.addLabels(e, [row.label])
                    YIELD node
                    WITH e, row
                    CALL (e, row) {{
                        WITH e, row
                        WHERE row.embedding IS NOT NULL
                        CALL db.create.setNodeVectorProperty(e, 'embedding', row.embedding)
                        RETURN count(*) AS count
                    }}
                    WITH e, row WHERE row.properties.triplet_source_id IS NOT NULL
                    MERGE (c:{BASE_NODE_LABEL} {{id: row.properties.triplet_source_id}})
                    MERGE (e)<-[:MENTIONS]-(c)
                    """,
                    param_map={"data": chunked_params},
                )


dosubot bot commented Nov 8, 2024

The issue in your query is with the CALL clause syntax. The error message indicates that the syntax used for the CALL clause is incorrect. In Neo4j, the CALL clause should not be used with parentheses in this context. Here's a revised version of your query:

self.structured_query(
    f"""
    UNWIND $data AS row
    MERGE (e:{BASE_NODE_LABEL} {{id: row.id}})
    SET e += apoc.map.clean(row.properties, [], [])
    SET e.name = row.name, e:`{BASE_ENTITY_LABEL}`
    WITH e, row
    CALL apoc.create.addLabels(e, [row.label])
    YIELD node
    WITH e, row
    WHERE row.embedding IS NOT NULL
    CALL db.create.setNodeVectorProperty(e, 'embedding', row.embedding)
    RETURN count(*) AS count
    WITH e, row WHERE row.properties.triplet_source_id IS NOT NULL
    MERGE (c:{BASE_NODE_LABEL} {{id: row.properties.triplet_source_id}})
    MERGE (e)<-[:MENTIONS]-(c)
    """,
    param_map={"data": chunked_params},
)

In this corrected version, the CALL clause is used directly without parentheses, and the WHERE clause is placed before the CALL to filter the rows appropriately. This should resolve the CypherSyntaxError you are encountering.


@gideon772
Author

@dosu when I use this query, I get this error:

CypherSyntaxError: {code: Neo.ClientError.Statement.SyntaxError} {message: Variable `e` not defined (line 13, column 26 (offset: 587))
"WITH e, row WHERE row.properties.triplet_source_id IS NOT NULL"
^}


dosubot bot commented Nov 8, 2024

The error indicates that the variable e is no longer defined when the final WITH clause runs. In the flattened query, RETURN count(*) AS count ends the projection, so only count remains in scope afterwards, and adding another WITH before the RETURN does not change that. The subquery wrapper should stay, because it lets rows without an embedding skip the vector call while still reaching the MERGE clauses. Assuming the original failure comes from the server not recognizing the CALL (e, row) { ... } header (as far as I know, that form was introduced in Neo4j 5.23, and you are on 5.20.0), the older importing-WITH form of the subquery is worth trying:

self.structured_query(
    f"""
    UNWIND $data AS row
    MERGE (e:{BASE_NODE_LABEL} {{id: row.id}})
    SET e += apoc.map.clean(row.properties, [], [])
    SET e.name = row.name, e:`{BASE_ENTITY_LABEL}`
    WITH e, row
    CALL apoc.create.addLabels(e, [row.label])
    YIELD node
    WITH e, row
    CALL {{
        WITH e, row  // importing WITH: must come first and cannot take a WHERE
        WITH e, row
        WHERE row.embedding IS NOT NULL
        CALL db.create.setNodeVectorProperty(e, 'embedding', row.embedding)
        RETURN count(*) AS count
    }}
    WITH e, row WHERE row.properties.triplet_source_id IS NOT NULL
    MERGE (c:{BASE_NODE_LABEL} {{id: row.properties.triplet_source_id}})
    MERGE (e)<-[:MENTIONS]-(c)
    """,
    param_map={"data": chunked_params},
)

Here the variables are imported by a WITH inside the braces rather than in the CALL header, and a second WITH carries the filter. Because the subquery returns an aggregate, every outer row continues to the MERGE clauses with e and row still in scope. Alternatively, upgrading the Neo4j server to a version that accepts the original syntax should avoid editing the installed package.
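
If the query needs to run against servers of different ages, the subquery header could be chosen at runtime. The following is only a sketch under stated assumptions: build_embedding_subquery is a hypothetical helper, not part of llama-index, and the flag is expected to come from a server version check. It relies on the rule that, in the older form, the importing WITH must be the first clause inside the braces and cannot carry a WHERE.

```python
def build_embedding_subquery(use_scope_clause):
    """Return the embedding-setting subquery as a Cypher string (hypothetical helper)."""
    if use_scope_clause:
        # Newer form: variables are imported in the CALL header.
        opening = ["CALL (e, row) {", "    WITH e, row"]
    else:
        # Older form: an importing WITH comes first and cannot take a WHERE,
        # so a second WITH holds the filter.
        opening = ["CALL {", "    WITH e, row", "    WITH e, row"]
    body = [
        "    WHERE row.embedding IS NOT NULL",
        "    CALL db.create.setNodeVectorProperty(e, 'embedding', row.embedding)",
        "    RETURN count(*) AS count",
        "}",
    ]
    return "\n".join(opening + body)


legacy = build_embedding_subquery(use_scope_clause=False)
modern = build_embedding_subquery(use_scope_clause=True)
print(legacy.splitlines()[0])
print(modern.splitlines()[0])
```

Both variants keep the RETURN inside the braces, so the outer query still sees e and row after the subquery finishes.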
