Feature/code parser #11

ryanpeach · 2023-11-14T18:15:50Z

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of Change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

Added new unit/integration tests
Added new notebook (that tests end-to-end)
I stared at the code and made sure it makes sense

Suggested Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added Google Colab support for the newly added notebooks.
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I ran make format; make lint to appease the lint gods

…code-parser

…sive_scopes"] and add the current scope to the list. # Aider chat conversation: USER: Convert all instances of metadata["scopes"] to metadata["inclusive_scopes"] and add the current scope to the list. For example, if the block starts with "class Foo:" add {"name": "Foo", "type": "class"} to the list, even though it's not yet in scope, because it is defined at the start of the chunk. ASSISTANT: To implement this change, we need to modify the file `tests/node_parser/test_code_parser.py`. We will replace all instances of `metadata["scopes"]` with `metadata["inclusive_scopes"]` and add the current scope to the list. Here are the changes: ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[0].metadata["scopes"] == [] assert chunks[1].metadata["scopes"] == [{'name': "Foo", 'type': "class"}] assert chunks[2].metadata["scopes"] == [{'name': "Foo", 'type': "class"}] assert chunks[0].metadata["scopes"] == [] assert chunks[1].metadata["scopes"] == [] assert chunks[0].metadata["scopes"] == [] assert chunks[1].metadata["scopes"] == [] assert chunks[0].metadata["scopes"] == [] assert chunks[1].metadata["scopes"] == [] assert chunks[2].metadata["scopes"] == [{'name': "Example", 'type': "class"}] assert chunks[3].metadata["scopes"] == [] assert chunks[0].metadata["scopes"] == [] assert chunks[1].metadata["scopes"] == [] assert chunks[2].metadata["scopes"] == [] assert chunks[3].metadata["scopes"] == [{'name': "ExampleComponent", 'type': "function"}] assert chunks[0].metadata["scopes"] == [] assert chunks[1].metadata["scopes"] == [{'name': "main", 'type': "function"}] ======= assert chunks[0].metadata["inclusive_scopes"] == [{'name': "Foo", 'type': "class"}] assert chunks[1].metadata["inclusive_scopes"] == [{'name': "Foo", 'type': "class"}, {'name': "foo", 'type': "function"}] assert chunks[2].metadata["inclusive_scopes"] == [{'name': "Foo", 'type': "class"}, {'name': "baz", 'type': "function"}] assert chunks[0].metadata["inclusive_scopes"] == [] assert chunks[1].metadata["inclusive_scopes"] == [] assert chunks[0].metadata["inclusive_scopes"] == [{'name': "foo", 'type': "function"}] assert chunks[1].metadata["inclusive_scopes"] == [{'name': "baz", 'type': "function"}] assert chunks[0].metadata["inclusive_scopes"] == [{'name': "foo", 'type': "function"}] assert chunks[1].metadata["inclusive_scopes"] == [{'name': "Example", 'type': "class"}] assert chunks[2].metadata["inclusive_scopes"] == [{'name': "Example", 'type': "class"}, {'name': "exampleMethod", 'type': "function"}] assert chunks[3].metadata["inclusive_scopes"] == [{'name': "baz", 'type': "function"}] assert chunks[0].metadata["inclusive_scopes"] == [] assert chunks[1].metadata["inclusive_scopes"] == [{'name': "Person", 'type': "interface"}] assert chunks[2].metadata["inclusive_scopes"] == [{'name': "ExampleComponent", 'type': "function"}] assert chunks[3].metadata["inclusive_scopes"] == [{'name': "ExampleComponent", 'type': "function"}, {'name': "div", 'type': "tag"}] assert chunks[0].metadata["inclusive_scopes"] == [] assert chunks[1].metadata["inclusive_scopes"] == [{'name': "main", 'type': "function"}] >>>>>>> updated ``` Please note that the type of the scope (class, function, interface, tag) is assumed based on the context of the code. If the type is different, please provide the correct type. USER: InvalidEditBlock: edit failed! tests/node_parser/test_code_parser.py does not contain the *exact sequence* of HEAD lines you specified. Try again. DO NOT skip blank lines, comments, docstrings, etc! The HEAD block needs to be EXACTLY the same as the lines in tests/node_parser/test_code_parser.py with nothing missing! tests/node_parser/test_code_parser.py does not contain these 17 exact lines in a row: ``` assert chunks[0].metadata["scopes"] == [] assert chunks[1].metadata["scopes"] == [{'name': "Foo", 'type': "class"}] assert chunks[2].metadata["scopes"] == [{'name': "Foo", 'type': "class"}] assert chunks[0].metadata["scopes"] == [] assert chunks[1].metadata["scopes"] == [] assert chunks[0].metadata["scopes"] == [] assert chunks[1].metadata["scopes"] == [] assert chunks[0].metadata["scopes"] == [] assert chunks[1].metadata["scopes"] == [] assert chunks[2].metadata["scopes"] == [{'name': "Example", 'type': "class"}] assert chunks[3].metadata["scopes"] == [] assert chunks[0].metadata["scopes"] == [] assert chunks[1].metadata["scopes"] == [] assert chunks[2].metadata["scopes"] == [] assert chunks[3].metadata["scopes"] == [{'name': "ExampleComponent", 'type': "function"}] assert chunks[0].metadata["scopes"] == [] assert chunks[1].metadata["scopes"] == [{'name': "main", 'type': "function"}] ``` ASSISTANT: I apologize for the confusion. I misunderstood the requirement. I will now provide the changes in smaller chunks to ensure accuracy. Here are the changes: ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[0].metadata["scopes"] == [] ======= assert chunks[0].metadata["inclusive_scopes"] == [{'name': "Foo", 'type': "class"}] >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[1].metadata["scopes"] == [{'name': "Foo", 'type': "class"}] ======= assert chunks[1].metadata["inclusive_scopes"] == [{'name': "Foo", 'type': "class"}, {'name': "foo", 'type': "function"}] >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[2].metadata["scopes"] == [{'name': "Foo", 'type': "class"}] ======= assert chunks[2].metadata["inclusive_scopes"] == [{'name': "Foo", 'type': "class"}, {'name': "baz", 'type': "function"}] >>>>>>> updated ``` Please apply these changes and let me know if there are any issues. I will continue to provide the remaining changes in the same manner.

…inclusive_scopes"]` in `tests/node_parser/test_code_parser.py`. # Aider chat conversation: USER: Continue with other scopes fields ASSISTANT: To replace all instances of `metadata["scopes"]` with `metadata["inclusive_scopes"]` in the file `tests/node_parser/test_code_parser.py`, we need to make changes in multiple places. Here are the changes: ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert "scopes" in chunks[0].metadata ======= assert "inclusive_scopes" in chunks[0].metadata >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert "scopes" in chunks[1].metadata ======= assert "inclusive_scopes" in chunks[1].metadata >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert "scopes" in chunks[2].metadata ======= assert "inclusive_scopes" in chunks[2].metadata >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert "scopes" in chunks[0].metadata ======= assert "inclusive_scopes" in chunks[0].metadata >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[1].metadata["scopes"] == [] ======= assert chunks[1].metadata["inclusive_scopes"] == [] >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[2].metadata["scopes"] == ["html"] ======= assert chunks[2].metadata["inclusive_scopes"] == ["html"] >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[3].metadata["scopes"] == ["html"] ======= assert chunks[3].metadata["inclusive_scopes"] == ["html"] >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[4].metadata["scopes"] == ["html", "body"] ======= assert chunks[4].metadata["inclusive_scopes"] == ["html", "body"] >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert "scopes" in chunks[0].metadata ======= assert "inclusive_scopes" in chunks[0].metadata >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[1].metadata["scopes"] == [] ======= assert chunks[1].metadata["inclusive_scopes"] == [] >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert "scopes" in chunks[0].metadata ======= assert "inclusive_scopes" in chunks[0].metadata >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[1].metadata["scopes"] == [] ======= assert chunks[1].metadata["inclusive_scopes"] == [] >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[2].metadata["scopes"] == [{'name': "Example", 'type': "class"}] ======= assert chunks[2].metadata["inclusive_scopes"] == [{'name': "Example", 'type': "class"}] >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[3].metadata["scopes"] == [] ======= assert chunks[3].metadata["inclusive_scopes"] == [] >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert "scopes" in chunks[0].metadata ======= assert "inclusive_scopes" in chunks[0].metadata >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[1].metadata["scopes"] == [] ======= assert chunks[1].metadata["inclusive_scopes"] == [] >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[2].metadata["scopes"] == [] ======= assert chunks[2].metadata["inclusive_scopes"] == [] >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[3].metadata["scopes"] == [{'name': "ExampleComponent", 'type': "function"}] ======= assert chunks[3].metadata["inclusive_scopes"] == [{'name': "ExampleComponent", 'type': "function"}] >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert "scopes" in chunks[0].metadata ======= assert "inclusive_scopes" in chunks[0].metadata >>>>>>> updated ``` ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD assert chunks[1].metadata["scopes"] == [{'name': "main", 'type': "function"}] ======= assert chunks[1].metadata["inclusive_scopes"] == [{'name': "main", 'type': "function"}] >>>>>>> updated ``` Please let me know if you need further assistance.

… scope. This will be better for CodeSplitter

…tructure and checks in `test_python_code_splitter`. # Aider chat conversation: USER: I changed around test_python_code_splitter a lot to make its tests pass. Can you examine its assertions and make the other tests have similar assertions? ASSISTANT: Sure, I can help with that. To make the other tests have similar assertions to `test_python_code_splitter`, we need to modify the following files: 1. `tests/node_parser/test_code_parser.py` Let's start with the `test_html_code_splitter` function. We need to update the assertions to match the structure and checks in `test_python_code_splitter`. The changes will include updating the `assert` statements to check for the correct `metadata` and `relationships` in each chunk. Here's the first *edit block* for `test_html_code_splitter`: ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD # Test the first chunk (DOCTYPE) assert chunks[0].text.startswith("<!DOCTYPE html>") assert chunks[0].metadata["inclusive_scopes"] == [] assert chunks[0].relationships[NodeRelationship.PARENT] is None assert chunks[0].relationships[NodeRelationship.PREVIOUS] is None assert chunks[0].relationships[NodeRelationship.NEXT] == chunks[1].id_ assert chunks[0].relationships[NodeRelationship.CHILD] == [] ======= # This is the DOCTYPE scope assert chunks[0].text.startswith("<!DOCTYPE html>") assert chunks[0].metadata["module"] == "example.html" assert chunks[0].metadata["inclusive_scopes"] == [] assert NodeRelationship.PARENT not in chunks[0].relationships assert [c.node_id for c in chunks[0].relationships[NodeRelationship.CHILD]] == [chunks[1].id_] assert isinstance(chunks[0].relationships[NodeRelationship.SOURCE], RelatedNodeInfo) assert chunks[0].relationships[NodeRelationship.SOURCE].node_id == text_node.id_ assert NodeRelationship.PREVIOUS not in chunks[0].relationships assert NodeRelationship.NEXT not in chunks[0].relationships >>>>>>> updated ``` I'll continue with the rest of the functions in the next messages.

… the structure and checks in `test_python_code_splitter`. # Aider chat conversation: USER: continue ASSISTANT: Next, let's update the assertions in the `test_typescript_code_splitter` function to match the structure and checks in `test_python_code_splitter`. The changes will include updating the `assert` statements to check for the correct `metadata` and `relationships` in each chunk. Here's the *edit block* for `test_typescript_code_splitter`: ```python tests/node_parser/test_code_parser.py <<<<<<< HEAD # Test the first chunk (function foo) assert chunks[0].text.startswith("function foo()") assert chunks[0].metadata["inclusive_scopes"] == [{'name': "foo", 'type': "function"}] assert chunks[0].relationships[NodeRelationship.PARENT] is None assert chunks[0].relationships[NodeRelationship.PREVIOUS] is None assert chunks[0].relationships[NodeRelationship.NEXT] == chunks[1].id_ assert chunks[0].relationships[NodeRelationship.CHILD] == [] # Test the second chunk (function baz) assert chunks[1].text.startswith("function baz()") assert chunks[1].metadata["inclusive_scopes"] == [{'name': "baz", 'type': "function"}] assert chunks[1].relationships[NodeRelationship.PARENT] is None assert chunks[1].relationships[NodeRelationship.PREVIOUS] == chunks[0].id_ assert chunks[1].relationships[NodeRelationship.NEXT] == None assert chunks[1].relationships[NodeRelationship.CHILD] == [] ======= # This is the first function scope assert chunks[0].text.startswith("function foo()") assert chunks[0].metadata["module"] == "example.ts" assert chunks[0].metadata["inclusive_scopes"] == [{'name': "foo", 'type': "function_definition"}] assert NodeRelationship.PARENT not in chunks[0].relationships assert chunks[0].relationships[NodeRelationship.CHILD] == [] assert isinstance(chunks[0].relationships[NodeRelationship.SOURCE], RelatedNodeInfo) assert chunks[0].relationships[NodeRelationship.SOURCE].node_id == text_node.id_ assert NodeRelationship.PREVIOUS not in chunks[0].relationships assert NodeRelationship.NEXT not in chunks[0].relationships # This is the second function scope assert chunks[1].text.startswith("function baz()") assert chunks[1].metadata["module"] == "example.ts" assert chunks[1].metadata["inclusive_scopes"] == [{'name': "baz", 'type': "function_definition"}] assert NodeRelationship.PARENT not in chunks[1].relationships assert chunks[1].relationships[NodeRelationship.CHILD] == [] assert isinstance(chunks[1].relationships[NodeRelationship.SOURCE], RelatedNodeInfo) assert chunks[1].relationships[NodeRelationship.SOURCE].node_id == text_node.id_ assert NodeRelationship.PREVIOUS not in chunks[1].relationships assert NodeRelationship.NEXT not in chunks[1].relationships >>>>>>> updated ``` I'll continue with the rest of the functions in the next messages.

… like decorators while maintaining context, saving work.

…one place for all this code cluttering the repo

… code

…ee_sitter

* Grammatical_error in docs/api_reference/query.rst * Grammatical error in docs/end_to_end_tutorials/dev_practices/e2e_evaluation.md * Changed word from unstructed to unstructured * Corrected the spelling of performing * Update evaluating.md Corrected the spelling of "guidelines" * Update evaluation.md Changed " things are working will together" to " things are working well together" * Update usage_pattern.md (#2) * Update usage_pattern.md Corrected the spellings of measure. * Update basic_strategies.md * Update (#8) * Introduce Google Generative Language Semantic Retriever (run-llama#9440) * Add a Gemini models for text and multi-modal. (run-llama#9452) * Add a Gemini models for text and multi-modal. Includes each of {text, mm} x {completion, chat} x {stream, one-shot} x {sync, async} for Gemini's Developer API. And two notebooks - regular text & multi-modal. * add gemini into doc and rename MM example file * Add version constraints to pip install And remove noisy output. * Enrich Gemini Example * cr * Use gemini-pro instead of ultra * cr * cr * cr --------- Co-authored-by: Haotian Zhang <[email protected]> * [version] bump to v0.9.15 (run-llama#9485) * use persist dir in faiss (run-llama#9486) * Gemini Embedding (run-llama#9476) * Gemini Embedding * cr * Merge Gemini Same Role Chat Message (run-llama#9498) * Fix Gemini Same Role Chat Msg * cr * cr * cr * cr * Teeny stray link causing weirdness (run-llama#9501) * Fixing CHANGELOG for wrong PR ID (run-llama#9505) * Overhauling indexing to emphasize VectorStoreIndex and SEO * Expanding use-cases docs: Q&A and chatbots * Fix small bug in string method of NodeWithScore (run-llama#9494) Update schema.py Avoid exception in __str__ method of NodeWithScore when the optional member variable self.score is None. * Added missing `default=None` to `LLM.system_prompt` (run-llama#9504) * Adding default=None to system_prompt to fix type errors * Added CHANGELOG entry * Update PDFReader (run-llama#9508) Add to PDFReader so that a user can specify if they want the PDF read in as one whole Document, or each page as a Document. * Updating Spelling Mistakes (run-llama#9513) * Grammatical_error in docs/api_reference/query.rst * Grammatical error in docs/end_to_end_tutorials/dev_practices/e2e_evaluation.md * Changed word from unstructed to unstructured * Corrected the spelling of performing --------- Co-authored-by: KamalSharmaShorthillsAI <[email protected]> Co-authored-by: Haotian Zhang <[email protected]> Co-authored-by: UpneetShorthillsAI <[email protected]> * FIxing Grammatical issues in docs (run-llama#9514) * Fixing Docs (run-llama#9515) * Allowing `LOCALAI_DEFAULTS` to work with `__ror__` (run-llama#9502) --------- Co-authored-by: Lawrence Tsang <[email protected]> Co-authored-by: Mark McDonald <[email protected]> Co-authored-by: Haotian Zhang <[email protected]> Co-authored-by: Logan <[email protected]> Co-authored-by: Laurie Voss <[email protected]> Co-authored-by: James Braza <[email protected]> Co-authored-by: Max <[email protected]> Co-authored-by: Alex McGraw <[email protected]> Co-authored-by: KamalSharmaShorthillsAI <[email protected]> Co-authored-by: UpneetShorthillsAI <[email protected]> * Update train_questions.txt (#5) removed a from a developing * updated basic_strategies.md and building_rag_from_scratch.md files (#6) * Update basic_strategies.md * Update building_rag_from_scratch.md * Update deprecated_terms.md , changed "THe" to "The" * Update graph_stores.md * Update guidance.md * Update vector_stores.md * This PR addresses grammatical and clarity enhancements. (#7) * Update structured_outputs.md Grammatical Improvements * Update query_transformations.md Changed followup to follow-up. * Update structured_outputs.md --------- Co-authored-by: Shorthills AI <[email protected]> * made a grammatical correction (#10) * Update train_questions.txt removed a from a developing * Update agents.md removed a from a more information * Update gatsby_license.txt (#11) * Update gatsby_license.txt grammar and tense correction: "read, understand, agree to and accept all the terms" to "read, understood, and agreed to accept all the terms", "do not agree" to "disagree", "agreement for keeping" to "agreement to keep", "unlink or detach or remove" to "unlink, detach or remove", "statements concerning tax" to "statements concerning the tax" * Update concepts.md grammar and tense correction: "or retrieved data" to "or retrieve data", "data which is" to "data that is", "class, are responsible" to "class, is responsible"," that allow you to ask question" to "that allows you to ask questions". * Update discover_llamaindex.md (run-llama#15) * Update llms.md (#9) Corrected spellings of 'results'. * Update llms.md (#13) Corrected spelling of results. * Update CHANGELOG.md (run-llama#14) fixed the spelling mistake of the evaluation * Update deprecated_terms.md Links Urls in Deprecated_terms is not working. 1. Configuring settings in the Service Context Link updated. 2. Parsing Documents into Nodes Link updated. 3. LLMs in LlamaIndex Link updated. 4. Setting LLMs in the ServiceContext updated * Update deprecated_terms.md Links Updated in Deprecated_terms mentioned below: 1. Index Usage Pattern link updated. 2. Vector Store Guide link updated 3. Vector Store Integrations link updated. 4. Index Usage Pattern link updated. 5. Vector Store Guide link updated. 6. Vector Store Integrations link updated. * Update deprecated_terms.md GPTVectorStoreIndex "Index Usage Pattern" Link updated. --------- Co-authored-by: KamalSharmaShorthillsAI <[email protected]> Co-authored-by: Haotian Zhang <[email protected]> Co-authored-by: UpneetShorthillsAI <[email protected]> Co-authored-by: ManpreetShorthillsAI <[email protected]> Co-authored-by: SakshiShorthillsAI <[email protected]> Co-authored-by: Lawrence Tsang <[email protected]> Co-authored-by: Mark McDonald <[email protected]> Co-authored-by: Logan <[email protected]> Co-authored-by: Laurie Voss <[email protected]> Co-authored-by: James Braza <[email protected]> Co-authored-by: Max <[email protected]> Co-authored-by: Alex McGraw <[email protected]> Co-authored-by: Aayush <[email protected]> Co-authored-by: Saransh Sharma <[email protected]> Co-authored-by: AbhishekYadavShorthillsAI <[email protected]> Co-authored-by: AryamanJaiswalShorthillsAI <[email protected]> Co-authored-by: Sanskar Tanwar <[email protected]> Co-authored-by: ShreeramKumawatShorthillsAI <[email protected]> Co-authored-by: Vishal <[email protected]> Co-authored-by: AashutoshPathakShorthillsAI <[email protected]> Co-authored-by: Rajneesh Singh <[email protected]>

* Update CHANGELOG.md * Update agents.md * Updated understanding.md (run-llama#51) * Grammatical_error in docs/api_reference/query.rst * Grammatical error in docs/end_to_end_tutorials/dev_practices/e2e_evaluation.md * Changed word from unstructed to unstructured * Corrected the spelling of performing * Update evaluating.md Corrected the spelling of "guidelines" * Update evaluation.md Changed " things are working will together" to " things are working well together" * Update usage_pattern.md (#2) * Update usage_pattern.md Corrected the spellings of measure. * Update basic_strategies.md * Update (#8) * Introduce Google Generative Language Semantic Retriever (run-llama#9440) * Add a Gemini models for text and multi-modal. (run-llama#9452) * Add a Gemini models for text and multi-modal. Includes each of {text, mm} x {completion, chat} x {stream, one-shot} x {sync, async} for Gemini's Developer API. And two notebooks - regular text & multi-modal. * add gemini into doc and rename MM example file * Add version constraints to pip install And remove noisy output. * Enrich Gemini Example * cr * Use gemini-pro instead of ultra * cr * cr * cr --------- Co-authored-by: Haotian Zhang <[email protected]> * [version] bump to v0.9.15 (run-llama#9485) * use persist dir in faiss (run-llama#9486) * Gemini Embedding (run-llama#9476) * Gemini Embedding * cr * Merge Gemini Same Role Chat Message (run-llama#9498) * Fix Gemini Same Role Chat Msg * cr * cr * cr * cr * Teeny stray link causing weirdness (run-llama#9501) * Fixing CHANGELOG for wrong PR ID (run-llama#9505) * Overhauling indexing to emphasize VectorStoreIndex and SEO * Expanding use-cases docs: Q&A and chatbots * Fix small bug in string method of NodeWithScore (run-llama#9494) Update schema.py Avoid exception in __str__ method of NodeWithScore when the optional member variable self.score is None. * Added missing `default=None` to `LLM.system_prompt` (run-llama#9504) * Adding default=None to system_prompt to fix type errors * Added CHANGELOG entry * Update PDFReader (run-llama#9508) Add to PDFReader so that a user can specify if they want the PDF read in as one whole Document, or each page as a Document. * Updating Spelling Mistakes (run-llama#9513) * Grammatical_error in docs/api_reference/query.rst * Grammatical error in docs/end_to_end_tutorials/dev_practices/e2e_evaluation.md * Changed word from unstructed to unstructured * Corrected the spelling of performing --------- Co-authored-by: KamalSharmaShorthillsAI <[email protected]> Co-authored-by: Haotian Zhang <[email protected]> Co-authored-by: UpneetShorthillsAI <[email protected]> * FIxing Grammatical issues in docs (run-llama#9514) * Fixing Docs (run-llama#9515) * Allowing `LOCALAI_DEFAULTS` to work with `__ror__` (run-llama#9502) --------- Co-authored-by: Lawrence Tsang <[email protected]> Co-authored-by: Mark McDonald <[email protected]> Co-authored-by: Haotian Zhang <[email protected]> Co-authored-by: Logan <[email protected]> Co-authored-by: Laurie Voss <[email protected]> Co-authored-by: James Braza <[email protected]> Co-authored-by: Max <[email protected]> Co-authored-by: Alex McGraw <[email protected]> Co-authored-by: KamalSharmaShorthillsAI <[email protected]> Co-authored-by: UpneetShorthillsAI <[email protected]> * Update train_questions.txt (#5) removed a from a developing * updated basic_strategies.md and building_rag_from_scratch.md files (#6) * Update basic_strategies.md * Update building_rag_from_scratch.md * Update deprecated_terms.md , changed "THe" to "The" * Update graph_stores.md * Update guidance.md * Update vector_stores.md * This PR addresses grammatical and clarity enhancements. (#7) * Update structured_outputs.md Grammatical Improvements * Update query_transformations.md Changed followup to follow-up. * Update structured_outputs.md --------- Co-authored-by: Shorthills AI <[email protected]> * made a grammatical correction (#10) * Update train_questions.txt removed a from a developing * Update agents.md removed a from a more information * Update gatsby_license.txt (#11) * Update gatsby_license.txt grammar and tense correction: "read, understand, agree to and accept all the terms" to "read, understood, and agreed to accept all the terms", "do not agree" to "disagree", "agreement for keeping" to "agreement to keep", "unlink or detach or remove" to "unlink, detach or remove", "statements concerning tax" to "statements concerning the tax" * Update concepts.md grammar and tense correction: "or retrieved data" to "or retrieve data", "data which is" to "data that is", "class, are responsible" to "class, is responsible"," that allow you to ask question" to "that allows you to ask questions". * Update discover_llamaindex.md (run-llama#15) * Update llms.md (#9) Corrected spellings of 'results'. * Update llms.md (#13) Corrected spelling of results. * Update CHANGELOG.md (run-llama#14) fixed the spelling mistake of the evaluation * Update deprecated_terms.md Links Urls in Deprecated_terms is not working. 1. Configuring settings in the Service Context Link updated. 2. Parsing Documents into Nodes Link updated. 3. LLMs in LlamaIndex Link updated. 4. Setting LLMs in the ServiceContext updated * Update deprecated_terms.md Links Updated in Deprecated_terms mentioned below: 1. Index Usage Pattern link updated. 2. Vector Store Guide link updated 3. Vector Store Integrations link updated. 4. Index Usage Pattern link updated. 5. Vector Store Guide link updated. 6. Vector Store Integrations link updated. * Update deprecated_terms.md GPTVectorStoreIndex "Index Usage Pattern" Link updated. * Update chatbots.md Improve clarity and consistency in the Chatbots section of the documentation: 1. Corrected the term 'LLM's' to 'LLMs' for consistency. 2. Modified 'Here's some relevant resources' to 'Here are some relevant resources' for grammatical correctness. * Update chatbots.md with spelling : chatbots Previous : chat bots Corrected : chatbots * Update multimodal.md removed an extra word : "out" for consistency and clarity * Update chatbots.md * Update agents.md * Update embeddings.md * Update understanding.md * Update extraction.md --------- Co-authored-by: KamalSharmaShorthillsAI <[email protected]> Co-authored-by: Shorthills AI <[email protected]> Co-authored-by: Haotian Zhang <[email protected]> Co-authored-by: UpneetShorthillsAI <[email protected]> Co-authored-by: ManpreetShorthillsAI <[email protected]> Co-authored-by: SakshiShorthillsAI <[email protected]> Co-authored-by: Lawrence Tsang <[email protected]> Co-authored-by: Mark McDonald <[email protected]> Co-authored-by: Logan <[email protected]> Co-authored-by: Laurie Voss <[email protected]> Co-authored-by: James Braza <[email protected]> Co-authored-by: Max <[email protected]> Co-authored-by: Alex McGraw <[email protected]> Co-authored-by: Aayush <[email protected]> Co-authored-by: Saransh Sharma <[email protected]> Co-authored-by: AbhishekYadavShorthillsAI <[email protected]> Co-authored-by: AryamanJaiswalShorthillsAI <[email protected]> Co-authored-by: Sanskar Tanwar <[email protected]> Co-authored-by: ShreeramKumawatShorthillsAI <[email protected]> Co-authored-by: Vishal <[email protected]> Co-authored-by: AashutoshPathakShorthillsAI <[email protected]> Co-authored-by: Rajneesh Singh <[email protected]> Co-authored-by: VagishaMisraShorthillsAI <[email protected]> Co-authored-by: PankajKumarShorthillsAI <[email protected]> Co-authored-by: Adarsh Shrivastav <[email protected]> Co-authored-by: Puneet Dhiman <[email protected]> Co-authored-by: AmitSinghShorthillsAI <[email protected]> * Update embeddings.md * Update embeddings.md * Update embeddings.md * Update embeddings.md --------- Co-authored-by: ChetnaGuptaShorthillsAI <[email protected]> Co-authored-by: Puneet Dhiman <[email protected]> Co-authored-by: AdityaKalraShorthillsAI <[email protected]> Co-authored-by: KamalSharmaShorthillsAI <[email protected]> Co-authored-by: Haotian Zhang <[email protected]> Co-authored-by: UpneetShorthillsAI <[email protected]> Co-authored-by: ManpreetShorthillsAI <[email protected]> Co-authored-by: SakshiShorthillsAI <[email protected]> Co-authored-by: Lawrence Tsang <[email protected]> Co-authored-by: Mark McDonald <[email protected]> Co-authored-by: Logan <[email protected]> Co-authored-by: Laurie Voss <[email protected]> Co-authored-by: James Braza <[email protected]> Co-authored-by: Max <[email protected]> Co-authored-by: Alex McGraw <[email protected]> Co-authored-by: Aayush <[email protected]> Co-authored-by: Saransh Sharma <[email protected]> Co-authored-by: AbhishekYadavShorthillsAI <[email protected]> Co-authored-by: AryamanJaiswalShorthillsAI <[email protected]> Co-authored-by: Sanskar Tanwar <[email protected]> Co-authored-by: ShreeramKumawatShorthillsAI <[email protected]> Co-authored-by: Vishal <[email protected]> Co-authored-by: AashutoshPathakShorthillsAI <[email protected]> Co-authored-by: Rajneesh Singh <[email protected]> Co-authored-by: VagishaMisraShorthillsAI <[email protected]> Co-authored-by: PankajKumarShorthillsAI <[email protected]> Co-authored-by: Adarsh Shrivastav <[email protected]> Co-authored-by: AmitSinghShorthillsAI <[email protected]>

Ryan Peach and others added 30 commits August 28, 2023 19:05

Saving work

03e1e3d

Merge branch 'main' of github.com:ryanpeach/llama_index into feature/…

a335815

…code-parser

Set up well for test driven development from here on

40720cd

changed scopes to inclusive_scopes and added the current block to the…

a86184b

… scope. This will be better for CodeSplitter

Fixed compile errors

7233af4

Got first test to work!

fa2bcc4

all tests pass (finally

d0f10f0

black

cf02f8d

Renamed to code hierarchy node parser

0abbec1

Gonna need a whole folder for these tests

16efd3b

Added back the ability to add inline comments

0c6876e

Using full string equals in tests for future skeleton tests

c662f9d

Very confusing bug when trying to develop a method for skipping types…

6097007

… like decorators while maintaining context, saving work.

This has to be right... But its not.

1dbcfac

Finally working without decorators

2b4fe38

Added tests for skeletonization

39b6b89

Added some code that may or may not work to skeletonize the nodes

2087a6b

Need to install tree sitter for testing

973c6b1

Merge branch 'main' into feature/code-parser

31ae633

Now have documentation

fac7161

Trying to meet mypy and ruff standards

1dd63b8

test, ruff, black, mypy

be3067d

Handling PR comments

4e7abc5

Merge branch 'main' into feature/code-parser

d1b6543

linting

2f6a8e8

changelog

f9e1af2

logan-markewich and others added 20 commits September 21, 2023 21:52

linting + docs

c6982cb

remove file

2a38ab5

Merge branch 'main' into feature/code-parser

2e85844

Merge branch 'main' into feature/code-parser

0aab4e0

Made an index example

e8c1408

black

f00a411

lint

378ae5c

Moved to the same file as the parser just so we don't have more than …

3132c46

…one place for all this code cluttering the repo

make format

c532aba

Ok maybe one file in the indices section overall

de34852

Added uuids on code splitting too so the llm can know to look for the…

e6552cf

… code

Added a code hierarchy string outputer

a35743e

No more improvements needed

94a9612

Missing type annotation

5ce7fd0

Cant import CodeHierarchyNodeParser by default because it requires tr…

aae6bc6

…ee_sitter

Revert "An alternative approach that doesnt use nest_asyncio"

ee4eae0

Merge branch 'onscale' into feature/code-parser

6bc906e

Deleted deprecated test_comments

e3e1ccb

make format

7f82721

import fusion retriever

089ea76

ryanpeach merged commit 5466ba0 into onscale Nov 14, 2023
5 of 8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/code parser #11

Feature/code parser #11

ryanpeach commented Nov 14, 2023

Feature/code parser #11

Feature/code parser #11

Conversation

ryanpeach commented Nov 14, 2023

Description

Type of Change

How Has This Been Tested?

Suggested Checklist: