From 8213ab14f7cbc20fbb260c85df7b21515c3c4a44 Mon Sep 17 00:00:00 2001
From: "promptless[bot]" <179508745+promptless[bot]@users.noreply.github.com>
Date: Tue, 17 Dec 2024 04:35:02 +0000
Subject: [PATCH] Docs update (75bf2f8)

---
 docs/docs/concepts.mdx     | 2 +-
 docs/docs/how_to/index.mdx | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/docs/concepts.mdx b/docs/docs/concepts.mdx
index 6cc0f135bff28..f73ae1f23dc48 100644
--- a/docs/docs/concepts.mdx
+++ b/docs/docs/concepts.mdx
@@ -1038,7 +1038,7 @@ Table columns:
 |----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------|---------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
 | Recursive | [RecursiveCharacterTextSplitter](/docs/how_to/recursive_text_splitter/), [RecursiveJsonSplitter](/docs/how_to/recursive_json_splitter/) | A list of user defined characters | | Recursively splits text. This splitting is trying to keep related pieces of text next to each other. This is the `recommended way` to start splitting text. |
 | HTML | [HTMLHeaderTextSplitter](/docs/how_to/HTML_header_metadata_splitter/), [HTMLSectionSplitter](/docs/how_to/HTML_section_aware_splitter/) | HTML specific characters | ✅ | Splits text based on HTML-specific characters. Notably, this adds in relevant information about where that chunk came from (based on the HTML) |
-| Markdown | [MarkdownHeaderTextSplitter](/docs/how_to/markdown_header_metadata_splitter/), | Markdown specific characters | ✅ | Splits text based on Markdown-specific characters. Notably, this adds in relevant information about where that chunk came from (based on the Markdown) |
+| Markdown | [MarkdownHeaderTextSplitter](/docs/how_to/markdown_header_metadata_splitter/), [ExperimentalMarkdownSyntaxTextSplitter](/docs/how_to/experimental_markdown_syntax_text_splitter/) | Markdown specific characters | ✅ | Splits text based on Markdown-specific characters. The `ExperimentalMarkdownSyntaxTextSplitter` retains the original whitespace and formatting, addressing issues with code blocks and nested lists. |
 | Code | [many languages](/docs/how_to/code_splitter/) | Code (Python, JS) specific characters | | Splits text based on characters specific to coding languages. 15 different languages are available to choose from. |
 | Token | [many classes](/docs/how_to/split_by_token/) | Tokens | | Splits text on tokens. There exist a few different ways to measure tokens. |
 | Character | [CharacterTextSplitter](/docs/how_to/character_text_splitter/) | A user defined character | | Splits text based on a user defined character. One of the simpler methods. |
diff --git a/docs/docs/how_to/index.mdx b/docs/docs/how_to/index.mdx
index b481805eaafaf..90e493727c4d1 100644
--- a/docs/docs/how_to/index.mdx
+++ b/docs/docs/how_to/index.mdx
@@ -134,6 +134,7 @@ What LangChain calls [LLMs](/docs/concepts/#llms) are older forms of language mo
 - [How to: split by character](/docs/how_to/character_text_splitter)
 - [How to: split code](/docs/how_to/code_splitter)
 - [How to: split Markdown by headers](/docs/how_to/markdown_header_metadata_splitter)
+- [How to: split Markdown with experimental syntax retention](/docs/how_to/experimental_markdown_syntax_text_splitter)
 - [How to: recursively split JSON](/docs/how_to/recursive_json_splitter)
 - [How to: split text into semantic chunks](/docs/how_to/semantic-chunker)
 - [How to: split by tokens](/docs/how_to/split_by_token)