[FEATURE] Support automatic document splitting using an LLM's segmentation ability #1213

Closed
jstbw opened this issue Sep 19, 2024 · 1 comment

jstbw commented Sep 19, 2024

MaxKB version

1.5.1

Please describe your requirement or suggested improvement

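MaxKB's smart segmentation currently gives poor results on imported documents: it often produces one huge block of content with no real attempt at splitting, and QA (question-and-answer) content in particular is not recognized automatically.
FastGPT can use an LLM for splitting, and the results are very good.
[image: 2cb45c05635a8c38e802620421dd7d9]
[image: 814a7aed2de8d32912318804a3f15b5]
MaxKB, by contrast, relies on Markdown, so the results are poor when the source is a Word document.
[image: 416eaf9fe74e200cd5955d0968ae9a8]
The alternative is advanced segmentation, which is inconvenient.
[image: c007b1c81553c595804cdb834211739]
The knowledge base will be handed to business departments to use and maintain. They do not understand any of this and certainly will not maintain document structure, so I would like MaxKB to use an LLM's splitting ability to split documents automatically.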
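
As a rough illustration of the request, LLM-assisted splitting could look something like the sketch below. Everything in it is hypothetical: `split_with_llm`, the prompt wording, the `---` delimiter convention, and the `complete` callable (a stand-in for whatever model client is actually used) are assumptions for illustration, not MaxKB or FastGPT code.

```python
import re
from typing import Callable, List

# Hypothetical prompt; the '---' delimiter is an assumed convention.
PROMPT = (
    "Split the following document into self-contained segments. "
    "Keep each question together with its answer. "
    "Separate segments with a line containing only '---'.\n\n"
)

# Matches a line that consists only of '---' (optionally padded with spaces/tabs).
DELIMITER = re.compile(r"^[ \t]*---[ \t]*$", flags=re.MULTILINE)


def split_with_llm(text: str, complete: Callable[[str], str]) -> List[str]:
    """Ask a model to segment `text`; fall back to one segment if the reply is empty."""
    reply = complete(PROMPT + text)
    segments = [part.strip() for part in DELIMITER.split(reply)]
    segments = [segment for segment in segments if segment]
    return segments or [text]


def fake_complete(prompt: str) -> str:
    """Stand-in for a real model call, used only to exercise the parsing."""
    return "Q: a\nA: b\n---\nQ: c\nA: d"
```

Passing the model call in as a plain callable keeps the parsing logic independent of any particular provider, so it can be exercised with a stub such as `fake_complete`.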

Please describe your proposed implementation

No response

Additional information

No response

@baixin513
Contributor

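Thanks for the feedback. Word documents are segmented according to their heading styles; if the heading formatting is inconsistent, the segmentation may not be ideal, so users need to tidy up the document formatting themselves. Importing QA pairs is also supported at present, but you need to organize your data according to the official template.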
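
To make the heading dependence concrete, here is a minimal sketch of heading-based splitting. It is not MaxKB's implementation: it assumes Markdown-style `#` headings as a stand-in for Word heading styles, and `split_by_headings` is a hypothetical helper name.

```python
import re
from typing import List, Tuple

# A Markdown-style heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_headings(text: str) -> List[Tuple[str, str]]:
    """Split text into (title, body) segments, starting a new segment at each heading."""
    sections: List[Tuple[str, str]] = []
    title, body = "", []

    def flush() -> None:
        # Emit the current segment unless it has neither a title nor any content.
        if title or any(line.strip() for line in body):
            sections.append((title, "\n".join(body).strip()))

    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    flush()
    return sections
```

With this approach, a document that has no recognizable headings comes back as a single untitled segment, which matches the "one huge block" behavior described in the issue.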
