Pr@main@fix bugs #29

Closed
wants to merge 19 commits into from
Commits
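44523da
fix: Ollama: when adding a model that has not been downloaded, the download progress jumps 100-0-100
shaohuzhang1 Mar 29, 2024
c893cd1
fix: when the optimized question is an empty string, use the original question for Q&A
shaohuzhang1 Mar 29, 2024
8f4fb81
fix: when creating a user with the default login password MaxKB@123.., the validation rule message is wrong
shaohuzhang1 Mar 29, 2024
68f6d83
fix: [Application] saving content from the application's [Conversation Log] raises an error
shaohuzhang1 Mar 29, 2024
44e7d0f
Merge branch 'main' of github.com:1Panel-dev/MaxKB into pr@main@fix_bugs
shaohuzhang1 Mar 29, 2024
c6f28a8
fix: add a confirmation prompt when refreshing the access link
wangdan-fit2cloud Mar 29, 2024
1d3f6fb
fix: [Application] the enlarge and close buttons are misaligned in floating-window mode
shaohuzhang1 Mar 29, 2024
8afece9
fix: restore code deleted by mistake during merge
shaohuzhang1 Mar 29, 2024
e4ed28c
fix: on Mac, pressing Enter while an input method is composing sends the message
shaohuzhang1 Mar 29, 2024
89a1f3a
fix: update the upload component
wangdan-fit2cloud Apr 1, 2024
ea74c00
Merge branch 'main' into pr@main@fix_bugs
shaohuzhang1 Apr 1, 2024
e7ada00
fix: improve Word segmentation rules
shaohuzhang1 Apr 1, 2024
a261f3d
fix: remove special characters from titles
shaohuzhang1 Apr 1, 2024
337a772
fix: conversation regeneration issue
wangdan-fit2cloud Apr 1, 2024
c985cc3
fix: number formatting
wangdan-fit2cloud Apr 1, 2024
22e7145
fix: conversation exception
shaohuzhang1 Apr 2, 2024
5e9e17d
fix: client statistics reset after refreshing the public access link
shaohuzhang1 Apr 2, 2024
548ebc2
fix: add the export SQL file that was not committed
shaohuzhang1 Apr 2, 2024
ae0484a
fix: a knowledge base created from the MaxKB online docs only fetches the root URL; child URLs are not fetched
shaohuzhang1 Apr 2, 2024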
17 changes: 9 additions & 8 deletions apps/application/serializers/application_serializers.py
Expand Up @@ -209,15 +209,16 @@ def auth(self, request, with_valid=True):
access_token = self.data.get("access_token")
application_access_token = QuerySet(ApplicationAccessToken).filter(access_token=access_token).first()
if application_access_token is not None and application_access_token.is_active:
if token is None or (token_details is not None and 'client_id' not in token_details) or (
token_details is not None and token_details.get(
'access_token') != application_access_token.access_token):
if token_details is not None and 'client_id' in token_details and token_details.get(
'client_id') is not None:
client_id = token_details.get('client_id')
else:
client_id = str(uuid.uuid1())
token = signing.dumps({'application_id': str(application_access_token.application_id),
'user_id': str(application_access_token.application.user.id),
'access_token': application_access_token.access_token,
'type': AuthenticationType.APPLICATION_ACCESS_TOKEN.value,
'client_id': client_id})
token = signing.dumps({'application_id': str(application_access_token.application_id),
'user_id': str(application_access_token.application.user.id),
'access_token': application_access_token.access_token,
'type': AuthenticationType.APPLICATION_ACCESS_TOKEN.value,
'client_id': client_id})
return token
else:
raise NotFound404(404, "无效的access_token")
Expand Down
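As far as this hunk shows, the token is now signed on every call, and the `client_id` is carried over from the previous token when one is present. A minimal sketch of that selection step follows, assuming `token_details` is the decoded payload of the old token (a plain dict or `None`). The helper names `pick_client_id` and `build_token_payload` are hypothetical and not part of the PR; the `type` field and the actual `signing.dumps` call are left out.

```python
import uuid
from typing import Optional


def pick_client_id(token_details: Optional[dict]) -> str:
    """Reuse the client_id of an existing token payload, else mint a new one."""
    if token_details is not None and token_details.get("client_id") is not None:
        return token_details["client_id"]
    # Same fallback as the diff: a time-based UUID rendered as a string.
    return str(uuid.uuid1())


def build_token_payload(application_id: str, user_id: str, access_token: str,
                        token_details: Optional[dict]) -> dict:
    """Build the (unsigned) payload; signing is intentionally omitted here."""
    return {
        "application_id": application_id,
        "user_id": user_id,
        "access_token": access_token,
        "client_id": pick_client_id(token_details),
    }
```

Keeping the `client_id` stable while the `access_token` changes is what would let per-client statistics survive a refreshed link, which matches the intent stated in commit 5e9e17d.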
37 changes: 37 additions & 0 deletions apps/application/sql/export_application_chat.sql
@@ -0,0 +1,37 @@
SELECT
application_chat."id" as chat_id,
application_chat.abstract as abstract,
application_chat_record_temp.problem_text as problem_text,
application_chat_record_temp.answer_text as answer_text,
application_chat_record_temp.message_tokens as message_tokens,
application_chat_record_temp.answer_tokens as answer_tokens,
application_chat_record_temp.run_time as run_time,
application_chat_record_temp.details::JSON as details,
application_chat_record_temp."index" as "index",
application_chat_record_temp.improve_paragraph_list as improve_paragraph_list,
application_chat_record_temp.vote_status as vote_status,
application_chat_record_temp.create_time as create_time
FROM
application_chat application_chat
LEFT JOIN (
SELECT COUNT
( "id" ) AS chat_record_count,
SUM ( CASE WHEN "vote_status" = '0' THEN 1 ELSE 0 END ) AS star_num,
SUM ( CASE WHEN "vote_status" = '1' THEN 1 ELSE 0 END ) AS trample_num,
SUM ( CASE WHEN array_length( application_chat_record.improve_paragraph_id_list, 1 ) IS NULL THEN 0 ELSE array_length( application_chat_record.improve_paragraph_id_list, 1 ) END ) AS mark_sum,
chat_id
FROM
application_chat_record
GROUP BY
application_chat_record.chat_id
) chat_record_temp ON application_chat."id" = chat_record_temp.chat_id
LEFT JOIN (
SELECT
*,
CASE
WHEN array_length( application_chat_record.improve_paragraph_id_list, 1 ) IS NULL THEN
'{}' ELSE ( SELECT ARRAY_AGG ( row_to_json ( paragraph ) ) FROM paragraph WHERE "id" = ANY ( application_chat_record.improve_paragraph_id_list ) )
END as improve_paragraph_list
FROM
application_chat_record application_chat_record
) application_chat_record_temp ON application_chat_record_temp.chat_id = application_chat."id"
16 changes: 15 additions & 1 deletion apps/common/handle/impl/doc_split_handle.py
Expand Up @@ -22,11 +22,25 @@


class DocSplitHandle(BaseSplitHandle):
@staticmethod
def paragraph_to_md(paragraph):
psn = paragraph.style.name
if psn.startswith('Heading'):
try:
return "".join(["#" for i in range(int(psn.replace("Heading ", '')))]) + " " + paragraph.text
except Exception as e:
return paragraph.text
return paragraph.text

def to_md(self, doc):
ps = doc.paragraphs
return "\n".join([self.paragraph_to_md(para) for para in ps])

def handle(self, file, pattern_list: List, with_filter: bool, limit: int, get_buffer):
try:
buffer = get_buffer(file)
doc = Document(io.BytesIO(buffer))
content = "\n".join([para.text for para in doc.paragraphs])
content = self.to_md(doc)
if pattern_list is not None and len(pattern_list) > 0:
split_model = SplitModel(pattern_list, with_filter, limit)
else:
Expand Down
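The new `paragraph_to_md` turns a Word style name such as `Heading 2` into a Markdown `##` prefix and falls back to the plain text when the level cannot be parsed. The same idea can be sketched without python-docx, assuming (as the diff does) that heading styles are named `Heading N`. `heading_to_md` is a hypothetical name used only for illustration.

```python
import re


def heading_to_md(style_name: str, text: str) -> str:
    """Prefix text with '#' markers when style_name looks like 'Heading N'."""
    match = re.fullmatch(r"Heading (\d+)", style_name)
    if match is None:
        # Not a numbered heading style: return the paragraph text unchanged.
        return text
    level = int(match.group(1))
    return "#" * level + " " + text
```

Matching the style name with a regular expression avoids the broad `except Exception` in the diff, because a non-numeric suffix simply fails to match.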
6 changes: 3 additions & 3 deletions apps/common/handle/impl/text_split_handle.py
Expand Up @@ -9,7 +9,7 @@
import re
from typing import List

import chardet
from charset_normalizer import detect

from common.handle.base_split_handle import BaseSplitHandle
from common.util.split_model import SplitModel
Expand All @@ -26,7 +26,7 @@ def support(self, file, get_buffer):
file_name: str = file.name.lower()
if file_name.endswith(".md") or file_name.endswith('.txt'):
return True
result = chardet.detect(buffer)
result = detect(buffer)
if result['encoding'] != 'ascii' and result['confidence'] > 0.5:
return True
return False
Expand All @@ -38,7 +38,7 @@ def handle(self, file, pattern_list: List, with_filter: bool, limit: int, get_bu
else:
split_model = SplitModel(default_pattern_list, with_filter=with_filter, limit=limit)
try:
content = buffer.decode(chardet.detect(buffer)['encoding'])
content = buffer.decode(detect(buffer)['encoding'])
except BaseException as e:
return {'name': file.name,
'content': []}
Expand Down
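One detail worth noting in this file: a detection result can carry `encoding: None`, in which case `buffer.decode(None)` raises `TypeError` and the handler silently returns empty content through `except BaseException`. A sketch of a more explicit fallback is below. It takes the detection result as a plain dict so that it does not depend on any detector library; `decode_buffer` and the `utf-8` default are assumptions for illustration, not part of the PR.

```python
from typing import Optional


def decode_buffer(buffer: bytes, detected: dict, fallback: str = "utf-8") -> Optional[str]:
    """Decode with the detected encoding, or with `fallback` if none was detected."""
    encoding = detected.get("encoding") or fallback
    try:
        return buffer.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or bytes that do not fit the chosen encoding.
        return None
```

Catching only `LookupError` and `UnicodeDecodeError` keeps unrelated failures visible instead of folding them into an empty result.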
30 changes: 24 additions & 6 deletions apps/common/util/fork.py
Expand Up @@ -4,9 +4,8 @@
import traceback
from functools import reduce
from typing import List, Set
from urllib.parse import urljoin, urlparse, ParseResult, urlsplit
from urllib.parse import urljoin, urlparse, ParseResult, urlsplit, urlunparse

import chardet
import html2text as ht
import requests
from bs4 import BeautifulSoup
Expand Down Expand Up @@ -44,6 +43,13 @@ def fork_child(child_link: ChildLink, selector_list: List[str], level: int, excl
ForkManage.fork_child(child_link, selector_list, level - 1, exclude_link_url, fork_handler)


def remove_fragment(url: str) -> str:
parsed_url = urlparse(url)
modified_url = ParseResult(scheme=parsed_url.scheme, netloc=parsed_url.netloc, path=parsed_url.path,
params=parsed_url.params, query=parsed_url.query, fragment=None)
return urlunparse(modified_url)


class Fork:
class Response:
def __init__(self, content: str, child_link_list: List[ChildLink], status, message: str):
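The `remove_fragment` helper added in this hunk rebuilds the URL with an empty fragment. The standard library already offers `urllib.parse.urldefrag` for this; a sketch of an equivalent helper, with `example.com` URLs as placeholder inputs:

```python
from urllib.parse import urldefrag


def remove_fragment(url: str) -> str:
    """Return url without its '#fragment' part; the query string is preserved."""
    return urldefrag(url).url
```

For instance, `remove_fragment("https://example.com/a?x=1#top")` yields `https://example.com/a?x=1`.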
Expand All @@ -61,6 +67,7 @@ def error(message: str):
return Fork.Response('', [], 500, message)

def __init__(self, base_fork_url: str, selector_list: List[str]):
base_fork_url = remove_fragment(base_fork_url)
self.base_fork_url = urljoin(base_fork_url if base_fork_url.endswith("/") else base_fork_url + '/', '.')
parsed = urlsplit(base_fork_url)
query = parsed.query
Expand All @@ -74,9 +81,11 @@ def __init__(self, base_fork_url: str, selector_list: List[str]):
fragment='').geturl()

def get_child_link_list(self, bf: BeautifulSoup):
pattern = "^((?!(http:|https:|tel:/|#|mailto:|javascript:))|" + self.base_fork_url + ").*"
pattern = "^((?!(http:|https:|tel:/|#|mailto:|javascript:))|" + self.base_fork_url + "|/).*"
link_list = bf.find_all(name='a', href=re.compile(pattern))
result = [ChildLink(link.get('href'), link) for link in link_list]
result = [ChildLink(link.get('href'), link) if link.get('href').startswith(self.base_url) else ChildLink(
self.base_url + link.get('href'), link) for link in link_list]
result = [row for row in result if row.url.startswith(self.base_fork_url)]
return result

def get_content_html(self, bf: BeautifulSoup):
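The updated `get_child_link_list` prefixes `self.base_url` onto hrefs that do not already start with it, then keeps only URLs under `self.base_fork_url`. A hedged alternative is to resolve each href against the page URL with `urljoin`, which also handles relative links without a leading slash. In the sketch below, `resolve_child_links`, its parameters, and the de-duplication step are assumptions for illustration and do not appear in the PR.

```python
from typing import List
from urllib.parse import urldefrag, urljoin


def resolve_child_links(page_url: str, hrefs: List[str], base_prefix: str) -> List[str]:
    """Resolve hrefs against page_url and keep those under base_prefix."""
    resolved: List[str] = []
    for href in hrefs:
        # urljoin handles absolute, root-relative and relative hrefs alike.
        absolute = urldefrag(urljoin(page_url, href)).url
        if absolute.startswith(base_prefix) and absolute not in resolved:
            resolved.append(absolute)
    return resolved
```

Links with other schemes, such as `mailto:`, resolve to themselves and are dropped by the prefix check, so no separate exclusion pattern is needed in this sketch.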
Expand Down Expand Up @@ -122,9 +131,18 @@ def reset_beautiful_soup(self, bf: BeautifulSoup):

@staticmethod
def get_beautiful_soup(response):
encoding = response.encoding if response.encoding and response.encoding != 'ISO-8859-1' is not None else response.apparent_encoding
encoding = response.encoding if response.encoding and response.encoding is not 'ISO-8859-1' is not None else response.apparent_encoding
html_content = response.content.decode(encoding)
return BeautifulSoup(html_content, "html.parser")
beautiful_soup = BeautifulSoup(html_content, "html.parser")
meta_list = beautiful_soup.find_all('meta')
charset_list = [meta.attrs.get('charset') for meta in meta_list if
meta.attrs is not None and 'charset' in meta.attrs]
if len(charset_list) > 0:
charset = charset_list[0]
if charset is not encoding:
html_content = response.content.decode(charset)
return BeautifulSoup(html_content, "html.parser")
return beautiful_soup

def fork(self):
try:
Expand Down
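Two comparisons in this hunk use `is not` on strings (`response.encoding is not 'ISO-8859-1'` and `charset is not encoding`). `is not` tests object identity rather than equality, and Python 3.8+ emits a `SyntaxWarning` when it is used with a literal. The chained form `x is not 'ISO-8859-1' is not None` also evaluates as `(x is not 'ISO-8859-1') and ('ISO-8859-1' is not None)`. A sketch of the same precedence using equality checks follows; `choose_encoding` and its parameters are hypothetical names, and the `<meta charset>` value is passed in rather than parsed from HTML.

```python
from typing import Optional


def choose_encoding(header_encoding: Optional[str], apparent_encoding: str,
                    meta_charset: Optional[str] = None) -> str:
    """Pick a decoding charset: <meta charset>, then the header, then detection."""
    if meta_charset:
        return meta_charset
    # ISO-8859-1 is treated as "not really declared", mirroring the diff.
    if header_encoding and header_encoding.upper() != "ISO-8859-1":
        return header_encoding
    return apparent_encoding
```

The ordering mirrors the diff, where a `<meta charset>` that differs from the first guess triggers a second decode with that charset.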
8 changes: 7 additions & 1 deletion apps/common/util/split_model.py
Expand Up @@ -351,9 +351,15 @@ def sub_title(paragraph: Dict):

@staticmethod
def filter_title_special_characters(paragraph: Dict):
return {**paragraph, 'title': paragraph.get('title').replace("#", '') if 'title' in paragraph else ''}
title = paragraph.get('title') if 'title' in paragraph else ''
for title_special_characters in title_special_characters_list:
title = title.replace(title_special_characters, '')
return {**paragraph,
'title': title}


title_special_characters_list = ['#', '\n', '\r', '\\s']

default_split_pattern = {
'md': [re.compile('(?<=^)# .*|(?<=\\n)# .*'), re.compile('(?<!#)## (?!#).*'), re.compile("(?<!#)### (?!#).*"),
re.compile("(?<!#)#### (?!#).*"), re.compile("(?<!#)##### (?!#).*"),
Expand Down
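A note on `title_special_characters_list`: `str.replace` treats its argument literally, so the entry `'\\s'` removes the two-character sequence backslash plus `s`, not whitespace. If a pattern was intended, a regular expression expresses it directly. The sketch below strips only `#`, `\r` and `\n`; `clean_title` and `TITLE_SPECIAL_CHARS` are hypothetical names. Adding `\s` to the character class would also delete the spaces between words, so it is left out here.

```python
import re

# Characters stripped from titles; whitespace other than CR/LF is kept on purpose.
TITLE_SPECIAL_CHARS = re.compile(r"[#\r\n]")


def clean_title(paragraph: dict) -> dict:
    """Return a copy of paragraph with '#', CR and LF removed from its title."""
    title = paragraph.get("title") or ""
    return {**paragraph, "title": TITLE_SPECIAL_CHARS.sub("", title)}
```

Using `paragraph.get("title") or ""` also covers a title that is present but `None`, which would raise `AttributeError` in the loop shown in the diff.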
3 changes: 1 addition & 2 deletions apps/dataset/serializers/document_serializers.py
Expand Up @@ -30,12 +30,11 @@
from common.util.field_message import ErrMessage
from common.util.file_util import get_file_content
from common.util.fork import Fork
from common.util.split_model import SplitModel, get_split_model
from common.util.split_model import get_split_model
from dataset.models.data_set import DataSet, Document, Paragraph, Problem, Type, Status, ProblemParagraphMapping
from dataset.serializers.common_serializers import BatchSerializer, MetaSerializer
from dataset.serializers.paragraph_serializers import ParagraphSerializers, ParagraphInstanceSerializer
from smartdoc.conf import PROJECT_DIR
import chardet


class DocumentEditInstanceSerializer(ApiMixin, serializers.Serializer):
Expand Down
1 change: 0 additions & 1 deletion pyproject.toml
Expand Up @@ -30,7 +30,6 @@ html2text = "^2024.2.26"
langchain-openai = "^0.0.8"
django-ipware = "^6.0.4"
django-apscheduler = "^0.6.2"
chardet2 = "^2.0.3"
pymupdf = "^1.24.0"
python-docx = "^1.1.0"

Expand Down
10 changes: 4 additions & 6 deletions ui/src/api/application.ts
Expand Up @@ -166,14 +166,12 @@ const getChatOpen: (applicaiton_id: String) => Promise<Result<any>> = (applicait
}
/**
* Chat
* @param parameters
* @param parameters
* chat_id: string
* {
"message": "string",
}
* data
*/
const postChatMessage: (chat_id: string, message: string) => Promise<any> = (chat_id, message) => {
return postStream(`/api${prefix}/chat_message/${chat_id}`, { message })
const postChatMessage: (chat_id: string, data: any) => Promise<any> = (chat_id, data) => {
return postStream(`/api${prefix}/chat_message/${chat_id}`, data)
}

/**
Expand Down
12 changes: 8 additions & 4 deletions ui/src/components/ai-chat/index.vue
Expand Up @@ -285,7 +285,7 @@ function sendChatHandle(event: any) {
if (!event.ctrlKey) {
// 如果没有按下组合键ctrl,则会阻止默认事件
event.preventDefault()
if (!isDisabledChart.value && !loading.value&&!event.isComposing) {
if (!isDisabledChart.value && !loading.value && !event.isComposing) {
chatMessage()
}
} else {
Expand Down Expand Up @@ -418,7 +418,7 @@ const errorWrite = (chat: any, message?: string) => {
ChatManagement.append(chat.id, message || '抱歉,当前正在维护,无法提供服务,请稍后再试!')
ChatManagement.close(chat.id)
}
function chatMessage(chat?: any, problem?: string) {
function chatMessage(chat?: any, problem?: string, re_chat?: boolean) {
loading.value = true
if (!chat) {
chat = reactive({
Expand All @@ -443,9 +443,13 @@ function chatMessage(chat?: any, problem?: string) {
errorWrite(chat)
})
} else {
const obj = {
message: chat.problem_text,
re_chat: re_chat || false
}
// Chat
applicationApi
.postChatMessage(chartOpenId.value, chat.problem_text)
.postChatMessage(chartOpenId.value, obj)
.then((response) => {
if (response.status === 401) {
application
Expand Down Expand Up @@ -491,7 +495,7 @@ function chatMessage(chat?: any, problem?: string) {

function regenerationChart(item: chatType) {
inputValue.value = item.problem_text
chatMessage()
chatMessage(null, '', true)
}

function getSourceDetail(row: any) {
Expand Down
19 changes: 13 additions & 6 deletions ui/src/components/app-charts/components/LineCharts.vue
Expand Up @@ -4,6 +4,7 @@
<script lang="ts" setup>
import { onMounted, nextTick, watch, onBeforeUnmount } from 'vue'
import * as echarts from 'echarts'
import { numberFormat } from '@/utils/utils'
const props = defineProps({
id: {
type: String,
Expand Down Expand Up @@ -57,12 +58,13 @@ function initChart() {
},
tooltip: {
trigger: 'axis',
axisPointer: {
type: 'cross',
label: {
backgroundColor: '#6a7985'
}
}
valueFormatter: (value: any) => numberFormat(value)
// axisPointer: {
// type: 'cross',
// label: {
// backgroundColor: '#6a7985'
// }
// }
},
legend: {
right: 0,
Expand All @@ -89,6 +91,11 @@ function initChart() {
lineStyle: {
color: '#EFF0F1'
}
},
axisLabel: {
formatter: (value: any) => {
return numberFormat(value)
}
}
},
series: series
Expand Down
4 changes: 2 additions & 2 deletions ui/src/views/applicaiton-overview/index.vue
Expand Up @@ -197,8 +197,8 @@ function getAppStatistics() {

function refreshAccessToken() {
MsgConfirm(
`是否重新生成公共访问链接?`,
`重新生成公共访问链接会影响嵌入第三方脚本变更,需要将新脚本重新嵌入第三方,请谨慎操作!`,
`是否重新生成公开访问链接?`,
`重新生成公开访问链接会影响嵌入第三方脚本变更,需要将新脚本重新嵌入第三方,请谨慎操作!`,
{
confirmButtonText: '确认'
}
Expand Down
2 changes: 1 addition & 1 deletion ui/src/views/dataset/component/UploadComponent.vue
Expand Up @@ -27,7 +27,7 @@
<em> 选择文件上传 </em>
</p>
<div class="upload__decoration">
<p>支持格式:TXT、Markdown,每次最多上传50个文件,每个文件不超过 10MB</p>
<p>支持格式:TXT、Markdown、PDF、DOC、DOCX,每次最多上传50个文件,每个文件不超过 10MB</p>
<p>若使用【高级分段】建议上传前规范文件的分段标识</p>
</div>
</div>
Expand Down