[bug] When text on the same line is separated by a gap, the earlier content ends up after the later content #657

Open
1 task done
lforlgg opened this issue Sep 10, 2024 · 2 comments

Comments

@lforlgg

lforlgg commented Sep 10, 2024

Issues

  • I have browsed through the Issues and confirmed that this is not a duplicate.

Umi-OCR version

2.1.4

Windows version

Windows 10

Reproduction steps

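Text that clearly sits on a single line comes out with the earlier (left-hand) content placed after the later content. I think the layout-parsing scheme needs some improvement,
because with no layout processing at all the output is actually correct and in reading order.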

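[OCR, no processing]_test-版面排版 [the only correct result]
[OCR, single column, by natural paragraph]_test-版面排版
[OCR, single column, always break lines]_test-版面排版
[OCR, multi-column, by natural paragraph]_test-版面排版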

ocrTxt_ziped
ocr测试

test-版面排版.pdf

@hiroi-sora
Owner

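Feedback received. Our layout parsing is designed around article-style layouts. Layouts such as exercise books, which contain many irregular elements, can indeed interfere with the sorting algorithm. We will keep optimizing it in the future.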

@lforlgg
Author

lforlgg commented Sep 10, 2024

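> Feedback received. Our layout parsing is designed around article-style layouts. Layouts such as exercise books, which contain many irregular elements, can indeed interfere with the sorting algorithm. We will keep optimizing it in the future.

Thanks. After I removed most of the content further down the page, the order was no longer reversed. Interesting: I did not expect content that far below to reach back and change the order of this line.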
test-版面排版测试2(这样的版面ocr出来排版正常).pdf
test2_ziped
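
For anyone curious how content far down a page can flip the order of a line near the top, here is a toy sketch. It is not Umi-OCR's actual sorting code. Every name in it (`Box`, `GUTTER_X`, `MIN_SUPPORT`, `ROW_TOL`, `reading_order`) and the whole heuristic are assumptions made up for illustration. The idea: a page-wide decision ("is this page two-column?") is driven by boxes far below, and once it flips, the first line is ordered by a different rule that no longer tolerates a 2-unit difference in the boxes' top edges.

```python
from typing import NamedTuple


class Box(NamedTuple):
    text: str
    x0: float  # left edge
    x1: float  # right edge
    y: float   # top edge


GUTTER_X = 50.0  # hypothetical gutter position on a 100-unit-wide page
MIN_SUPPORT = 3  # boxes required on each side before the page counts as two-column
ROW_TOL = 5.0    # tops closer than this are treated as the same text line


def is_two_column(boxes):
    """Page-wide decision: enough boxes lie fully left and fully right of the gutter."""
    left = sum(1 for b in boxes if b.x1 <= GUTTER_X)
    right = sum(1 for b in boxes if b.x0 >= GUTTER_X)
    return left >= MIN_SUPPORT and right >= MIN_SUPPORT


def row_order(boxes):
    """Single-column order: group boxes into rows (with tolerance), then left to right."""
    rows = []
    for b in sorted(boxes, key=lambda b: b.y):
        if rows and b.y - rows[-1][0].y <= ROW_TOL:
            rows[-1].append(b)
        else:
            rows.append([b])
    return [b for row in rows for b in sorted(row, key=lambda b: b.x0)]


def column_order(boxes):
    """Two-column order: boxes crossing the gutter split the page into bands;
    inside each band the left column is read before the right column."""
    spanning = sorted((b for b in boxes if b.x0 < GUTTER_X < b.x1), key=lambda b: b.y)
    rest = [b for b in boxes if not (b.x0 < GUTTER_X < b.x1)]
    out = []
    for s in spanning + [None]:
        limit = s.y if s else float("inf")
        band = [b for b in rest if b.y < limit]   # strict comparison, no tolerance
        rest = [b for b in rest if b.y >= limit]
        out += sorted((b for b in band if b.x1 <= GUTTER_X), key=lambda b: b.y)
        out += sorted((b for b in band if b.x0 >= GUTTER_X), key=lambda b: b.y)
        if s:
            out.append(s)
    return out


def reading_order(boxes):
    ordered = column_order(boxes) if is_two_column(boxes) else row_order(boxes)
    return [b.text for b in ordered]


# One visual line: a question stem and an answer blank whose top edge is 2 units higher.
top_line = [Box("1. Which answer is", 0, 60, 2), Box("( )", 80, 100, 0)]

# Content far below that happens to look like two columns.
far_below = [
    Box("A. left", 0, 40, 200), Box("B. right", 55, 95, 200),
    Box("C. left", 0, 40, 220), Box("D. right", 55, 95, 220),
    Box("E. left", 0, 40, 240), Box("F. right", 55, 95, 240),
]

print(reading_order(top_line))                   # stem first, then "( )"
print(reading_order(top_line + far_below)[:2])   # "( )" first, then the stem
```

With only `top_line`, the page is treated as single-column and the row tolerance keeps the stem before the blank. Adding `far_below` makes the toy heuristic switch to two-column mode, where the stem (which crosses the gutter) is handled as a band separator and the slightly higher blank is emitted first. Whether the real sorter fails in this way is not something this sketch can establish. It only shows that a global layout decision can change local order.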
