Allow translating a custom selection area #349

Open
charles7668 opened this issue Dec 26, 2024 · 1 comment

Comments

@charles7668
Contributor

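Feature description

The current feature detects text blocks automatically, which makes fine-tuning difficult. Perhaps a mode for manually selecting the region could be added.

For example, taking the PDF from #323, a selection like the following would avoid including the line numbers:
[image]

This approach would also make it easier to support translating images: because the region can be customized, handling the text already present in an image becomes easier.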

@aseaday
Contributor

aseaday commented Dec 26, 2024

There are two steps to fix that problem:

  • PDFMathTranslate depends heavily on the layout analysis from pdfminer. Even if we give it a filter box, it may still fail to separate the line numbers from the block text, because in some cases a line number is grouped into the same text line as the body text.
  • A better GUI may be needed, possibly a custom web app.
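
To illustrate the first point, here is a minimal sketch of why a filter box alone may not be enough. It does not use pdfminer or PDFMathTranslate code; `Char`, `filter_by_line`, and `filter_by_char` are hypothetical names, and a "line" is assumed to be just a list of characters with bounding boxes. If the layout step has already merged the line number into the text line, filtering whole lines by the box keeps the number, while filtering individual characters drops it.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (x0, y0, x1, y1); hypothetical stand-in for a PDF bounding box
BBox = Tuple[float, float, float, float]


@dataclass
class Char:
    """Hypothetical character record, not a pdfminer type."""
    text: str
    bbox: BBox


def center_inside(bbox: BBox, box: BBox) -> bool:
    """True if the center of bbox lies inside the selection box."""
    cx = (bbox[0] + bbox[2]) / 2
    cy = (bbox[1] + bbox[3]) / 2
    return box[0] <= cx <= box[2] and box[1] <= cy <= box[3]


def line_bbox(chars: List[Char]) -> BBox:
    """Smallest box that covers every character of a line."""
    return (
        min(c.bbox[0] for c in chars),
        min(c.bbox[1] for c in chars),
        max(c.bbox[2] for c in chars),
        max(c.bbox[3] for c in chars),
    )


def filter_by_line(lines: List[List[Char]], box: BBox) -> List[str]:
    """Keep whole lines whose bounding-box center falls in the selection."""
    return [
        "".join(c.text for c in chars)
        for chars in lines
        if chars and center_inside(line_bbox(chars), box)
    ]


def filter_by_char(lines: List[List[Char]], box: BBox) -> List[str]:
    """Keep only the characters whose own center falls in the selection."""
    result = []
    for chars in lines:
        kept = "".join(c.text for c in chars if center_inside(c.bbox, box))
        if kept:
            result.append(kept)
    return result


def make_chars(text: str, x0: float, y0: float,
               width: float = 10.0, height: float = 12.0) -> List[Char]:
    """Lay out text left to right with a fixed character size (toy data)."""
    return [
        Char(ch, (x0 + i * width, y0, x0 + (i + 1) * width, y0 + height))
        for i, ch in enumerate(text)
    ]


# One merged line: the line number "12" in the margin plus the body text.
merged_line = make_chars("12", 10.0, 700.0) + make_chars("Hello", 50.0, 700.0)
selection = (40.0, 690.0, 200.0, 720.0)  # user box that excludes the margin

kept_lines = filter_by_line([merged_line], selection)
kept_chars = filter_by_char([merged_line], selection)
print(kept_lines)  # line-level filter: the line number leaks through
print(kept_chars)  # character-level filter: only the body text remains
```

So a manual selection would probably have to be applied before or below the line-grouping step, not only to the finished text lines.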
