Optimize Readability #10

Open
Anonyfox opened this issue Apr 15, 2015 · 0 comments
The readability module we currently use often returns garbage for the HTML pages it parses. This not only makes the fulltext properties unusable, it also causes badly wrong matches in the tagging engine and sometimes poor summary texts.

We need a custom, optimized readability algorithm that is more accurate than the current implementation and as fast as possible (<100ms on commodity hardware for common websites).
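To make the goal more concrete, here is a minimal sketch of one possible direction, not a proposed final implementation. It assumes a simple heuristic loosely modeled on readability-style scoring: group `<p>` texts by their parent element, ignore link text, score each paragraph by length and comma count, and return the paragraphs of the best-scoring parent. Python is used only for illustration, and all names (`extract_main_text`, `paragraph_score`, `SAMPLE`) and the scoring weights are hypothetical.

```python
import time
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
# Text inside these tags is not counted as article text (assumption).
IGNORED_TAGS = {"a", "script", "style"}


class _ParagraphCollector(HTMLParser):
    """Collects the text of every <p>, grouped by the paragraph's parent."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []      # (tag, element id) for every open element
        self.counter = 0     # running element id
        self.by_parent = {}  # parent element id -> list of paragraph texts
        self.chunks = None   # text pieces of the <p> being read, if any
        self.parent_id = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if tag == "p":
            self.parent_id = self.stack[-1][1] if self.stack else 0
            self.chunks = []
        self.counter += 1
        self.stack.append((tag, self.counter))

    def handle_endtag(self, tag):
        if not any(open_tag == tag for open_tag, _ in self.stack):
            return  # stray closing tag, ignore it
        while self.stack:
            open_tag, _ = self.stack.pop()
            if open_tag == "p":
                self.flush_paragraph()
            if open_tag == tag:
                break

    def handle_data(self, data):
        if self.chunks is None:
            return  # not inside a paragraph
        if any(open_tag in IGNORED_TAGS for open_tag, _ in self.stack):
            return  # link or script text does not count
        self.chunks.append(data)

    def flush_paragraph(self):
        if self.chunks is None:
            return
        text = " ".join("".join(self.chunks).split())  # normalize whitespace
        if text:
            self.by_parent.setdefault(self.parent_id, []).append(text)
        self.chunks = None


def paragraph_score(text):
    # Hypothetical weights: a base point, one per comma, up to 3 for length.
    return 1.0 + text.count(",") + min(len(text) / 100.0, 3.0)


def extract_main_text(html):
    parser = _ParagraphCollector()
    parser.feed(html)
    parser.close()
    parser.flush_paragraph()  # handle a <p> left open at end of input
    if not parser.by_parent:
        return ""
    best = max(parser.by_parent.values(),
               key=lambda paragraphs: sum(paragraph_score(p) for p in paragraphs))
    return "\n\n".join(best)


SAMPLE = """
<html><body>
  <div id="nav"><p><a href="/">Home</a> <a href="/about">About</a></p></div>
  <div id="content">
    <p>Readability extraction tries to find the main article text, ignoring menus, ads, and footers.</p>
    <p>Long, comma-rich paragraphs that share a parent are a strong hint.</p>
  </div>
  <div id="footer"><p>Copyright notice</p></div>
</body></html>
"""

if __name__ == "__main__":
    start = time.perf_counter()
    text = extract_main_text(SAMPLE)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(text)
    print(f"extracted in {elapsed_ms:.2f} ms")
```

The sketch makes a single pass over the document, which is the kind of design that should help with the <100ms target, though real pages would need to be measured. It also assumes reasonably well-formed paragraphs; a real implementation would have to handle unclosed `<p>` tags, text outside paragraphs, and class/id hints.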

