Possible areas for further optimization #4
Comments
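Thanks for the suggestions, I'll look into optimizing this.
Maybe the syllable breaking could be done like this? s = '-tangbing-bing-tb-b'
I've published the pinyin-match module, which solves the word segmentation and long polyphonic string problems. Please give it your support too~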
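One way to read the `s = '-tangbing-bing-tb-b'` idea, as a rough sketch only: store every suffix that starts on a syllable boundary (spelled out in full, then as initials), join them with `-`, and require a query to appear directly after a `-`. The helper names `buildIndex` and `matches` are made up for illustration and are not part of pinyin-engine; the sketch also assumes queries contain only letters.

```javascript
// Hypothetical helper: build a '-'-delimited index from the syllables of one word.
// Each entry is a suffix of the word that starts on a syllable boundary,
// first spelled out in full, then as initials only.
function buildIndex(syllables) {
  const full = [];
  const initials = [];
  for (let i = 0; i < syllables.length; i++) {
    const rest = syllables.slice(i);
    full.push(rest.join(''));
    initials.push(rest.map((s) => s[0]).join(''));
  }
  return '-' + full.concat(initials).join('-');
}

// Hypothetical helper: a query matches only if it begins right after a
// delimiter, i.e. on a syllable boundary.
function matches(index, query) {
  return query.length > 0 && index.includes('-' + query);
}

const s = buildIndex(['tang', 'bing']);
console.log(s);                  // -tangbing-bing-tb-b
console.log(matches(s, 'tb'));   // true
console.log(matches(s, 'bing')); // true
console.log(matches(s, 'gb'));   // false
```

With this layout `gb` no longer matches, because no entry starts with it. Mixed queries such as `tbing` (an initial followed by a full syllable) would not match either, so this is probably only a starting point.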
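1. Breaking pinyin at syllable boundaries would match search habits better.
For example, 糖饼 (tang bing) can currently also be found by typing gb; this could be handled in participle().
2. The matching algorithm as a whole (mainly the step that builds the combinations, which is effectively a Cartesian product?) may need optimizing. I don't have a good idea for this yet; I just noticed that the WeChat app handles it very well.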
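pinyin-engine has a problem when handling long runs of polyphonic characters, for example:
‘曾大曾大曾大曾大曾大曾大曾大曾大曾大曾大曾大曾大’ (zeng ceng, da dai tai). There are 20 polyphonic characters here in total; each 曾大 pair has 2 × 3 = 6 readings, which gives 6^10 combinations. This blows up memory and freezes the browser. A test with 16 characters took nearly one second to process (Chrome 61).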
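To put numbers on the growth, here is a small sketch that multiplies the reading counts per character instead of materializing the combinations. The `READINGS` table only holds the two characters from the example, and `countCombinations` is a hypothetical helper, not a pinyin-engine function.

```javascript
// Readings per character, taken from the example above (illustrative table only).
const READINGS = { '曾': ['zeng', 'ceng'], '大': ['da', 'dai', 'tai'] };

// The number of pinyin combinations is the product of the reading counts.
// Characters missing from the table are treated as having a single reading.
function countCombinations(text, readings) {
  let total = 1;
  for (const ch of text) {
    total *= (readings[ch] || [ch]).length;
  }
  return total;
}

console.log(countCombinations('曾大', READINGS));            // 6
console.log(countCombinations('曾大'.repeat(8), READINGS));  // 1679616 (16 characters)
console.log(countCombinations('曾大'.repeat(10), READINGS)); // 60466176 (20 characters)
```

Going from 16 to 20 characters multiplies the work by 6^2 = 36, which fits the jump from "about a second" to a frozen tab.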
As a stopgap, I suggest limiting the number of polyphonic characters that get processed.
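A minimal sketch of what such a cap might look like, assuming the combinations are built as a plain Cartesian product. `expandReadings`, the `maxPolyphones` parameter, and its default of 8 are all hypothetical; once the cap is reached, later polyphonic characters keep only their first reading.

```javascript
// Illustrative readings table (only the characters from the example).
const SAMPLE_READINGS = { '曾': ['zeng', 'ceng'], '大': ['da', 'dai', 'tai'] };

// Hypothetical capped expansion: build the Cartesian product of readings,
// but only branch on the first `maxPolyphones` polyphonic characters.
function expandReadings(text, readings, maxPolyphones = 8) {
  let results = [[]];
  let polyphones = 0;
  for (const ch of text) {
    let options = readings[ch] || [ch];
    if (options.length > 1) {
      polyphones++;
      // Past the cap, keep only the first reading so the result set stops growing.
      if (polyphones > maxPolyphones) options = options.slice(0, 1);
    }
    const next = [];
    for (const prefix of results) {
      for (const option of options) {
        next.push(prefix.concat(option));
      }
    }
    results = next;
  }
  return results;
}

console.log(expandReadings('曾大曾大', SAMPLE_READINGS, 2).length);       // 6
console.log(expandReadings('曾大'.repeat(10), SAMPLE_READINGS).length);   // 1296
```

The trade-off is recall: alternative readings of characters beyond the cap can no longer be matched, so this only bounds the cost rather than fixing the underlying algorithm.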