Model loads successfully, but words cannot be vectorized because of an encoding problem #23

Open
xuexingdong opened this issue Oct 20, 2016 · 4 comments

Comments

@xuexingdong

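I ran into a strange bug: the code runs fine inside Eclipse, but once the project is packaged into a jar with Maven and run from the console, the model can no longer vectorize words. After a lot of debugging I found the following fix.
In the readString method of Word2vec.java, change
sb.append(new String(bytes));

sb.append(new String(bytes, 0, i + 1));
to
sb.append(new String(bytes, "UTF-8"));

sb.append(new String(bytes, 0, i + 1, "UTF-8"));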
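A likely explanation for the Eclipse/console difference: `new String(bytes)` decodes with the JVM's default charset. Eclipse usually launches programs with the project's encoding, while a console JVM may default to something else (before Java 18 the default followed the operating system locale). The sketch below is not taken from Word2vec.java. The class and method names are hypothetical. It only shows that decoding UTF-8 bytes with an explicit charset gives the same result on every machine. It uses `StandardCharsets.UTF_8`, which avoids the checked `UnsupportedEncodingException` thrown by the `"UTF-8"` string overload.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the project discussed in this issue.
final class CharsetDecodeDemo {

    // Decodes a slice of the buffer as UTF-8, regardless of JVM settings.
    static String decodeUtf8(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }

    // Decodes the whole buffer with the given charset, to simulate a
    // JVM whose default charset is not UTF-8.
    static String decodeWith(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file itself encoding-independent.
        String word = "\u4e2d\u56fd";
        byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("explicit UTF-8 matches: "
                + decodeUtf8(utf8, 0, utf8.length).equals(word));
        System.out.println("ISO-8859-1 matches: "
                + decodeWith(utf8, StandardCharsets.ISO_8859_1).equals(word));
    }
}
```

If the decoded word differs from the key stored in the vocabulary, the lookup fails, which would match the symptom of words not being vectorized.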

@ansjsun
Member

ansjsun commented Mar 4, 2017

thx

@dhaimeng

dhaimeng commented Dec 6, 2017

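Hello, I am using a binary .bin model pretrained by Google, and it fails in the readFloat method of Word2vec.java: the line byte[] bytes = new byte[4] throws an out-of-memory error. After I changed it to byte[50], the line byte b = dis.readByte() in readString() throws java.io.EOFException.
I then raised MAX_SIZE to 100, but that did not help either. What could the problem be?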
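For context, a float in a word2vec binary file occupies exactly 4 bytes, so enlarging that buffer would likely make each read consume too much of the stream and run into the end of the file early. The following is a minimal sketch, not the project's readFloat. It assumes the floats are stored little-endian, as written by the original C tool on x86, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not the Word2vec.java implementation.
final class LittleEndianFloatReader {

    // Interprets the 4 bytes starting at offset as a little-endian float.
    static float floatAt(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getFloat();
    }

    // Reads exactly 4 bytes from the stream and decodes them.
    // readFully throws EOFException if fewer than 4 bytes remain.
    static float readFloat(DataInputStream in) throws IOException {
        byte[] bytes = new byte[4];
        in.readFully(bytes);
        return floatAt(bytes, 0);
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny in-memory "file" holding two little-endian floats.
        byte[] data = ByteBuffer.allocate(8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putFloat(1.5f)
                .putFloat(-2.25f)
                .array();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        System.out.println(readFloat(in));
        System.out.println(readFloat(in));
    }
}
```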

@ansjsun
Member

ansjsun commented Dec 9, 2017

Please post your code. If it is convenient, send the model to my email as well.

@dhaimeng

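The problem is solved, thank you. The cause was that I had only changed Eclipse's .ini configuration file and had not changed the JVM memory allocated to the project itself. After increasing that, the program ran successfully.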
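The heap settings in eclipse.ini apply to the JVM that runs the IDE itself. A program launched from Eclipse gets its own JVM, whose heap is set through the VM arguments of its run configuration (for example an `-Xmx` value). A small sketch for checking which limit the program actually received is shown below. The class name, method names, and the 4096 MiB threshold are illustrative assumptions, not values from this thread.

```java
// Hypothetical helper for checking the heap limit of the running JVM.
final class HeapCheck {

    // Maximum heap the current JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // True if the heap limit is at least the given number of MiB.
    static boolean fits(long requiredMiB) {
        return maxHeapMiB() >= requiredMiB;
    }

    public static void main(String[] args) {
        // 4096 is an arbitrary example threshold; pick one based on model size.
        long requiredMiB = 4096;
        System.out.println("max heap (MiB): " + maxHeapMiB());
        System.out.println("enough for " + requiredMiB + " MiB: " + fits(requiredMiB));
    }
}
```

Running this once from Eclipse and once from the console shows whether both launches actually use the intended `-Xmx` setting.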
