This repository has been archived by the owner on May 14, 2020. It is now read-only.

Training on large corpora is extremely slow, any way to parallelize the pattern detector? #17

Open · jhashemi opened this issue Nov 1, 2014 · 5 comments


jhashemi commented Nov 1, 2014

No description provided.


nabilblk commented Nov 2, 2014

+1, training is extremely slow.

kbastani (Owner) commented Nov 3, 2014

Can you provide your memory configuration? Please copy and paste the relevant properties from neo4j.properties in the Neo4j conf directory.

Recommended memory settings are below:

neostore.nodestore.db.mapped_memory=512M
neostore.relationshipstore.db.mapped_memory=2048M
neostore.propertystore.db.mapped_memory=1024M
neostore.propertystore.db.strings.mapped_memory=500M
neostore.propertystore.db.arrays.mapped_memory=500M

This configuration assumes you have at least 8GB of available system memory.
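
Note that the mapped_memory settings above are off-heap file buffers; the JVM heap is sized separately. A plausible companion setting in conf/neo4j-wrapper.conf (values illustrative, assuming Neo4j 2.x on the 8GB machine above; both values are in MB):

wrapper.java.initmemory=2048
wrapper.java.maxmemory=2048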

jhashemi (Author) commented

That definitely helped training, but now classification takes upwards of three minutes per entity. This is on an HA cluster.

kbastani (Owner) commented

Glad to hear it helped training. I'm going to need more information about your dataset in order to get you fixed up. You can reach me on Skype at kenny.bastani or e-mail [email protected].

letronje commented

Using the recommended memory settings above certainly improves training speed (most requests are sub-second). Classification requests still take anywhere between 15 and 30 seconds. Any way to speed them up? Also, if multiple classify requests are sent in parallel, the server throws a 500.
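
Since the server returns a 500 when classification requests arrive in parallel, one client-side workaround is to serialize them. A minimal Python sketch, assuming the extension is mounted at /service/graphify/classify and accepts a JSON body with a "text" field (both the path and the body shape are assumptions, not confirmed in this thread):

import threading
import requests

# Hypothetical endpoint; the actual path depends on how the extension
# is registered in neo4j-server.properties.
CLASSIFY_URL = "http://localhost:7474/service/graphify/classify"

# Concurrent classify calls trigger 500s, so funnel every request
# through a single lock instead of hitting the server in parallel.
_classify_lock = threading.Lock()

def classify(text, timeout=120):
    # One request at a time; raise on HTTP errors so callers can retry.
    with _classify_lock:
        resp = requests.post(CLASSIFY_URL, json={"text": text}, timeout=timeout)
        resp.raise_for_status()
        return resp.json()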
