Commit
Have the `.tmp` files in the index directory
Reason: The merge was very SLOW when these files were in the vocabulary directory, which for our UniProt index builds is on HDD (because the external vocabulary is so large). I first tried to have only the `.tmp.partial-vocabulary.words` files in the index directory, but that was still slow. Now the `.tmp.partial-vocabulary.ids` files are in the index directory as well.

Explanations concerning SLOW: The merging of the first few 100M triples is fast (30 seconds per 100M triples). Then it becomes slow and then very slow (half an hour from 700M triples to 800M triples). Not only is the merge itself slow, but doing other things on the machine (like writing something in an editor with autosave on) also becomes very slow to respond, which is a clear sign that the random accesses to HDD are the problem.

NOTE: With the partial solution, where `.tmp.partial-vocabulary.words` are on SSD and `.tmp.partial-vocabulary.ids` are on HDD, it is not as bad. There was a very significant slow-down from 700M to 1100M triples, but after that merging was fast again (though not as fast as in the beginning). At the time of this writing, I have only observed the merge up to 1700M triples; stay tuned for more information.
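To put the quoted timings in perspective, here is a rough back-of-the-envelope calculation of the merge throughput in the fast and the very slow phase. It is only a sketch: "half an hour" is assumed to mean exactly 1800 seconds, and the constant and function names are made up for illustration.

```python
# Timings taken from the commit message; "half an hour" is
# ASSUMED to be exactly 1800 seconds for this estimate.
TRIPLES_PER_STEP = 100_000_000   # one step = 100M triples
FAST_SECONDS = 30                # first few 100M triples
SLOW_SECONDS = 30 * 60           # 700M -> 800M triples on HDD


def triples_per_second(triples: int, seconds: int) -> float:
    """Average merge throughput over one step."""
    return triples / seconds


fast_rate = triples_per_second(TRIPLES_PER_STEP, FAST_SECONDS)
slow_rate = triples_per_second(TRIPLES_PER_STEP, SLOW_SECONDS)
slowdown = SLOW_SECONDS / FAST_SECONDS

print(f"fast phase: {fast_rate:,.0f} triples/s")
print(f"slow phase: {slow_rate:,.0f} triples/s")
print(f"slowdown:   {slowdown:.0f}x")
```

Under these assumptions the merge drops from roughly 3.3M triples per second to roughly 56K triples per second, a factor of about 60.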
Hannah Bast committed Jan 27, 2024
1 parent 728c8a7, commit 6799a37
Showing 6 changed files with 27 additions and 23 deletions.