Batching Graphs in PygPCQM4MDataset #149
edwardelson started this conversation in PCQM4M-LSC
Replies: 1 comment 2 replies
Hi! Yes, we thought about that option, but we think 8GB should be manageable for the RAM on most servers.
Hi, thanks for preparing the processing code!

I was wondering whether batching the SMILES graphs into separate torch files would be a feasible way to reduce the memory requirement. I notice that in the `process()` function of `class PygPCQM4MDataset(InMemoryDataset)`, the graphs obtained from the SMILES strings are all combined into a single dataset and subsequently `torch.save`'d into one file (only to be split again later into different `dataloader`s during training and testing?).

Since all of the graphs are independent of each other, would it be possible to save them into a couple of torch files, each holding a batch of several graph `data` objects, to reduce the RAM requirement?

Thanks!
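To make the idea concrete, here is a rough sketch of what such sharding could look like. It is not the OGB implementation: the helper names (`save_in_shards`, `iter_graphs`, `toy_graphs`) are hypothetical, plain dicts stand in for the PyG graph objects, and `pickle` stands in for `torch.save` / `torch.load` so the snippet runs with the standard library alone.

```python
import pickle
import tempfile
from pathlib import Path


def _write_shard(shard, out_dir, index):
    """Write one list of graphs to its own file and return the file path."""
    path = out_dir / f"shard_{index:05d}.pkl"
    with open(path, "wb") as f:
        pickle.dump(shard, f)  # assumption: stand-in for torch.save(shard, path)
    return path


def save_in_shards(graphs, out_dir, shard_size):
    """Consume an iterable of graphs, holding at most `shard_size` in memory."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths, shard = [], []
    for graph in graphs:
        shard.append(graph)
        if len(shard) == shard_size:
            paths.append(_write_shard(shard, out_dir, len(paths)))
            shard = []
    if shard:  # flush the final, possibly smaller, shard
        paths.append(_write_shard(shard, out_dir, len(paths)))
    return paths


def iter_graphs(paths):
    """Yield graphs one by one, loading a single shard at a time."""
    for path in paths:
        with open(path, "rb") as f:
            shard = pickle.load(f)  # assumption: stand-in for torch.load(path)
        yield from shard


def toy_graphs(n):
    """Hypothetical placeholder for converting SMILES strings to graphs."""
    for i in range(n):
        yield {"num_nodes": i + 1, "y": float(i)}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        shard_paths = save_in_shards(toy_graphs(10), tmp, shard_size=4)
        print(len(shard_paths), "shards written")
        print(sum(1 for _ in iter_graphs(shard_paths)), "graphs read back")
```

Because the graphs are produced by a generator and written as soon as a shard fills up, peak memory is bounded by the shard size rather than by the full dataset. The trade-off is that random access across shards (for example, shuffling over the whole dataset) would need extra bookkeeping that this sketch leaves out.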