Allow compute to return a generator instead of chunks #751
I won't be able to fix the CodeFactor complaint about an overly complex method, so I propose merging without fixing it.
This is essentially just online rechunking, right?
I guess that from the logic of |
Yes.
Hmm, I'm not sure this will work. As Dacheng pointed out, the overflow happens while the tasks in the compute method are being performed, so compute must yield early. For example, in wfsim and fuse we transform information about photons detected by a PMT (time, channel) into raw_records. This suddenly blows up the information from ~10 bytes/photon to ~10 + 220 bytes/photon.
I see your point; I am also not 100% happy with this solution. An alternative would be to develop a dedicated ChunkDown plugin class with its own do_compute and iter methods (which is of course much more work). But sure, let us discuss this in team A.
What is the problem / what does the code in this PR do
A slight modification that allows a plugin to yield chunks instead of returning them.
This makes it possible to reduce the chunk size while creating new data. This is not needed for normal data processing, but it is for simulations. A simulation starts with a small list of photons, which are then turned into pulses and fragments in which each photon takes up 110 x 16 bit of data. In such a case it is helpful to yield multiple smaller chunks for a single larger input chunk.
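To make the size increase concrete, here is a rough back-of-the-envelope sketch. The 110 x 16 bit figure comes from the description above and the ~10 bytes per input photon from the discussion; the function names and the target chunk size are hypothetical and only serve as illustration.

```python
# Rough estimate of how much a chunk of photons grows once each photon
# is expanded into a waveform fragment.
SAMPLES_PER_PHOTON = 110      # from the PR description
BYTES_PER_SAMPLE = 2          # 16 bit per sample
PHOTON_INPUT_BYTES = 10       # approximate value quoted in the discussion


def output_bytes(n_photons):
    """Approximate output size in bytes for n_photons input photons."""
    per_photon = PHOTON_INPUT_BYTES + SAMPLES_PER_PHOTON * BYTES_PER_SAMPLE
    return n_photons * per_photon


def n_output_chunks(n_photons, target_chunk_bytes):
    """Number of output chunks needed to stay below target_chunk_bytes
    (hypothetical target, rounded up)."""
    return -(-output_bytes(n_photons) // target_chunk_bytes)


# One photon grows from ~10 bytes to ~230 bytes, a factor of ~23.
growth_factor = output_bytes(1) / PHOTON_INPUT_BYTES
```

Under these assumptions, an input chunk that comfortably fits in memory can easily produce output that needs to be split into several chunks.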
Since the chunking needs to be done while computing, the plugin's compute method needs to yield the data early. An example plugin can be found in testutils.
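The idea can be sketched with plain NumPy, independent of the actual plugin API. This is not the example from testutils and not the implementation in this PR; the dtypes, the `photons_to_records` helper, and the `max_photons_per_piece` parameter are all hypothetical and only illustrate a compute function that yields several small outputs for one large input.

```python
import numpy as np

SAMPLES_PER_RECORD = 110  # from the PR description

# Hypothetical dtypes: ~10 bytes per photon, ~10 + 220 bytes per record.
photon_dtype = np.dtype([("time", np.int64), ("channel", np.int16)])
record_dtype = np.dtype([
    ("time", np.int64),
    ("channel", np.int16),
    ("data", np.int16, (SAMPLES_PER_RECORD,)),
])


def photons_to_records(photons):
    """Expand each photon into one record (placeholder waveform)."""
    records = np.zeros(len(photons), dtype=record_dtype)
    records["time"] = photons["time"]
    records["channel"] = photons["channel"]
    records["data"][:, 0] = 1  # dummy pulse shape, for illustration only
    return records


def compute(photons, max_photons_per_piece=1000):
    """Yield records piece by piece instead of returning them all at once,
    so only one piece of the large output is built at a time."""
    for start in range(0, len(photons), max_photons_per_piece):
        yield photons_to_records(photons[start:start + max_photons_per_piece])
```

A caller would iterate over `compute(photons)` and store each piece before requesting the next one, which keeps the peak memory bounded by the piece size rather than by the full output.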
This change will help fuse avoid the out-of-memory issues we were facing with wfsim, leading to more reliable and stable performance.