-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft: Enabling training on MCPE pulses #29
base: master
Are you sure you want to change the base?
Conversation
@mhuen I would be interested to know what is missing for potentially merging this. There are probably a few shortcoming in the current implementation. For example: I allow to read MCPE pulses from hdf5 files, the function which read data from i3 files (i.e. frames) are not compatible yet. |
@@ -876,7 +868,7 @@ def train( | |||
# -------------------------- | |||
# evaluate on validation set | |||
# -------------------------- | |||
if step % opt_config["validation_frequency"] == 0: | |||
if step % opt_config["validation_frequency"] == 0 and step: # not validate on the fist iteration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually find this first point very interesting and important, since it provides the baseline of the untrained model
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have, by the way, also tweaked the data input pipeline a bit in one of the branches feeding into the CollectBreakingChanges
. Some of the work is now distributed further to the worker nodes, such that it is more easily parallelized, which will speed up the data pipeline. If the reason for you to make this change was to reduce the setup time until training starts, you could use this newer branch and reduce the number of additional files that you require per event pool in the validation data iterator or increase the number of workers to match the number of files. That way you only need one iteration of file loading until the queue is populated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Okay I will revert this than.
file_or_frame, *args, **kwargs | ||
) | ||
file_or_frame, *args, | ||
label_key=self.label_module.configuration.config["label_key"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to remain in the kwargs. Modules are not forced to specify/use a label_key
. But I also have a slightly different implementation in the CollectBreakingChanges
branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite sure that I fully understand why it hace to remain in kwargs.
I am also in the process of some larger changes in event-generator, including some necessary breaking changes. This is currently being collected in the branch What I have done so far when training on MCPEs, was simply to save the MCPE as reco pulses in hdf5 format. Which is also the workaround I chose for other datatypes that didn't have an icetray hdf converter setup yet. In any case, I am in the process of implementing a number of updates. And since these will necessarily break compatibility to older models, I am also cleaning up the code a little bit, and removing old parts that were added for compatibility. Perhaps it's best to merge these changes to that branch? Then one can also clean up some of the stuff, when not having to worry about backwards compatibility |
Just a note, we do successful train models on MCPEs. If you meant with changes "down the line" might be necessary to train them, this is working already (the question if the training can be as good as possible might be a different one). |
In the context of larger changes. So far it seems that the asymmetric gaussians are hardcoded in many places. I would like to test a triple pandel function instead. I think the time pdf model could be made flexible but it is a more inversive change I think. Were you ever planing of doing this? |
It's more about once the model is exported and applied later on. Say you want to test running it on reco pulses or so. Does that work? |
I was thinking about testing this. But yeah, this would involve a bit of restructuring of the models. I think the rest of the code should be modular enough that it doesn't matter for them. But the current base source model and derived classes assume the asymmetric Gaussians |
I think the base class is actually fine. It's just the utility functions (pdf, cdf) that assume it. But these are only needed for debugging purposes. Training and application of the model itself uses the |
What do you mean with running it on reco pulses? We have not done much with the models yet. Only started evaluating the prediction of the total charge against MC and photonics. |
I will leave in a week to Greenland. I probably wont be able to do something in this direction before but I can have a look afterwards. |
Do you remember if the triple pandel function had an analytic CDF? Or at least if it's easy to evaluate with existing tensorflow functions? (We need to have the gradients) |
With this PR one can use event-generator to train models on MCPE pulses and hence use the generative models for simulations.
In addition to the new feature, this PR introduces a small number of small improvements. For example: