Draft: Enabling training on MCPE pulses #29

fschlueter · 2024-06-04T11:07:06Z

With this PR one can use event-generator to train models on MCPE pulses and hence use the generative models for simulations.

In addition to the new feature, this PR introduces a few small improvements. For example:

Specification of the label key in the config file
Specification of the source hypothesis parameter names in the config
Calculating pdf, cdf, and probability quantiles per DOM

fschlueter · 2024-06-04T11:11:43Z

@mhuen I would be interested to know what is missing for potentially merging this. There are probably a few shortcomings in the current implementation. For example, I allow reading MCPE pulses from hdf5 files, but the functions that read data from i3 files (i.e. frames) are not compatible yet.

mhuen · 2024-06-06T08:05:53Z

egenerator/manager/base.py

@@ -876,7 +868,7 @@ def train(
            # --------------------------
            # evaluate on validation set
            # --------------------------
-            if step % opt_config["validation_frequency"] == 0:
+            if step % opt_config["validation_frequency"] == 0 and step:  # do not validate on the first iteration


I actually find this first point very interesting and important, since it provides the baseline of the untrained model.

By the way, I have also tweaked the data input pipeline a bit in one of the branches feeding into CollectBreakingChanges. Some of the work is now distributed further to the worker nodes so that it parallelizes more easily, which speeds up the data pipeline. If you made this change to reduce the setup time until training starts, you could use this newer branch and either reduce the number of additional files required per event pool in the validation data iterator or increase the number of workers to match the number of files. That way, only one iteration of file loading is needed before the queue is populated.

Great! Okay, I will revert this then.
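
For reference, the effect of the `and step` guard discussed above can be illustrated with a small, self-contained sketch. The function `validation_steps` and its `skip_first` flag are hypothetical and not part of event-generator; they only mimic the modulo check from the diff.

```python
def validation_steps(num_steps, validation_frequency, skip_first=False):
    """Return the training steps at which validation would run.

    Hypothetical helper: it reproduces the modulo check from the diff,
    with `skip_first=True` standing in for the added `and step` guard.
    """
    steps = []
    for step in range(num_steps):
        if step % validation_frequency != 0:
            continue
        if skip_first and step == 0:
            # With the guard, the untrained-model baseline is not recorded.
            continue
        steps.append(step)
    return steps


# Original behaviour: step 0 is validated, giving the untrained baseline.
print(validation_steps(10, 4))                   # [0, 4, 8]
# With the guard: the first validation only happens at step 4.
print(validation_steps(10, 4, skip_first=True))  # [4, 8]
```

Keeping step 0 costs one extra validation pass but records the loss of the untrained model as a reference point.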

mhuen · 2024-06-06T08:15:33Z

egenerator/data/handler/modular.py

-                file_or_frame, *args, **kwargs
-            )
+                file_or_frame, *args,
+                label_key=self.label_module.configuration.config["label_key"],


I think this needs to remain in the kwargs. Modules are not forced to specify/use a label_key. But I also have a slightly different implementation in the CollectBreakingChanges branch.

I am not quite sure that I fully understand why it has to remain in kwargs.
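
One possible reading of the concern, sketched under assumptions: if the handler always looks up `label_key` in the label module's configuration, any module that does not define that key fails, whereas forwarding it through `**kwargs` leaves the choice to the caller. All class, function, and key names below (`LabelModuleWithKey`, `get_data`, `MCPELabels`, and so on) are hypothetical and do not reflect the actual event-generator API.

```python
class LabelModuleWithKey:
    """Hypothetical label module that is configured with a label key."""

    def __init__(self):
        self.config = {"label_key": "MCPELabels"}  # made-up key name

    def get_data(self, file_or_frame, *args, label_key=None, **kwargs):
        return file_or_frame, label_key


class LabelModuleWithoutKey:
    """Hypothetical label module that has no notion of a label key."""

    def __init__(self):
        self.config = {}

    def get_data(self, file_or_frame, *args, **kwargs):
        return file_or_frame, None


def forward_explicit(module, file_or_frame, *args, **kwargs):
    # Hard-coded lookup: raises KeyError for modules without a label key.
    return module.get_data(
        file_or_frame, *args, label_key=module.config["label_key"], **kwargs
    )


def forward_kwargs(module, file_or_frame, *args, **kwargs):
    # Plain forwarding: the handler makes no assumption about label_key.
    return module.get_data(file_or_frame, *args, **kwargs)


def load(module, file_or_frame):
    # The caller passes label_key only when the module actually defines it.
    kwargs = {}
    if "label_key" in module.config:
        kwargs["label_key"] = module.config["label_key"]
    return forward_kwargs(module, file_or_frame, **kwargs)


print(load(LabelModuleWithKey(), "events.hdf5"))
print(load(LabelModuleWithoutKey(), "events.hdf5"))
```

In this sketch `forward_explicit` works only for the first module, while `load` works for both.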

mhuen · 2024-06-06T08:17:53Z

I am also in the process of making some larger changes in event-generator, including some necessary breaking changes. These are currently being collected in the branch CollectBreakingChanges. I mention this here because some of these features are also added there, such as the variable label key name. The MCPE changes may have some further-reaching implications, as the pulse type may differ when applying the model. So one would have to think this through and see whether it would work down the line and/or whether there are additional places in the code that need adjustment. I do see that you added it as a mutable setting, which is good, but there may be other necessary modifications.

What I have done so far when training on MCPEs was simply to save the MCPEs as reco pulses in hdf5 format, which is also the workaround I chose for other data types that didn't have an icetray hdf converter set up yet. In any case, I am in the process of implementing a number of updates. Since these will necessarily break compatibility with older models, I am also cleaning up the code a little and removing old parts that were added for compatibility. Perhaps it's best to merge these changes into that branch? Then one can also clean up some of the code without having to worry about backwards compatibility.

fschlueter · 2024-06-06T08:30:10Z

Just a note: we do successfully train models on MCPEs. If by changes "down the line" you meant changes that might be necessary to train them, this is working already (whether the training is as good as it could be might be a different question).

fschlueter · 2024-06-06T08:31:59Z

In the context of larger changes: so far it seems that the asymmetric Gaussians are hardcoded in many places. I would like to test a triple Pandel function instead. I think the time pdf model could be made flexible, but it is a more invasive change. Were you ever planning to do this?

mhuen · 2024-06-06T08:32:01Z

It's more about once the model is exported and applied later on. Say you want to test running it on reco pulses or so. Does that work?

mhuen · 2024-06-06T08:35:47Z

In the context of larger changes: so far it seems that the asymmetric Gaussians are hardcoded in many places. I would like to test a triple Pandel function instead. I think the time pdf model could be made flexible, but it is a more invasive change. Were you ever planning to do this?

I was thinking about testing this. But yeah, this would involve a bit of restructuring of the models. I think the rest of the code should be modular enough that it doesn't matter there. But the current base source model and derived classes assume the asymmetric Gaussians.

mhuen · 2024-06-06T08:39:10Z

I think the base class is actually fine. It's just the utility functions (pdf, cdf) that assume it, and these are only needed for debugging purposes. Training and application of the model itself use the get_tensors method, and all this requires is a result tensor named pulse_pdf to which the evaluated PDF for each pulse/pe is written. So one should be able to test this fairly easily by simply adding a derived class that uses a different mixture basis.
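
A rough, framework-free sketch of that idea follows. It is not the event-generator class hierarchy: `BaseSource`, `component_pdf`, and the simplified `get_tensors` signature are hypothetical stand-ins, and plain Python floats replace tensors. The only detail taken from the discussion is that the result carries an entry named `pulse_pdf`.

```python
import math


class BaseSource:
    """Hypothetical base class: mixes components into a per-pulse PDF."""

    def component_pdf(self, t, params):
        raise NotImplementedError

    def get_tensors(self, pulse_times, weights, component_params):
        # Evaluate the weighted mixture for every pulse time.
        pulse_pdf = [
            sum(w * self.component_pdf(t, p)
                for w, p in zip(weights, component_params))
            for t in pulse_times
        ]
        return {"pulse_pdf": pulse_pdf}


class AsymmetricGaussianSource(BaseSource):
    """Mixture basis: Gaussian with width sigma left and sigma * r right of mu."""

    def component_pdf(self, t, params):
        mu, sigma, r = params
        norm = math.sqrt(2.0 / math.pi) / (sigma * (1.0 + r))
        width = sigma if t < mu else sigma * r
        return norm * math.exp(-0.5 * ((t - mu) / width) ** 2)


class GammaSource(BaseSource):
    """Alternative mixture basis: gamma distribution with shape and rate."""

    def component_pdf(self, t, params):
        shape, rate = params
        if t <= 0.0:
            return 0.0
        log_pdf = (shape * math.log(rate) + (shape - 1.0) * math.log(t)
                   - rate * t - math.lgamma(shape))
        return math.exp(log_pdf)


times = [0.5, 1.0, 2.0]
print(AsymmetricGaussianSource().get_tensors(times, [1.0], [(1.0, 1.0, 2.0)]))
print(GammaSource().get_tensors(times, [0.5, 0.5], [(1.0, 2.0), (2.0, 1.0)]))
```

Only `component_pdf` differs between the two derived classes; everything that consumes `pulse_pdf` stays unchanged.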

fschlueter · 2024-06-06T08:41:26Z

It's more about once the model is exported and applied later on. Say you want to test running it on reco pulses or so. Does that work?

What do you mean by running it on reco pulses? We have not done much with the models yet. We have only started evaluating the prediction of the total charge against MC and photonics.

fschlueter · 2024-06-06T08:43:08Z

I think the base class is actually fine. It's just the utility functions (pdf, cdf) that assume it, and these are only needed for debugging purposes. Training and application of the model itself use the get_tensors method, and all this requires is a result tensor named pulse_pdf to which the evaluated PDF for each pulse/pe is written. So one should be able to test this fairly easily by simply adding a derived class that uses a different mixture basis.

I will leave for Greenland in a week. I probably won't be able to do anything in this direction before then, but I can have a look afterwards.

mhuen · 2024-06-06T08:44:54Z

Do you remember whether the triple Pandel function has an analytic CDF, or at least whether it is easy to evaluate with existing TensorFlow functions? (We need to have the gradients.)
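
As a hedged note on this question: assuming each Pandel component has the form of a gamma distribution in time (shape `a`, rate `rho`), its CDF is the regularized lower incomplete gamma function P(a, rho * t), and a mixture of three such components has the weighted sum of these as its CDF. TensorFlow exposes this function as `tf.math.igamma`; gradient support for the shape argument should be checked for the TensorFlow version in use. The sketch below evaluates the same function with a plain series in pure Python; the names `regularized_lower_gamma`, `pandel_cdf`, and `mixture_cdf` are illustrative and not taken from event-generator.

```python
import math


def regularized_lower_gamma(a, x, tol=1e-12, max_terms=10000):
    """P(a, x) via the series e^-x x^a / Gamma(a) * sum_n x^n / (a (a+1) ... (a+n)).

    The series converges for all x > 0 but needs more terms as x grows.
    """
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if abs(term) < tol * abs(total):
            break
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def pandel_cdf(t, shape, rate):
    """CDF of one gamma-form component (assumed Pandel parametrization)."""
    return regularized_lower_gamma(shape, rate * t)


def mixture_cdf(t, weights, params):
    """CDF of a weighted mixture, e.g. three components for a 'triple' model."""
    return sum(w * pandel_cdf(t, a, rho) for w, (a, rho) in zip(weights, params))


# For shape 1 the gamma distribution is exponential: CDF = 1 - exp(-rate * t).
print(pandel_cdf(0.5, 1.0, 2.0), 1.0 - math.exp(-1.0))
print(mixture_cdf(1.0, [0.2, 0.3, 0.5], [(1.0, 2.0), (2.0, 1.0), (3.5, 0.7)]))
```

The closed forms for integer shapes (shape 1 and shape 2) give a quick consistency check for any replacement implementation.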

fschlueter and others added 29 commits April 9, 2024 08:50

Simplify code

e7a5ed9

Remove hard-coded label_key LabelsDeepLearning. Allow to train on MCPEs.

cb05b28

Fixed setting charge index. Slightly refactor code to avoid future warning

009d351

Add new config file for training on MCPEs

bd3a139

Keep training and testing data independent

0dc9634

Allow to specify parameter names in config

871c5c8

Propagate new argument correctly

cc2a8da

Update config for MCPE training

37abda5

Add missing config

10fadf1

Set number of iteration to 1M

4f4bc0c

Update MCPE config

1d2f440

Do not run validation in the first iteration

382fd55

Add function which evaluates pdf only for one DOM

8d51035

Add module to evaluate mcpe eg models

e0863ac

Clean up. Add correct time call to photonics function

3229861

Add cdf_per_dom function and get_probability_quantiles_per_dom

252a91f

Little clean up

b85d0ee

Store total charge per dom as well. Improve filename for figures

8d62758

Add function which only calculates total charge per dom

16f6cbc

fix ws

ea00a1d

Remove hard-coded label_key LabelsDeepLearning. Allow to train on MCPEs.

f1fb9e8

Fixed setting charge index. Slightly refactor code to avoid future warning

0c71958

Add new config file for training on MCPEs

bc591ae

Export to hdf5 rather than i3

84a74bd

Merge branch 'master' into enable_mcpe_training

22392b8

Allow evaluation in eager mode

d837934

Code refactoring

de144f8

Merge branch 'FixMultiLearningRateScheduler' into enable_mcpe_training

c7f9187

Improve code. access charge from hdf files by name and not index

dec3eb0

mhuen reviewed Jun 6, 2024
