-
Notifications
You must be signed in to change notification settings - Fork 270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A time clustering algorithm for image cleaning #2401
base: main
Are you sure you want to change the base?
A time clustering algorithm for image cleaning #2401
Conversation
ctapipe/image/cleaning.py
Outdated
|
||
X = np.column_stack((time[precut_mask] / t_scale, pix_x, pix_y)) | ||
|
||
db = DBSCAN(eps=eps, min_samples=minpts).fit(X) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
slightly shorter: labels = DBSCAN(...).fit_predict(X)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this also use sample_weights=image
? Otherwise you are using image only for the initial cut
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried sample_weight
before. I used this weighting:
It introduces two new parameters but I can add weights as an option
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not simply weight with the image itself? I.e. cluster the photo electrons?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not seem to make any difference though
Docs failure is fixed in main, please rebase |
be9949d
to
de7b550
Compare
Not sure if it's interesting, but they just introduced an improved DBSCAN algorithm in scikit-learn |
ctapipe/image/cleaning.py
Outdated
n_noise=self.noise_cut.tel[tel_id], | ||
d_scale=self.d_scale.tel[tel_id], | ||
t_scale=self.t_scale.tel[tel_id], | ||
pedestal=self.pedestal.tel[tel_id], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here you should really use the correct pedestal, not some hard-coded value. There is for example
ped = event.mon.tel[7].calibration.pedestal_per_sample
That is per channel, so you need to use the high/low gain switch mask. We probably should properly process simulated pedestals into DL1 information though, so you don't have to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I should change the name but what I need is the variance of the reconstructed noise per pixel in units of pe. Is that available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, i realized that just after posting... the pedestals themselves are not what you need. In the simulations currently we don't have the variances, but this is something we really need to fix. Either we should properly compute pedestals and not rely on the ones given in the sims (which are technically inputs, not outputs), or see if somewhere the variances are already included or ask for them to be included. That's not just or this PR, but for any "realistic" cleaning method, e.g. the Whipple-10m-style cleaning that is just tailcuts but with the thresholds expressed in pedestal variance units.
This should probably be a separate issue and PR. For now, what you can maybe do is just don't have these pedestal dispersions hard-coded in the class, but rather make them a TelescopeParameter that is configurable. That way we could adapt to multiple prods without having to edit the code. Once we have a "correct" way to read the pedestal dispersions for sims, we can then properly fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should definitely change the name, since "pedestal" is the mean value of the noise, not the dispersion. Here, you are assuming it's Gaussian, right (even though it's not)? And you are using it as a standard deviation, not variance, so maybe pedestal_std
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it also occurs to me that this threshold could also just be a threshold in PE for now (like the tailcuts), which would work fine for simulations, where the pedestal variances don't vary across the FOV or in time. Then later we upgrade it to use pedestal variances once we have those properly implemented for simulations.
This is the basic implementation of the method. I think we could finish it soon and then if we need to add some more details, which is possible, we could do so. I am just wondering about unit tests |
4856259
to
1f94003
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2401 +/- ##
=======================================
Coverage ? 92.48%
=======================================
Files ? 234
Lines ? 20005
Branches ? 0
=======================================
Hits ? 18502
Misses ? 1503
Partials ? 0 ☔ View full report in Codecov by Sentry. |
""" | ||
|
||
space_scale_m = FloatTelescopeParameter( | ||
default_value=0.25, help="Pixel space scaling parameter in m" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we determine the scale parameters from the information in CameraDescription
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure. That parameter needs optimization, it could 2x or 3x the pixel spacing of the camera and teh optimized number depends on the telescope
173c482
to
afc1704
Compare
I am not sure what happened here. @clara-escanuela What git commands did you run? I see issues like this one from time to time but never could figure out what mistake leads to these. The proper way of updating a branch is (assuming upstream points to cta-observatory/ctapipe and origin points to clara-escanuela/ctapipe)
I think could locally salvage this branch here by cherry-picking the commits that seem to be originally from @clara-escanuela on this branch after resetting the branch hard to
|
No description provided.