Waveform attributes #745
Changes from 5 commits: cc962f2, cea7435, 094dac8, 3f87c4c, 85ab06d, c7a265d
```python
@@ -100,3 +100,52 @@ def compute_widths(peaks, select_peaks_indices=None):
    i = len(desired_fr) // 2
    peaks['width'][select_peaks_indices] = fr_times[:, i:] - fr_times[:, ::-1][:, i:]
    peaks['area_decile_from_midpoint'][select_peaks_indices] = fr_times[:, ::2] - fr_times[:, i].reshape(-1, 1)


@export
#@numba.njit(nopython=True, cache=True)
def compute_wf_attributes(data, sample_length, n_samples: int, downsample_wf=False):
    """
    Compute waveform attributes.

    Quantiles: represent the amount of time elapsed for a given fraction
    of the total waveform area to be observed in n_samples, i.e. if
    n_samples = 10 the quantiles are equivalent to deciles.
    Waveforms: waveform downsampled to n_samples.

    :param data: waveform, e.g. peaks or peaklets
    :param n_samples: compute quantiles for a given number of samples
    :return: waveforms and quantiles of size n_samples
    """
    assert data.shape[0] == len(sample_length), "arrays must have the same size"

    num_samples = data.shape[1]

    waveforms = np.zeros((len(data), n_samples), dtype=np.float64)
    quantiles = np.zeros((len(data), n_samples), dtype=np.float64)

    # Cannot compute with more samples than the actual waveform has
    assert num_samples > n_samples, "cannot compute with more samples than the actual waveform"

    step_size = int(num_samples / n_samples)
    steps = np.arange(0, num_samples + 1, step_size)
    inter_points = np.linspace(0., 1. - (1. / n_samples), n_samples)
    cumsum_steps = np.zeros(n_samples + 1, dtype=np.float64)
    frac_of_cumsum = np.zeros(num_samples + 1)
    sample_number_div_dt = np.arange(0, num_samples + 1, 1)
    for i, (samples, dt) in enumerate(zip(data, sample_length)):
        if np.sum(samples) == 0:
            continue
        # reset buffers
        frac_of_cumsum[:] = 0
        cumsum_steps[:] = 0
        frac_of_cumsum[1:] = np.cumsum(samples)
        frac_of_cumsum[1:] = frac_of_cumsum[1:] / frac_of_cumsum[-1]
        cumsum_steps[:-1] = np.interp(inter_points, frac_of_cumsum, sample_number_div_dt * dt)
        cumsum_steps[-1] = sample_number_div_dt[-1] * dt
        quantiles[i] = cumsum_steps[1:] - cumsum_steps[:-1]

        if downsample_wf:
```
**Review thread on the downsampling block:**

**Reviewer:** This whole downsampling makes no sense to me.

**Author:** This is because the Bayes plugin needs the waveform but the SOM plugin does not. So if we don't need it, we can save computational time by not calculating it; but if you know of a better way to achieve this goal, I am more than happy to change it. As to your last point, I do not really understand what you are saying; can you elaborate a bit more?

**Reviewer:** Let's take a simple example:

```python
num_samples = 10
n_samples = 6
step_size = int(num_samples / n_samples)
steps = np.arange(0, num_samples + 1, step_size)
for j in range(n_samples):
    print(steps[j], ":", steps[j + 1])
```

which prints:

```
0 : 1
1 : 2
2 : 3
3 : 4
4 : 5
5 : 6
```

Does this look like correct downsampling to you?

**Author:** Ok, I think I know what you are saying now. This is dealing with waveforms, so `num_samples` will always be 200, but there are no safeguards against non-integer divisions of the data. We can make an assert statement to make sure `num_samples` is divisible by `n_samples`; I will make that change. The `waveforms[i] /= (step_size * dt)` I am now a bit confused about: the objective is to get rid of the `dt` dependence so we can compare all waveforms to each other, but I did some quick dimensional analysis and I think we should be multiplying by `dt` instead of dividing, since waveform amplitude is in units of PE/sample = PE/dt, so we should be multiplying the numerator by `dt`. @ahiguera-mx, since you worked on this more with the Bayes classification, is there something we are missing?

**Author:** Ok, we don't need the downsampling for now, so we can just remove that functionality and add an assert statement to make sure the data quantiles make sense. But the big question still is: given the difference between the functions, do we still want to try to merge them into one, or should I leave them as separate things?

**Reviewer:** I would try to either:
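The safeguard proposed in the thread (only allow `n_samples` values that divide the waveform length exactly) can be sketched as follows; this is a minimal illustration with small numbers in place of the real 200-sample waveforms, not the final implementation:

```python
import numpy as np

num_samples, n_samples = 10, 5  # in practice: 200 and e.g. 10

# Proposed safeguard: only allow exact integer downsampling,
# so every window covers step_size samples and the whole waveform is used.
assert num_samples % n_samples == 0, "n_samples must divide num_samples"

step_size = num_samples // n_samples
steps = np.arange(0, num_samples + 1, step_size)
for j in range(n_samples):
    print(steps[j], ":", steps[j + 1])  # 0 : 2, 2 : 4, ..., 8 : 10
```

With the divisibility assert in place, the windows tile the full waveform instead of stopping partway through as in the reviewer's `num_samples = 10, n_samples = 6` example.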
```python
            for j in range(n_samples):
                waveforms[i][j] = np.sum(samples[steps[j]:steps[j + 1]])
            waveforms[i] /= (step_size * dt)

    return quantiles, waveforms
```
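A quick way to sanity-check the quantile units discussed in this review: for a flat waveform, every quantile should equal (total duration / n_samples) in ns. A minimal standalone sketch, reproducing the quantile loop body outside the function (assuming numpy; variable names follow the diff):

```python
import numpy as np

# Flat waveform: 10 samples of equal amplitude, dt = 10 ns per sample.
num_samples, n_samples, dt = 10, 5, 10
samples = np.ones(num_samples)

# Reproduce the quantile computation from compute_wf_attributes.
inter_points = np.linspace(0., 1. - (1. / n_samples), n_samples)
frac_of_cumsum = np.zeros(num_samples + 1)
frac_of_cumsum[1:] = np.cumsum(samples) / np.sum(samples)
sample_times = np.arange(0, num_samples + 1) * dt  # elapsed time in ns

cumsum_steps = np.zeros(n_samples + 1)
cumsum_steps[:-1] = np.interp(inter_points, frac_of_cumsum, sample_times)
cumsum_steps[-1] = sample_times[-1]
quantiles = cumsum_steps[1:] - cumsum_steps[:-1]

print(quantiles)  # [20. 20. 20. 20. 20.] -> each fifth of the area takes 20 ns
```

Each quantile comes out as 20 ns (100 ns total / 5 quantiles), confirming that the output is in units of time, not samples.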
**Reviewer** (on the commented-out `@numba.njit` line): Does this work, or why is it commented out? In addition, is this not the same as what this function is already doing: `strax/strax/processing/peak_properties.py`, line 28 in 9b508f7?
**Author:** Hi Daniel, thanks for checking this; the functions calculate different things:

- `compute_index_of_fraction()` returns the index at which a certain area fraction has been reached; if the fraction is 1 it will return `peak['length']`, so it is in units of samples.
- `compute_wf_attributes()` returns, for a given set of quantiles (fractions), the amount of time elapsed for a given fraction of the total waveform area to be observed.

So in other words, `compute_index_of_fraction()` is in units of samples and `compute_wf_attributes()` is in units of ns.
**Author:** Also, `compute_wf_attributes()` can handle an array of peaks, whereas `compute_index_of_fraction()` can only do peak by peak. Do you think it would be better to merge them into one function?
**Reviewer:** Looking at these codes, `compute_index_of_fraction` can also return floats. I think it is better to utilize `compute_index_of_fraction` or modify it: `strax/strax/processing/peak_properties.py`, line 48 in 9b508f7.
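If `compute_index_of_fraction` were reused as suggested, the ns-valued quantiles could be recovered from its (possibly fractional) sample indices by multiplying with `dt` and differencing. A hypothetical sketch of that unit conversion only; the `fraction_indices` values below are made up for illustration, not output of the real function:

```python
import numpy as np

# Hypothetical fractional sample indices at which 20%, 40%, ... of the
# peak area has been reached (the kind of float output the thread says
# compute_index_of_fraction can produce); values are made up.
fraction_indices = np.array([1.5, 3.0, 4.25, 6.0, 8.5])
dt = 10  # ns per sample

# Convert from units of samples to units of ns, then difference to get
# the time spanned by each area quantile.
times = fraction_indices * dt
quantile_widths = np.diff(np.concatenate(([0.0], times)))
print(quantile_widths)  # ns per area quantile
```

This would give the same ns-per-quantile quantity that `compute_wf_attributes` computes, while reusing the existing per-peak helper.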