Problem with awkward array masking with exclusive_jets_constituent_index() #238
I made the wrong assumption. You showed that the error definitely happens in |
Hi @jpivarski, OK! So .... what is best for us to do here? Would you like us to open an issue in Awkward, or should we leave that to you? Since the working example we have relies on Fastjet, it's not entirely clear to us how to provide useful feedback to Awkward on this. Thoughts welcome!! |
The crossed-out part was assuming that it's a bug in Awkward (in It's implemented here: fastjet/src/fastjet/_multievent.py Lines 298 to 307 in 0ede4d3
You've verified that |
Hi @jpivarski, the error is thrown on |
So I followed Lines 365 to 436 in 0ede4d3
which is unpacking the NumPy array inputs, constructing FastJet objects, running FastJet algorithms, and then packing the results into NumPy array outputs. Since the issue is that an index is off, the problem is probably not in the translation between NumPy arrays and FastJet objects, but in the handling of the FastJet objects. Maybe an off-by-one error somewhere? |
Hi, after looking into this a bit more I've realized that this is a problem with more than just
Specifically I noticed that
Instead, Also, Please let me know if any of this is unclear and thank you for all the help with this! |
You are absolutely right. The way we handle masked arrays is wrong. Running

    px, py, pz, E, offsets = self.extract_cons(self.data)
    print(offsets)

gives

    array([ 0, 2, 4, 8, 10])

The correct layout is

    >>> eg[mask].layout
    <ListArray len='4'>
        <starts><Index dtype='int64' len='4'>
            [0 2 6 8]
        </Index></starts>
        <stops><Index dtype='int64' len='4'>
            [ 2 4 8 10]
        </Index></stops>
    ...

This explains why you see all these extra elements between offsets 4 and 8. |
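
To make the mismatch concrete, here is a minimal sketch in plain Python lists (no Awkward or FastJet involved) using the starts and stops quoted above. The helper names `naive_offsets` and `pack` and the ten-element `content` buffer are hypothetical, and prepending `starts[0]` to `stops` is only a guess at how offsets like the printed ones could arise, not a reading of the fastjet source. The point is that starts and stops can only be read as offsets when each list begins where the previous one ends; otherwise the content has to be repacked first.

```python
def naive_offsets(starts, stops):
    # Hypothetical shortcut: pretend the lists are contiguous and reuse
    # the stops as offsets. Only valid if starts[i + 1] == stops[i].
    return [starts[0]] + list(stops)


def pack(content, starts, stops):
    # Copy only the selected ranges into a new contiguous buffer and
    # build offsets that describe that new buffer.
    packed, offsets = [], [0]
    for start, stop in zip(starts, stops):
        packed.extend(content[start:stop])
        offsets.append(len(packed))
    return packed, offsets


content = list(range(10))   # stand-in for the flat particle buffer
starts = [0, 2, 6, 8]       # from the ListArray layout quoted above
stops = [2, 4, 8, 10]

bad = naive_offsets(starts, stops)           # [0, 2, 4, 8, 10]
packed, good = pack(content, starts, stops)  # offsets [0, 2, 4, 6, 8]

third_bad = content[bad[2]:bad[3]]    # [4, 5, 6, 7]: includes masked-out 4, 5
third_good = packed[good[2]:good[3]]  # [6, 7]: only the surviving elements
```

With the naive offsets the third list picks up elements 4 and 5, which belong to the entry removed by the mask, so it ends up with four elements instead of two.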
It sounds like a good idea. (I haven't looked deeply into the details.) |
@chrispap95 / @jpivarski any movement on this? It's blocking progress for future collider analysis, so it would be nice to have it sorted out. |
Hello, I am working with @kpachal, @mswiatlo, and @lgray on the development of Coffea and some analysis that involves the use of fastjet. We've been very happy with how smoothly everything using awkward arrays integrates with fastjet.
However, we have noticed a problem when running exclusive_jets_constituent_index() with a masked awkward array. If an awkward array has a boolean mask applied to it on axis 0 before clustering, exclusive_jets_constituent_index() returns indices that are out of range. An error is also thrown when trying to call exclusive_jets_constituents() since the out of range indices are being applied to the array.
Here is an example. Running this
returns
which contains out of range indices at index 2 on axis 0. If we then run
it throws the error
due to trying to use the out of range indices.
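
A quick way to see which events are affected is to compare every returned index with the number of particles in its event. The following is a minimal sketch with plain Python lists, assuming the result is nested as events, then jets, then constituent indices into that event's particles. The function name `find_out_of_range` and the sample data are made up for illustration and are not the output from this issue.

```python
def find_out_of_range(constituent_index, n_particles):
    # Return the positions (axis 0) of events that contain at least one
    # constituent index outside the range [0, n) for that event.
    bad_events = []
    for event, (jets, n) in enumerate(zip(constituent_index, n_particles)):
        if any(i < 0 or i >= n for jet in jets for i in jet):
            bad_events.append(event)
    return bad_events


# Illustrative data: four events with two particles each; the third
# event carries indices 2 and 3, which do not exist in that event.
n_particles = [2, 2, 2, 2]
constituent_index = [
    [[0], [1]],
    [[0, 1]],
    [[0, 1], [2, 3]],
    [[0], [1]],
]

bad_events = find_out_of_range(constituent_index, n_particles)
```

With real Awkward arrays the same comparison could be done after converting the arrays to nested lists.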
This is not a problem if we just run the unmasked array
which gives
Thank you for the help on this issue, it will be greatly appreciated!