Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

O2sim: fix kine published only sending first <aggregate> events #12141

Closed
wants to merge 2 commits into from

Conversation

aalkin
Copy link
Member

@aalkin aalkin commented Oct 25, 2023

Proper fix for the issue from #12140

@aalkin aalkin requested a review from a team as a code owner October 25, 2023 08:13
@aalkin aalkin changed the title O2sim: fix kine published only sending first <aggregate> event O2sim: fix kine published only sending first <aggregate> events Oct 25, 2023
@DanielSamitz
Copy link
Contributor

Hi @aalkin , thank you. This definitely fixes the issue that always the same events were sent.

I tested this and I noticed that in my analysis task that picks up the events sent by the kine_publisher, the run function gets called exactly one time, with all nEvents events in the McHeader and McTracks input, independent of what I set for --aggregate-timeframe in the kine_publisher. Because the while loop here is never left until all events are processed. Is this the expected behavior?

I thought (but maybe I misunderstood) that the run function in my analysis task should be called multiple times, each time with exactly aggregate number of events being processed. That's what I saw with my version in #12140 .
Which one is the expected behavior that this aggregate-timeframe option in the kine_publisher should produce?

@aalkin
Copy link
Member Author

aalkin commented Oct 25, 2023

The issue with both kine- and hepmc-publisher is that they are both DPL sources by themselves, there is no "timing" input like in other cases - in analysis and with mctracks-proxy. Therefore their core functions are called exactly once. But the fact that all events are in one message is definitely not expected. At least this is not what I see with hepmc-publisher that works in the same way. I will investigate.

I am planning to make them work similar to the mctracks-proxy eventually, by ensuring they have a "timing" input that we can use to designate timeframes and thus simplify their core functions. Probably I should speed this up.

@DanielSamitz
Copy link
Contributor

Ok, it would be great to have a working version (not perfect, just working for the moment) online before the O2 Tutorial preparation day, because we would like to inlcude something that relies on the kine_publisher in the hands-on.

With #12140 everything seems to work so far.
With this version I crash with the error:

[17011:o2sim-kine-publisher]: [14:21:12][ERROR] Exception caught: shmem: could not create a message of size 482820, alignment: 64, free memory: 26048

when I run on larger files.

@aalkin
Copy link
Member Author

aalkin commented Oct 25, 2023

#12140 works because it only returns first aggregate events and exits the for loop, #12141 fails because it tries to get all the events at once, these are both wrong. I'll re-implement this with DPL timer input then. I'll need a couple of days for that, should be ready not later than mid-next-week. You can proceed with #12140 for now, bearing in mind that it only publishes a fraction of events. Proper implementation will not break the workflows.

@aalkin aalkin closed this Oct 25, 2023
@DanielSamitz
Copy link
Contributor

OK, thank you very much for taking care of this.

You can proceed with #12140 for now, bearing in mind that it only publishes a fraction of events.

Btw, I tested #12140 and I do get ALL events, not only a fraction (that was exactly the point that I wanted to fix with it. Before, it was also sending the correct number of nEvents, but always a repetition of the same first aggregate events again and again.).

@aalkin
Copy link
Member Author

aalkin commented Oct 25, 2023

Interesting, then how come there is no out-of-memory issue?

@DanielSamitz
Copy link
Contributor

I don't know. I don't understand these things well enough, I can only say what I see. And it seems that the events do get shipped in packages of aggregate events at a time (i.e. what I see is that the run function in my analysis task gets called nEvents/aggregate times, each time processing aggregate events, until all events are done.)

@aalkin aalkin deleted the kine-publisher-fix branch January 16, 2024 09:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants