Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pyinstrument grouping is incorrect #19

Open
tamuri opened this issue May 27, 2024 · 5 comments
Open

Pyinstrument grouping is incorrect #19

tamuri opened this issue May 27, 2024 · 5 comments
Assignees

Comments

@tamuri
Copy link
Collaborator

tamuri commented May 27, 2024

Not sure whether this can be fixed or corrected at all - probably not - but I noticed that the indentation of samples in the pyinstrument output is incorrect. Regular events are being placed as if they are being run by the health system scheduler. Any ideas?

Screenshot 2024-05-27 at 11 48 51
@willGraham01
Copy link
Collaborator

This might be to do with the cross-class function calls in the codebase, and how the function which sets off the population level events is called?

  • HealthSystemScheduler.apply calls HealthSystemScheduler.process_events_mode_2.
  • HealthSystemScheduler.process_events_mode_2 calls HealthSystem.run_population_level_events. This is another case where we have a circular reference between the scheduler and and HealthSystem instance.
  • HealthSystem.run_population_level_events then calls run on each of the population level events due on the given day.

So the population level events are being run by HealthSystem.run_population_level_events, but via a method which is only ever called via HealthSystemScheduler.module (which points to the HealthSystem) within HealthSystemScheduler.process_events_mode_2.

So I guess PyInstrumment is technically right that the pop-level events are run within the scheduler class, though I'd now expect the population-level events to be nested within HealthSystemScheduler.process_events_mode_2 if I'm following the call stack, rather than one level higher 😅

@tamuri
Copy link
Collaborator Author

tamuri commented Jun 4, 2024

I don't think that's right. For example, the event HealthSeekingBehaviourPoll is not an HSI event, and is not run by the HealthSystem. It is processed in the main Simulation event loop, so I'd expect it to be under Simulation.fire_single_event.

@tamuri
Copy link
Collaborator Author

tamuri commented Jun 4, 2024

Compare with cprofile dump:

Screenshot 2024-06-04 at 07 25 18

@matt-graham
Copy link
Collaborator

matt-graham commented Oct 2, 2024

I've looked in to this a bit and think I've identified what is going on. The key issue seems to be that the class name used in the identifier for a frame corresponding to a method call will be written out as a particular subclass (specifically the mostly commonly observed subclass in the captured frames) even if the method being called is defined on a base class.

As a simple minimal reproducer, consider the following script corresponding to a very simplified version of the TLOmodel simulation call stack:

import time


def simulate():
    time.sleep(0.1)
    events = [EventA(), EventB(), EventA()] * 10
    for event in events:
        event.run()


class Event:
    
    def run(self):
        self.apply()


class EventA(Event):
    
    def apply(self):
        time.sleep(0.1)


class EventB(Event):
    
    def apply(self):
        time.sleep(0.1)
    

if __name__ == "__main__":
    simulate()

Running pyinstrument on this script (saved as pyinstrument_test.py) gives the following output using the console renderer:

  _     ._   __/__   _ _  _  _ _/_   Recorded: 15:46:49  Samples:  31
 /_//_/// /_\ / //_// / //_'/ //     Duration: 3.110     CPU time: 0.005
/   _/                      v4.7.3

Program: pyinstrument_test.py

3.110 <module>  pyinstrument_test.py:1
└─ 3.110 simulate  pyinstrument_test.py:4
   ├─ 3.009 EventA.run  pyinstrument_test.py:13
   │  ├─ 2.006 EventA.apply  pyinstrument_test.py:19
   │  │  └─ 2.006 sleep  <built-in>
   │  └─ 1.003 EventB.apply  pyinstrument_test.py:25
   │     └─ 1.003 sleep  <built-in>
   └─ 0.101 sleep  <built-in>

Here we see that underneath the top-level simulate call, all event apply method calls are grouped under a frame with identifier EventA.run. This corresponds in reality to the run method defined on the base Event class, which we can see from the correct line number reference in to the script. However as the concrete classes from which the run method is called are EventA and EventB, when recording these frames pyinstrument appears to record both these as the class associated with the method call and then resolve the ambiguity when rendering the output, just outputs the mosts commonly observed class name:

https://github.com/joerick/pyinstrument/blob/4b37f8cdc531be41a7f7e57932f0b770244025d5/pyinstrument/frame.py#L308-L309

https://github.com/joerick/pyinstrument/blob/4b37f8cdc531be41a7f7e57932f0b770244025d5/pyinstrument/frame.py#L284-L305

This seems like a bug to me or at least is a bit misleading so I'll raise an issue on pyinstrument repository and see if we can get fixed.

Importantly it does seem however the overall call tree recorded is correct and if we just look at the file / line number references in the existing output we can correctly identify whether methods are actually defined.

@matt-graham
Copy link
Collaborator

matt-graham commented Oct 2, 2024

Accidentally hit close while writing comment 🤦

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants