Pyinstrument grouping is incorrect #19

tamuri · 2024-05-27T10:52:06Z

Not sure whether this can be fixed or corrected at all - probably not - but I noticed that the indentation of samples in the pyinstrument output is incorrect. Regular events are being placed as if they are being run by the health system scheduler. Any ideas?

willGraham01 · 2024-06-03T08:58:50Z

This might be to do with the cross-class function calls in the codebase, and how the function which sets off the population level events is called?

HealthSystemScheduler.apply calls HealthSystemScheduler.process_events_mode_2.
HealthSystemScheduler.process_events_mode_2 calls HealthSystem.run_population_level_events. This is another case where we have a circular reference between the scheduler and and HealthSystem instance.
HealthSystem.run_population_level_events then calls run on each of the population level events due on the given day.

So the population level events are being run by HealthSystem.run_population_level_events, but via a method which is only ever called via HealthSystemScheduler.module (which points to the HealthSystem) within HealthSystemScheduler.process_events_mode_2.

So I guess PyInstrumment is technically right that the pop-level events are run within the scheduler class, though I'd now expect the population-level events to be nested within HealthSystemScheduler.process_events_mode_2 if I'm following the call stack, rather than one level higher 😅

tamuri · 2024-06-04T00:02:42Z

I don't think that's right. For example, the event HealthSeekingBehaviourPoll is not an HSI event, and is not run by the HealthSystem. It is processed in the main Simulation event loop, so I'd expect it to be under Simulation.fire_single_event.

tamuri · 2024-06-04T06:27:25Z

Compare with cprofile dump:

matt-graham · 2024-10-02T14:53:35Z

I've looked in to this a bit and think I've identified what is going on. The key issue seems to be that the class name used in the identifier for a frame corresponding to a method call will be written out as a particular subclass (specifically the mostly commonly observed subclass in the captured frames) even if the method being called is defined on a base class.

As a simple minimal reproducer, consider the following script corresponding to a very simplified version of the TLOmodel simulation call stack:

import time


def simulate():
    time.sleep(0.1)
    events = [EventA(), EventB(), EventA()] * 10
    for event in events:
        event.run()


class Event:
    
    def run(self):
        self.apply()


class EventA(Event):
    
    def apply(self):
        time.sleep(0.1)


class EventB(Event):
    
    def apply(self):
        time.sleep(0.1)
    

if __name__ == "__main__":
    simulate()

Running pyinstrument on this script (saved as pyinstrument_test.py) gives the following output using the console renderer:

  _     ._   __/__   _ _  _  _ _/_   Recorded: 15:46:49  Samples:  31
 /_//_/// /_\ / //_// / //_'/ //     Duration: 3.110     CPU time: 0.005
/   _/                      v4.7.3

Program: pyinstrument_test.py

3.110 <module>  pyinstrument_test.py:1
└─ 3.110 simulate  pyinstrument_test.py:4
   ├─ 3.009 EventA.run  pyinstrument_test.py:13
   │  ├─ 2.006 EventA.apply  pyinstrument_test.py:19
   │  │  └─ 2.006 sleep  <built-in>
   │  └─ 1.003 EventB.apply  pyinstrument_test.py:25
   │     └─ 1.003 sleep  <built-in>
   └─ 0.101 sleep  <built-in>

Here we see that underneath the top-level simulate call, all event apply method calls are grouped under a frame with identifier EventA.run. This corresponds in reality to the run method defined on the base Event class, which we can see from the correct line number reference in to the script. However as the concrete classes from which the run method is called are EventA and EventB, when recording these frames pyinstrument appears to record both these as the class associated with the method call and then resolve the ambiguity when rendering the output, just outputs the mosts commonly observed class name:

https://github.com/joerick/pyinstrument/blob/4b37f8cdc531be41a7f7e57932f0b770244025d5/pyinstrument/frame.py#L308-L309

https://github.com/joerick/pyinstrument/blob/4b37f8cdc531be41a7f7e57932f0b770244025d5/pyinstrument/frame.py#L284-L305

This seems like a bug to me or at least is a bit misleading so I'll raise an issue on pyinstrument repository and see if we can get fixed.

Importantly it does seem however the overall call tree recorded is correct and if we just look at the file / line number references in the existing output we can correctly identify whether methods are actually defined.

matt-graham · 2024-10-02T14:53:57Z

Accidentally hit close while writing comment 🤦

tamuri assigned willGraham01 May 27, 2024

matt-graham closed this as completed Oct 2, 2024

matt-graham reopened this Oct 2, 2024

matt-graham mentioned this issue Oct 2, 2024

Class names in identifiers in rendered output for aggregated frames corresponding to method calls can be misleading joerick/pyinstrument#340

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pyinstrument grouping is incorrect #19

Pyinstrument grouping is incorrect #19

tamuri commented May 27, 2024

willGraham01 commented Jun 3, 2024

tamuri commented Jun 4, 2024

tamuri commented Jun 4, 2024

matt-graham commented Oct 2, 2024 •

edited

Loading

matt-graham commented Oct 2, 2024 •

edited

Loading

Pyinstrument grouping is incorrect #19

Pyinstrument grouping is incorrect #19

Comments

tamuri commented May 27, 2024

willGraham01 commented Jun 3, 2024

tamuri commented Jun 4, 2024

tamuri commented Jun 4, 2024

matt-graham commented Oct 2, 2024 • edited Loading

matt-graham commented Oct 2, 2024 • edited Loading

matt-graham commented Oct 2, 2024 •

edited

Loading

matt-graham commented Oct 2, 2024 •

edited

Loading