Class names in identifiers in rendered output for aggregated frames corresponding to method calls can be misleading #340

matt-graham · 2024-10-02T15:28:35Z

We hit an issue when using pyinstrument to regularly profile a large epidemiological simulation code that the rendered profile output seemed to be grouping method calls incorrectly:

UCL/TLOmodel-profiling/issues/19

Specifically we were finding calls to functions (/ methods) being grouped under class methods which we thought shouldn't ever be able to call these functions.

I've looked in to this a bit and think I've identified what is going on. The key issue seems to be that the class name used in the identifier for a (aggregated) frame corresponding to a method call will be written out as the name of particular subclass (specifically the mostly commonly observed subclass in the aggregated captured frames) even if the method being called is defined on a base class.

As a minimal reproducer, consider the following script corresponding to a very simplified version of our simulation call stack:

import time


def simulate():
    events = [EventA(), EventB(), EventA()] * 10
    for event in events:
        event.run()


class Event:
    
    def run(self):
        self.apply()


class EventA(Event):
    
    def apply(self):
        time.sleep(0.1)


class EventB(Event):
    
    def apply(self):
        time.sleep(0.1)
    

if __name__ == "__main__":
    simulate()

Running pyinstrument on this script (saved as pyinstrument_test.py) gives the following output using the console renderer:

  _     ._   __/__   _ _  _  _ _/_   Recorded: 16:12:19  Samples:  31
 /_//_/// /_\ / //_// / //_'/ //     Duration: 3.015     CPU time: 0.007
/   _/                      v4.7.3

Program: pyinstrument_test.py

3.013 <module>  pyinstrument_test.py:1
└─ 3.013 simulate  pyinstrument_test.py:4
   └─ 3.013 EventA.run  pyinstrument_test.py:12
      ├─ 2.009 EventA.apply  pyinstrument_test.py:18
      │  └─ 2.009 sleep  <built-in>
      └─ 1.005 EventB.apply  pyinstrument_test.py:24
         └─ 1.005 sleep  <built-in>

Here we see that underneath the top-level simulate call, all event apply method calls are grouped under a frame with identifier EventA.run. This corresponds in reality to the run method defined on the base Event class, which we can see from the correct line number reference in to the script. In this case it is fairly easy to spot that there is no run method defined directly in EventA and so that this in reality corresponds to Event, but in our real use case the call stack is a lot more complex and spotting when method calls actually correspond to calls to a base class method is more difficult.

My understanding of how this arises, is that as the concrete classes from which the run method is called are EventA and EventB, when the corresponding frames are subsequently aggregated pyinstrument resolves the ambiguity of which class name to use when rendering by outputting the most commonly observed class name based on the logic in:

pyinstrument/pyinstrument/frame.py

Lines 308 to 309 in 4b37f8c

    
           def class_name(self) -> str | None: 
        
               return self.get_attribute_value(ATTRIBUTE_MARKER_CLASS_NAME)

pyinstrument/pyinstrument/frame.py

Lines 284 to 305 in 4b37f8c

    
               def get_attribute_value(self, attribute_marker: str) -> str | None: 
        
                   """ 
        
                   Returns the value of the attribute. If multiple values are present, 
        
                   the most commonly observed one is returned. 
        
                   """ 
        
                   # Attributes are recorded as a dict, with the key representing an 
        
                   # observation, and the value representing the duration that it was 
        
                   # observed. the first character of the observation is the 'marker' - 
        
                   # the type of the attribute, the rest is data. 
        
                   matching_attributes = [ 
        
                       a_tuple 
        
                       for a_tuple in self.attributes.items() 
        
                       if a_tuple[0].startswith(attribute_marker) 
        
                   ] 
        
                   if len(matching_attributes) == 0: 
        
                       return None 
        
                   top_attribute, _ = max(matching_attributes, key=lambda a: a[1]) 
        
                   # strip off the marker, return the data 
        
                   return top_attribute[1:]

Ideally instead of the above output, for our purposes it would be useful to instead get

  _     ._   __/__   _ _  _  _ _/_   Recorded: 16:12:19  Samples:  31
 /_//_/// /_\ / //_// / //_'/ //     Duration: 3.015     CPU time: 0.007
/   _/                      v4.7.3

Program: pyinstrument_test.py

3.013 <module>  pyinstrument_test.py:1
└─ 3.013 simulate  pyinstrument_test.py:4
   └─ 3.013 Event.run  pyinstrument_test.py:12
      ├─ 2.009 EventA.apply  pyinstrument_test.py:18
      │  └─ 2.009 sleep  <built-in>
      └─ 1.005 EventB.apply  pyinstrument_test.py:24
         └─ 1.005 sleep  <built-in>

that is the class name qualifying the run method identifier being specified as the base class Event it is defined in.

I am not clear however how feasible it would be to output the class name where methods are defined like this, and whether the current behaviour is intentional?

@joerick if you do think it would feasible and desirable to change the behaviour of pyinstrument to instead produce something more like the latter output, I'd be happy to try to work on a PR to try to implement this!

The text was updated successfully, but these errors were encountered:

joerick · 2024-10-02T20:46:05Z

Ooh yes this is an excellent point. Yes I totally agree that the current approach could be improved. The current method looks for a parameter in the function called self or cls and logs the type of that.

This is implemented in C, but you can see a Python version here, which is roughly similar:

pyinstrument/pyinstrument/low_level/stat_profile_python.py

Lines 123 to 132 in 17ce4c4

    
           class_name = None 
        
           # try to find self argument for methods 
        
           self = frame.f_locals.get("self", None) 
        
           if self and hasattr(self, "__class__") and hasattr(self.__class__, "__qualname__"): 
        
               class_name = self.__class__.__qualname__ 
        
           else: 
        
               # also try to find cls argument for class methods 
        
               cls = frame.f_locals.get("cls", None) 
        
               if cls and hasattr(cls, "__qualname__"): 
        
                   class_name = cls.__qualname__

One limitation is that it does need to be fast, because it will be called 1000s of times per second, potentially.

Yes if you have an opportunity to look into other approaches, I'd be very interested to see what you can find out!

matt-graham · 2024-10-03T13:48:35Z

For Python 3.11 onwards, I think the codeobject.co_qualname attribute gives what is needed directly without needing to look for self and cls in frame.f_locals. I think the following reimplementation of the snippet in your comment from the Python verstion of get_frame_info would then work on both Python 3.11 and above (where frame.f_code.co_qualname should be available) and earlier versions (where we can extract the qualified name from the __qualname__ attribute of the method itself (with __qualname__ being available for Python 3.3 and above):

    code = frame.f_code
    class_name = None
    if hasattr(code, "co_qualname") and code.co_qualname != code.co_name:
        class_name = class_name_from_qualified_name(code.co_qualname)
    else:
        self_or_cls = frame.f_locals.get("self", frame.f_locals.get("cls", None))
        if self_or_cls and hasattr(self_or_cls, code.co_name):
            method = getattr(self_or_cls, code.co_name)
            if hasattr(method, "__qualname__"):
                class_name = class_name_from_qualified_name(method.__qualname__)

Here class_name_from_qualified_name would be defined something like

def class_name_from_qualified_name(qualified_name: str) -> str:
    return ".".join(qualified_name.split(".")[:-1])

which should deal with also giving the 'qualified' class name for nested class definitions.

It looks like co_qualname is also available as a field in PyCodeObject in C from Python 3.11 onwards

https://github.com/python/cpython/blob/656b7a3c83c79f99beac950b59c47575562ea729/Objects/codeobject.c#L466

and as _get_class_name_of_frame is already defined differently for Python 3.11+ in

pyinstrument/pyinstrument/low_level/stat_profile.c

Lines 278 to 280 in 17ce4c4

    
           #if PY_VERSION_HEX >= 0x030b0000 // Python 3.11.0 
        
           static const char * 
        
           _get_class_name_of_frame(PyFrameObject *frame, PyCodeObject *code) {

I think we could implement something analogous in C, with Python 3.11+ implementation using co_qualname field of PyCodeObject (from reading the comments this possibly also has the side benefit of avoiding the expense of calling PyFrame_GetLocals). The alternative implementation for earlier Python versions could I think do something comparable to Python implementation above using PyObject_HasAttr and PyObject_GetAttr after getting the self or cls PyObject as in the current implementation, but I'm not sure if there is a big performance penalty to using those C API calls?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Class names in identifiers in rendered output for aggregated frames corresponding to method calls can be misleading #340

Class names in identifiers in rendered output for aggregated frames corresponding to method calls can be misleading #340

matt-graham commented Oct 2, 2024

joerick commented Oct 2, 2024 •

edited

Loading

matt-graham commented Oct 3, 2024 •

edited

Loading

Class names in identifiers in rendered output for aggregated frames corresponding to method calls can be misleading #340

Class names in identifiers in rendered output for aggregated frames corresponding to method calls can be misleading #340

Comments

matt-graham commented Oct 2, 2024

joerick commented Oct 2, 2024 • edited Loading

matt-graham commented Oct 3, 2024 • edited Loading

joerick commented Oct 2, 2024 •

edited

Loading

matt-graham commented Oct 3, 2024 •

edited

Loading