Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explore Nested trace trees #695

Closed
Fidget-Spinner opened this issue Sep 11, 2024 · 2 comments
Closed

Explore Nested trace trees #695

Fidget-Spinner opened this issue Sep 11, 2024 · 2 comments
Assignees

Comments

@Fidget-Spinner
Copy link
Collaborator

Sub-issue to #621

Basically when we start tracing, if we bump into another trace tree, we nest that trace tree into our current trace by treating it as a _EXIT_TRACE, instead of tracing through it.

This idea is not new. One of the original authors of production tracing JITs (Andreas Gal) pointed this out in their blog:

https://andreasgal.com/2008/08/22/tracing-the-web/ (See section on "Trace Trees and Nested Trace Trees")

Consider the following loop:

def foo():
    for _ in range(32):
        for _ in range(64):
            pass

Right now this happens:

  1. At inner loop n = 16, Trace is created for inner loop of 64.
  2. At outer loop n = 16, Trace is created for outer loop of 32 that projects into loop of 64.

This isn't ideal because now the outer loop creates a trace that never stays in tier 2 for long. It always exits at the end of each iteration.
Instead what we could do is nest the trace trees.

For inspection, this is the trace output I get:

# Inner loop
   0 OPTIMIZED: _START_EXECUTOR (0, target=29, operand=0x7b342ce59490)
   1 OPTIMIZED: _SET_IP (0, target=29, operand=0x7b342d963582)
   2 OPTIMIZED: _CHECK_PERIODIC (0, jump_target=0, operand=0, error_target=9)
   3 OPTIMIZED: _CHECK_VALIDITY (0, jump_target=10, operand=0x7b342d96357c)
   4 OPTIMIZED: _ITER_CHECK_RANGE (3, jump_target=11, operand=0)
   5 OPTIMIZED: _GUARD_NOT_EXHAUSTED_RANGE (3, jump_target=12, operand=0)
   6 OPTIMIZED: _ITER_NEXT_RANGE (3, jump_target=0, operand=0, error_target=13)
   7 OPTIMIZED: _STORE_FAST_0 (0, target=28, operand=0)
   8 OPTIMIZED: _JUMP_TO_TOP (0, jump_target=1, operand=0)
   9 OPTIMIZED: _ERROR_POP_N (0, target=0, operand=0x1d)
  10 OPTIMIZED: _DEOPT (0, target=26, operand=0)
  11 OPTIMIZED: _EXIT_TRACE (0, target=26, operand=0x7b342ce59510)
  12 OPTIMIZED: _EXIT_TRACE (0, target=32, operand=0x7b342ce59520)
  13 OPTIMIZED: _ERROR_POP_N (1, target=0, operand=0x1a)
# Outer loop
   0 OPTIMIZED: _START_EXECUTOR (0, target=33, operand=0x5a83ffcfced0)
   1 OPTIMIZED: _SET_IP (0, target=33, operand=0x7b342d96358a)
   2 OPTIMIZED: _CHECK_PERIODIC (0, jump_target=0, operand=0, error_target=22)
   3 OPTIMIZED: _CHECK_VALIDITY (0, jump_target=23, operand=0x7b342d963560)
   4 OPTIMIZED: _ITER_CHECK_RANGE (21, jump_target=24, operand=0)
   5 OPTIMIZED: _GUARD_NOT_EXHAUSTED_RANGE (21, jump_target=25, operand=0)
   6 OPTIMIZED: _ITER_NEXT_RANGE (21, jump_target=0, operand=0, error_target=26)
   7 OPTIMIZED: _STORE_FAST_0 (0, target=14, operand=0)
   8 OPTIMIZED: _CHECK_FUNCTION (1, jump_target=27, operand=0x2cc)
   9 OPTIMIZED: _LOAD_CONST_INLINE_BORROW_WITH_NULL (1, target=15, operand=0x5a83ff67b800)
  10 OPTIMIZED: _LOAD_CONST_INLINE_BORROW (0, target=20, operand=0x5a83ff6bc7f0)
  11 OPTIMIZED: _SET_IP (0, target=21, operand=0x7b342d963572)
  12 OPTIMIZED: _CALL_BUILTIN_CLASS (1, jump_target=28, operand=0, error_target=29)
  13 OPTIMIZED: _CHECK_PERIODIC (1, jump_target=0, operand=0, error_target=30)
  14 OPTIMIZED: _CHECK_VALIDITY_AND_SET_IP (0, jump_target=31, operand=0x7b342d96357a)
  15 OPTIMIZED: _GET_ITER (0, jump_target=0, operand=0, error_target=32)
  16 OPTIMIZED: _CHECK_VALIDITY (0, jump_target=33, operand=0x7b342d96357c)
  17 OPTIMIZED: _ITER_CHECK_RANGE (3, jump_target=34, operand=0)
  18 OPTIMIZED: _GUARD_NOT_EXHAUSTED_RANGE (3, jump_target=35, operand=0)
  19 OPTIMIZED: _ITER_NEXT_RANGE (3, jump_target=0, operand=0, error_target=36)
  20 OPTIMIZED: _STORE_FAST_0 (0, target=28, operand=0)
  21 OPTIMIZED: _EXIT_TRACE (0, target=29, operand=0x5a83ffcfcf50)
  22 OPTIMIZED: _ERROR_POP_N (0, target=0, operand=0x21)
  23 OPTIMIZED: _DEOPT (0, target=12, operand=0)
  24 OPTIMIZED: _EXIT_TRACE (0, target=12, operand=0x5a83ffcfcf60)
  25 OPTIMIZED: _EXIT_TRACE (0, target=36, operand=0x5a83ffcfcf70)
  26 OPTIMIZED: _ERROR_POP_N (1, target=0, operand=0xc)
  27 OPTIMIZED: _DEOPT (0, target=15, operand=0)
  28 OPTIMIZED: _DEOPT (0, target=21, operand=0)
  29 OPTIMIZED: _ERROR_POP_N (3, target=0, operand=0x15)
  30 OPTIMIZED: _ERROR_POP_N (0, target=0, operand=0x15)
  31 OPTIMIZED: _DEOPT (0, target=25, operand=0)
  32 OPTIMIZED: _ERROR_POP_N (1, target=0, operand=0x19)
  33 OPTIMIZED: _DEOPT (0, target=26, operand=0)
  34 OPTIMIZED: _EXIT_TRACE (0, target=26, operand=0x5a83ffcfcf80)
  35 OPTIMIZED: _EXIT_TRACE (0, target=32, operand=0x5a83ffcfcf90)
  36 OPTIMIZED: _ERROR_POP_N (1, target=0, operand=0x1a)

@Fidget-Spinner
Copy link
Collaborator Author

Note: we already sort of do this, but only when we transiently see an executor. So it doesn't always happen.

https://github.com/python/cpython/blob/main/Python/optimizer.c#L596

@Fidget-Spinner Fidget-Spinner self-assigned this Sep 11, 2024
@Fidget-Spinner
Copy link
Collaborator Author

Fidget-Spinner commented Sep 11, 2024

Brandt pointed out that we already did this. 🤦

@Fidget-Spinner Fidget-Spinner closed this as not planned Won't fix, can't repro, duplicate, stale Sep 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant