Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function version numbers do not obey necessary invariants #117086

Open
markshannon opened this issue Mar 20, 2024 · 5 comments
Open

Function version numbers do not obey necessary invariants #117086

markshannon opened this issue Mar 20, 2024 · 5 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@markshannon
Copy link
Member

markshannon commented Mar 20, 2024

Bug report

Bug description:

It is a necessary invariant of version numbers that an equal version number means that the objects are equivalent. That is, all operations on the two objects will give the same result.

This works for class and dict version numbers as no version number is shared between different objects and any change to the object results in a new version number.

However, we share version numbers for functions and that causes problems. #117051
We have good reasons for sharing version numbers. It allows much better optimization of closures. #98525

Want we want is a scheme that allows us to share version numbers, but preserves the invariant that the same function number means a function that is functionally identical.

Assuming that the function version gets modified whenever the function version is modified, if we want to share versions we only need to ensure the invariant when creating a function.
We do not need to ensure that all functions that are functionally identical share a version, just enough to optimize most common cases.

A function consists of the following fields that we care about when optimizing:

  • Code object
  • Globals
  • Builtins
  • Defaults (only the number, not the actual values)

It is MAKE_FUNCTION that is responsible for setting the version number., which we get only from the code object, meaning that if the globals, builtins or defaults differ from any other function with the same code object, the version number is invalid.

CPython versions tested on:

3.13, CPython main branch

Operating systems tested on:

No response

Linked PRs

@markshannon markshannon added the type-bug An unexpected behavior, bug, or error label Mar 20, 2024
@markshannon
Copy link
Member Author

When allocating a shared version number we need to check that not only is the code object the same, but that the globals, builtins and (number of) defaults is the same. Otherwise, the version number could be invalid.

We can ensure that everything is correct as follows:

  • Number of defaults: Store the expected number of defaults on the code object, and only issue a version if it matches.
  • Builtins: Only issue a version if the builtins is the interpreter's builtins.
  • Globals: Store the expected version number of the globals on the code object, and only issue a version if it matches.

However, we cannot know the version number of the globals when the code object is created, so we will need to initialize it lazily.

Here's a scheme for doing that:

  • On creation all code objects get a version number of zero.
  • When creating a function:
    • If the code object has a version, check globals, etc and issue matching version if everything matches
    • If code object has no version, assign it a version and initialize the global dict version and expected number of defaults from the function, then assign the same version to the function.

@markshannon markshannon self-assigned this Mar 20, 2024
@gvanrossum gvanrossum added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Mar 20, 2024
@gvanrossum
Copy link
Member

I'm not sure that we need to worry about the number of defaults. Those can only change by explicit assignment to func.__defaults__, and we can nuke the function object's version at that point (see #117028). There is no way to trick the interpreter into creating a function with a different number of default from the same code object, unlike globals (and, by implication, builtins).

@markshannon
Copy link
Member Author

The defaults can be changed via code.replace changing either the tuple of defaults or code object passed to MAKE_FUNCTION, but not the other. I think @brandtbucher had an example.

@gvanrossum
Copy link
Member

Brandt's example modified the function object by assigning a tuple of different length to __defaults__. All he got was the failing assert(defcount <= code->co_argcount); though. Careful analysis shows that if you remove that default, nothing else goes wrong. My PR (gh-117028) however just forces the function version to zero after such an assignment, and that also works (the specializer won't specialize the call, and already-specialized calls will fail the function version check).

@markshannon
Copy link
Member Author

The example I was thinking of is this which is just an assert.
We might want to optimize based on the defaults in future, so we might as well check that as well.

If we want to optimize on the exact defaults, not just the number, we will need to add a guard in code.replace().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

2 participants