Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Grand Unified Python Object Layout #553

Open
markshannon opened this issue Feb 16, 2023 · 14 comments
Open

Grand Unified Python Object Layout #553

markshannon opened this issue Feb 16, 2023 · 14 comments

Comments

@markshannon
Copy link
Member

This is likely to need to end up as PEP, but I want to sketch out the idea here first.

General idea

All Python objects should share a common layout that makes the layout visible to the VM, and provides the performance needed by C extensions. In order to ensure that it is fast enough we should use it for internal classes, like tuples, lists and dicts. If it is fast enough for those classes, it will be good enough for NumPy and Cython. Persuading them to use a new API will be another issue, but we need to provide the capability first.

Requirements

Performance

Important classes ("important" must be user defined, or it won't get used) should be able to get at least the speed of manually defined C code. Because the VM will have visibility into the classes, we should be able to get better performance in some cases.

Composability

Most classes should be freely composable, including through multiple inheritance.

Flexibility

The design shouldn't put unnecessary constraints on either the VM implementation or the C extension code.

Conceptual interface

One layout per class

The layout of a Python object depends on its class. All instances of a class have the same layout, from the point of view of C extensions (the VM may use "object shapes" or some other optimization resulting in differing physical layout in a way that is not visible to C extensions.

C extensions define a C struct and requirements, not layout

When defining a C extension type, the C struct will be described to the VM in terms of size and alignment, and the details of accessible fields in it, much like PEP 697, but without the control over layout that PEP 697 gives. Layout is up to the VM.

Layouts aren't inherited

If class C inherits from class B, its layout is not inherited. In other words if class B defines a struct S, the offset of S within an object C() may differ from the offset of S within an object B().

Extension class properties

Each extension defined class can have the following properties, which are not inherited.

  • HasLength: The objects have length, e.g. list, tuple
  • VariableSized: The struct is variable sized, e.g. tuple (but not dict or list). Implies HasLength
  • Primary: The struct is "important", performance matters more than the ability to support multiple inheritance well.

No class can have more than one class in its MRO (including itself) which each of the above property.

The C-API.

To get the offset of a struct:

intptr_t PyObject_GetStructOffset(PyTypeObject *self_class, PyTypeObject *declaring_class);
will return the offset, in bytes, from the owning PyObject * to the requires struct.
The offset may be negative, a return value of -1 indicates an error.

PyObject_GetStructOffset is a pure function of the triple (sys.implementation, self_class, declaring_class) meaning that the offset may be cached globally during execution but should not be cached in source code or to disk.

The offset of the start of primary structs is an unstable API constant

const intptr_t PyUnstable_PRIMARY_STRUCT_OFFSET is a compile-time constant (which I expected to always be positive, but lets leave that open for now)

This allows "primary" classes to get performance on a par with the current implementation of, for example, PyTupleObject.

Declaring fields.

If we want to expose a C field to the VM, we need to declare its type, and offset.
In addition we want to declare various attributes, much as are already declared in PyMemberDef
One additional capability we would like to add, that PyMemberDef doesn't have, is to have fields that can be directly read, but have a function setter, which suggests the following struct:

struct PyAttributeDef {
    const char *name;
    const char *doc;
    uint32_t offset;
    int16_t flags;
    uint8_t type;
    getter get;
    setter set;
};

If get is NULL, it implies the field can be read directly, unless the WRITE_ONLY flag is set.
If set is NULL, it implies the field can be written directly, unless the READ_ONLY flag is set.

A possible implementation.

Note: this is not part of the spec. I include this only to demonstrate that this can be implemented. The actual implementation may well differ.

Objects are laid as follows:

  • 0 or more non-variable-sized, non-primary structs
  • The core object (class, refcount, __dict__/values, __weakrefs__, etc.)
  • The primary struct (if there is one)
  • The variable-sized struct (if there is one) (May be combined with the primary struct, if class is both primary and variable-sized.

The object pointer (PyObject *) points into the middle of the the core object, as it has done since the cycle GC was introduced.
The offset from the object pointer to the end of the core object will be a multiple of the maximum C alignment and will be PyUnstable_PRIMARY_STRUCT_OFFSET.

Example.

We stated above that if the "Grand Unified Python Object Layout" could not support classes like tuple or dict, it would not be reasonable to expect C extensions to use it.

We would define tuple with the following struct:

struct tuple_data {
    uintptr_t size;
    PyObject *items[1];
};

And declare that tuple is a primary, variable-sized type.

We can then define the accessor functions efficiently as:

static inline PyObject *PyTupleStruct_GetItem(struct tuple_data *tuple, intptr_t index) ...

PyObject *PyTuple_GetItem(PyObject *op, intptr_t index) 
{
    assert(PyTuple_Check(op));
    struct tuple_data *tuple = (struct tuple_data *)(((unsigned char *)op) + PyUnstable_PRIMARY_STRUCT_OFFSET);
    return PyTupleStruct_GetItem(tuple, index);
}

The length property could be defined as follows (we wouldn't do this in practice as ().length isn't a thing)

struct PyAttributeDef tuple_length = {
    .name = "length",
    .doc = "the length of the tuple",
    .offset = offsetof(struct tuple_data, size),
    .flags = READ_ONLY,
    .type = Py_T_UINTPTR_T
    .get = NULL,
    .set = NULL
};
@encukou
Copy link

encukou commented Feb 23, 2023

Sounds promising! It'd be a big chunk of work though.
You won't be surprised to hear that I don't think it can go in without some support for legacy “mystery layouts”, and structs like PyListObject will probably still need to be exposed as structs even if they use the new layout (it's in the tutorial, people use it).
You might want to separate adding the new API and deprecating/removing the old stuff, unless you want to somehow switch in a single release.

There might be some overlap between “legacy mystery layout” and “Primary”. Perhaps it would work to allow multiple Primaries in the MRO, with single inheritance only, if the subclass has “intimate knowledge” of the base's primary struct?
With that, this would be pretty much backwards-compatible (though it wouldn't help that much with changes to the core object fields, and I suspect that's part of your motivation).

The proposal looks essentially compatible with PEP 697:

  • PyObject_GetTypeData would be straightforward, using PyObject_GetStructOffset
  • PyType_GetTypeDataSize might not be straightforward, but should be stored/computable, and will probably be useful on its own
  • Py_TPFLAGS_ITEMS_AT_END would be unused, or set automatically if that would be useful
  • Py_RELATIVE_OFFSET would work nicely
  • PyObject_GetItemData could work with the new layout, rather than just with Py_TPFLAGS_ITEMS_AT_END -- perhaps the VM would need a bit more info, or even manage “size & items” (struct tuple_data), while allowing non-managed variable-sized types like int
  • negative tp_basicsize would add a struct to the appropriate place.

The offset of the start of primary structs is an unstable API constant

Could we have a PyObject_GetPrimaryStruct() function, and treat PyUnstable_PRIMARY_STRUCT_OFFSET as a lower-level thing? Let's abstract the raw pointer addition away where we can.

@encukou
Copy link

encukou commented Feb 23, 2023

HasLength: The objects have length, e.g. list, tuple
VariableSized: The struct is variable sized, e.g. tuple (but not dict or list). Implies HasLength

I read that as:

  • HasLength enables Py_SIZE.
  • VariableSized enables PyObject_NewVar, and implies that Py_SIZE stores the length

Considering a recent Discourse topic, we might want:

  • HasLength enables Py_SIZE.
  • VariableSized | HasLength enables PyObject_NewVar, and Py_SIZE (backed by a VM-managed field) stores the length.
  • VariableSized only enables PyObject_NewVar, but the user manages the size field (letting them e.g. pack additional bits into it).

@markshannon
Copy link
Member Author

markshannon commented Feb 23, 2023

You won't be surprised to hear that I don't think it can go in without some support for legacy “mystery layouts”,

Of course, everything we do needs to support mystery legacy [insert feature here]. It's just one bit of information, so shouldn't be a problem.

There might be some overlap between “legacy mystery layout” and “Primary”.

Maybe, although I'm inclined to just treat "legacy mystery layout” as a black box to start with.

Could we have a PyObject_GetPrimaryStruct() function, and treat PyUnstable_PRIMARY_STRUCT_OFFSET as a lower-level thing?

Sure, we can make PyUnstable_PRIMARY_STRUCT_OFFSET part of the "unlimited" API, and add PyObject_GetPrimaryStruct() to the limited API.

@markshannon
Copy link
Member Author

The proposal looks essentially compatible with PEP 697

I think there is a difference when subclassing a class of variable sized objects, V (eg. tuple) and another non-primary, non-variable class, C
In PEP 697, the data for C goes after the data for V, which means that the offset is a function of the MRO and the object.
In this proposal, the data for C goes before the object header, so is a function of just the MRO, meaning that it's fields can be accessed as fast any other slot (assuming an adaptive optimizer).

@markshannon
Copy link
Member Author

I read that as: HasLength enables Py_SIZE.

It is functionally equivalent, yes.
But Py_SIZE is broken, it makes an unchecked cast and hopes for the best. We can do better.
int is variable sized, but Py_SIZE gives the wrong value.

VariableSized enables PyObject_NewVar, and implies that Py_SIZE stores the length

No, because we can have a primary class and a variable sized class in an MRO, which means that variable sized data will not be in the primary position.

@markshannon
Copy link
Member Author

Regarding the size of variable sized structs:
We can reasonably restrict the size to confirm to the following equation: base_size + item_count * item_size, where base_size and item_size are fixed when the class is created.

However, we need to allow classes to compute item_count, as they might not store it directly.
So we need another flag, say HasManagedLength which delegates the storage of item_count to the VM for most classes that don't do anything funny with the item_count.
We also need to allow classes to provide their own get_item_count() functions.

@encukou
Copy link

encukou commented Feb 23, 2023

In PEP 697, the data for C goes after the data for V, which means that the offset is a function of the MRO and the object.

PEP 697 primarily specifies an API. It also, necessarily, describes how the API will work with (a subset of) the current class layouts.
The API can work with the layout proposed here.

Maybe, although I'm inclined to just treat "legacy mystery layout” as a black box to start with.

Maybe you can do that, but you can't ignore it entirely.
How does inheritance work between old black boxes and the new layout?

But Py_SIZE is broken, it makes an unchecked cast and hopes for the best. We can do better.

So, what happens to the Py_SIZE API? It's removed because it's not correct in some cases?

[edit: you just answered the following]

No, because we can have a primary class and a variable sized class in an MRO, which means that variable sized data will not be in the primary position.

There are two use cases:

  • Python knows where and how the size data is stored, so it can e.g. provide a correct default __sizeof__
  • Only the class knows where and how the size data is stored, so it can store it in creative ways

Does this proposal enable one of these, or both?

@encukou
Copy link

encukou commented Feb 23, 2023

What exactly does HasLength mean?

@grothesque
Copy link

Hello, I became aware of this discussion through https://discuss.python.org/t/24136 and would like to chime in on one aspect. Please excuse me if I have overseen other important aspects.

@markshannon wrote:

* HasLength: The objects have length, e.g. `list`, `tuple`

* VariableSized: The struct is variable sized, e.g. `tuple` (but not `dict` or `list`). Implies `HasLength`

Please note that having a length in the sense of __len__ and being a variable sized PyObject in the sense of PyVarObject are two entirely independent concepts.

The best example is the built-in PyLong which has no length in the sense of __len__ at all, but is still variable length. What's more, it doesn't store that length in the ob_size field of PyVarObject. (Instead it uses the sign that field to store the sign of the underlying number, and the absolute value to store the length.)

For other examples, consider a type representing an (immutable) tree, or a small multidimensional array: both can be usefully represented as variable size PyObjects that either have no __len__ or a __len__ that is unrelated to the size of their data payload.

That's why in a redesign of python object layout it may be worthwhile to consider getting rid of any equivalent of today's Py_SIZE and Py_SET_SIZE. It seems that these have very little use (providing a default implementation of __sizeof__?), and are actually "abused" even by built-in types to store payload.

@encukou
Copy link

encukou commented Oct 31, 2023

Declaring fields
If we want to expose a C field to the VM, we need to declare its type, and offset.
In addition we want to declare various attributes, much as are already declared in PyMemberDef
One additional capability we would like to add, that PyMemberDef doesn't have, is to have fields that can be directly read, but have a function setter, which suggests the following struct:

struct PyAttributeDef {
    const char *name;
    const char *doc;
    uint32_t offset;
    int16_t flags;
    uint8_t type;
    getter get;
    setter set;
};

This effectively merges PyMemberDef and PyGetSetDef. Yay for deduplication!

What's missing is a void* arg that's passed to the getter/setter.
Instead of that we could always pass a pointer to the declaring class (edit: and the PyAttributeDef), and make it easy to store C data on a class object.

@gvanrossum
Copy link
Collaborator

What's missing is a void* arg that's passed to the getter/setter.
Instead of that we could always pass a pointer to the declaring class, and make it easy to store C data on a class object.

Could you remind us of the use case for this?

@encukou
Copy link

encukou commented Oct 31, 2023

Telling multiple similar getters/setters apart.
It's general good practice to add such a context for callbacks. Here it looks like it'd be handy when the class is auto-generated -- you don't want to generate JIT-style parametrized functions. But yes, I'd need to check if it's actually used in the wild.

@encukou
Copy link

encukou commented Apr 10, 2024

See also: making struct _typeobject a plain C struct -- #672

@Raekye
Copy link

Raekye commented Jun 6, 2024

Could the "legacy mystery layout" of PyDictKeysObject, which has two variable-sized items/arrays (indices, and entries), be supported?

We stated above that if the "Grand Unified Python Object Layout" could not support classes like tuple or dict, it would not be reasonable to expect C extensions to use it.

I find the design of dict (/the underlying table) really nice, but it's not obvious to me how it would be implemented using this API. It feels like a general solution would need to do a lot of back-bending to accommodate this kind of layout

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants