-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-115999: Specialize STORE_ATTR
in free-threaded builds.
#127838
gh-115999: Specialize STORE_ATTR
in free-threaded builds.
#127838
Conversation
STORE_ATTR
free-threaded builds.STORE_ATTR
in free-threaded builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking this on! This looks like a regression on the default build. I haven't had a chance to dig into it, but I suspect it might either be due to the check that Sam flagged in _STORE_ATTR_INSTANCE_VALUE
or the change to when we read the type version in _Py_Specialize_StoreAttr
. It looks like the richards benchmark is the most heavily affected, so that might be a good isolated benchmark to use for debugging.
Based on my benchmarking my second commit, the regression with Regarding the |
This was actually not hard to fix. I was thinking that |
f920dcd
to
4c484ab
Compare
I rebased on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a couple of small comments inline, but LGTM overall.
* Fix locking for `STORE_ATTR_INSTANCE_VALUE`. Create `_GUARD_TYPE_VERSION_AND_LOCK` so that object stays locked and `tp_version_tag` cannot change. Fix inverted logic bug that caused erroneous deopt. * Fix locking for `_STORE_ATTR_WITH_HINT`. Double check that `_PyObject_GetManagedDict()` hasn't disappeared since we locked the dict. - Pass `tp_version_tag` to `specialize_dict_access()`, ensuring the version we store on the cache is the correct one (in case of it changing during the specalize analysis). - Split `analyze_descriptor` into `analyze_descriptor_load` and `analyze_descriptor_store` since those don't share much logic. Add `descriptor_is_class` helper function. - In `specialize_dict_access`, double check `_PyObject_GetManagedDict()` in case we race and dict was materialized before the lock.
If the type is new and a version tag hasn't yet been assigned, we would fail to specialize it. Use `_PyType_LookupRefAndVersion()` instead of `type_get_version()`, which will assign a version.
Use provided value of `tp_version` to store in cache.
This also fixes the case if the dict is replaced with a different one.
The function is only used in with-GIL builds.
For `specialize_dict_access_inline()`, we need to lock the keys object. * Add `_PyDictKeys_StringLookupSplit` which does required locking and use in place of `_PyDictKeys_StringLookup`. * Change `_PyObject_TryGetInstanceAttribute` to use that function in the case of split keys. * Add `unicodekeys_lookup_split` helper which allows code sharing between `_Py_dict_lookup` and `_PyDictKeys_StringLookupSplit`.
699f4e9
to
6bf9016
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Just a couple of small comments inline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…thongh-127838) * Add `_PyDictKeys_StringLookupSplit` which does locking on dict keys and use in place of `_PyDictKeys_StringLookup`. * Change `_PyObject_TryGetInstanceAttribute` to use that function in the case of split keys. * Add `unicodekeys_lookup_split` helper which allows code sharing between `_Py_dict_lookup` and `_PyDictKeys_StringLookupSplit`. * Fix locking for `STORE_ATTR_INSTANCE_VALUE`. Create `_GUARD_TYPE_VERSION_AND_LOCK` uop so that object stays locked and `tp_version_tag` cannot change. * Pass `tp_version_tag` to `specialize_dict_access()`, ensuring the version we store on the cache is the correct one (in case of it changing during the specalize analysis). * Split `analyze_descriptor` into `analyze_descriptor_load` and `analyze_descriptor_store` since those don't share much logic. Add `descriptor_is_class` helper function. * In `specialize_dict_access`, double check `_PyObject_GetManagedDict()` in case we race and dict was materialized before the lock. * Avoid borrowed references in `_Py_Specialize_StoreAttr()`. * Use `specialize()` and `unspecialize()` helpers. * Add unit tests to ensure specializing happens as expected in FT builds. * Add unit tests to attempt to trigger data races (useful for running under TSAN). * Add `has_split_table` function to `_testinternalcapi`.
STORE_ATTR_INSTANCE_VALUE
,STORE_ATTR_SLOT
,STORE_ATTR_WITH_HINT
). Need a combination of locks and atomics to be safe._Py_Specialize_StoreAttr
. Avoid using borrowed references. Save and store thetp_version_tag
from the beginning of the specialization process since it might change. Use helper functions to update opcode.--disable-gil
builds #115999