Build failure with 0.11.0 release on MacOS #373
Comments
Can you try one of the other allocators? I believe there is a jemallocator issue on 10.15 so may be the same for 10.14 |
@ryancinsight tried both with |
Strange...I put a pull request in last night and it appears to use macOS 10.15 and it should build bare packages for tests... |
Thanks for the report. This is an unintended regression of some kind in the Python distributions being used. I ran into all kinds of errors like this with the Python 3.9 distributions. But here we have a failure in the 3.8 distributions, which is mildly surprising.
I think I have enough info to track this down (the missing symbol names are a giant hint). But I don’t have a readily available <10.15 environment to use, so I may need help on this. I’ll let you know.
As a workaround, you can probably use an older Python distribution found from https://github.com/indygreg/python-build-standalone/releases using https://pyoxidizer.readthedocs.io/en/stable/config_type_python_distribution.html#config-python-distribution-init instead of `default_python_distribution()`. If you tell me the last working release, that could potentially help narrow my search!
|
So, apparently the last working distribution on MacOS 10.14 is this one: https://github.com/indygreg/python-build-standalone/releases/tag/20201020 @indygreg in case you need more tests, just ask :) |
That The build environment is supposed to target SDK 10.9+. And the Python 3.8 build environment in CI is supposed to use the 10.15 SDK (Python 3.9 uses the 11.1 SDK). That creates the question: where is the reference to the 11.0 SDK coming from? I do know the problem is on my end because Out of curiosity, what happens when you use the new default 3.9.2 distribution? If you get an |
Here's an example log line for the build of
As we can see But even if we are getting the 11.0 SDK,
So, uh, the rabbit hole goes deeper. |
I taught the distribution validator in python-build-standalone to sniff for this symbol and the results are peculiar:
It finds the symbol in |
And with validation of object files within archive files (read: static libraries):
There's our missing Now to figure out how this 11.0+ SDK symbol is getting introduced in the first place... |
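The kind of check described above can be approximated with a crude byte search. This is only a sketch: it assumes the symbol in question is the `__darwin_check_fd_set_overflow` named later in this thread, and it scans raw file bytes rather than parsing Mach-O symbol tables the way a real validator would. The function name and the suffix list are made up for illustration.

```python
from pathlib import Path

# Assumption: this is the symbol being hunted (it is named later in the thread).
SYMBOL = b"__darwin_check_fd_set_overflow"


def files_referencing(root, symbol=SYMBOL, suffixes=(".o", ".a", ".dylib", ".so")):
    """Return sorted relative paths under `root` whose raw bytes contain `symbol`.

    Static archives (.a) embed their member object files, so a plain byte
    search also catches references inside archive members.
    """
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            if symbol in path.read_bytes():
                hits.append(path.relative_to(root).as_posix())
    return sorted(hits)
```

A byte search cannot tell a weak reference from a strong one or from a definition, so it is only useful for narrowing down which files to inspect with a proper symbol table dump.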
@indygreg here it is, I would say kinda the same:
|
I confirmed that all object files with a reference to Perhaps the linker you are using (possibly an old one due to running on older macOS) doesn't support weak symbols? Or maybe Rust is invoking the linker to not allow weak linking (but I don't see a flag for that)? Regardless, the symbol shouldn't be introduced in the first place on Python 3.8 builds because we're using the 10.15 SDK and that symbol isn't defined until the 11.0 SDK. Somehow the build environment is actually using an 11.0+ SDK. I thought indygreg/python-build-standalone@5b01386 is related. |
I wonder if we should be linking with But I thought weakly referenced symbols would bypass the undefined symbol validation in the linker because... that's essentially how weakly referenced symbols work! We also have plenty of other weakly referenced symbols in the binaries and I'm pretty sure a lot of them aren't available in 10.9-10.15 SDKs. So why is your linker only complaining about the symbol introduced by the 11.0 SDK? I have a CI build at https://github.com/indygreg/python-build-standalone/actions/runs/625312878 that adds |
Our custom Clang build in CI is 100% using the 11.1 SDK:
These sysroots are almost certainly being baked into and used by the compiler, despite |
OK. So the way mach-o does weak symbols, the linker needs to find the symbol at link time so it can record the dylib it is present in so this information can be written out. I think this means that if we include a weakly referenced symbol from SDK version X, we'll need SDK version X present at link time ( I suspect we were getting lucky about this situation before because:
When PyOxidizer picked up Python distributions using weak linking, we effectively imposed a requirement that a modern SDK is used with PyOxidizer. That SDK must be at least as new as whatever the Python distribution [using weak symbols] was built with. That's currently the 11.1 SDK (although 11.0 should also work as I don't believe there are any new symbols in use from 11.1). (But again I thought we were building Python 3.8 with the 10.15 SDK and I may want to fix that.) So I think a workaround here is to build against the 11.0/11.1 SDK. Try downloading the 11.1 SDK and doing the following:
(You can find a copy of the Apple SDKs in the [legally dubious] https://github.com/phracker/MacOSX-SDKs/ repo if you don't want to install Xcode or the command line tools to get it.) (Edited to fix |
@ronaldoussoren I figured you may "enjoy" this issue. Don't feel obliged to comment. But if you want to, your wisdom would be greatly appreciated. |
I don't really have anything to add, in particular because I know nothing about PyOxidizer (other than knowing it exists and having some vague notion about what it does). As others mentioned The deployment target for the installers on python.org is macOS 10.9, but that's for the compiled binaries. AFAIK those installers don't ship separate We switched to the current setup with weak linking and building with a recent SDK for two reasons: (1) M1 support requires using Xcode 12 (and its included SDK), and (2) recent compilers generate better code and can use better optimisation features (LTO and PGO) which we couldn't use with the compilers on macOS 10.9. This also allowed us to switch to a newer and much better release of Tcl/Tk. |
@indygreg quick update, tried with both 11.0 and 11.1 SDK, but it seems to be not enough to make the build successful:
Question: wouldn't it make sense to build |
Thanks for chiming in @ronaldoussoren! The tl;dr for PyOxidizer is that it uses Python distributions produced from https://github.com/indygreg/python-build-standalone which also target 10.9+ on x86_64 (11.0+ on aarch64). It uses modern Clang and all 3rd party dependencies are static libraries. In addition to the Python installs in the archives, there are also object files plus a lot of metadata describing dependencies. PyOxidizer takes the object files and links a custom libpython and embeds that into a built [Rust] binary, effectively allowing you to build your own Python distribution. I wasn't aware of the mach-o requirement that weak references be discoverable at build time (unlike ELF). In hindsight, I had been playing whack-a-mole with missing symbols errors over the months due to this. But I never realized because CPython didn't weak link until 3.9, PyOxidizer used 3.8 as the default distribution, and we got lucky with a combination of using an older SDK to build and users having modern SDKs on their machines. |
The plot has thickened. tl;dr the MacOSX10.15 SDK is buggy. I pushed a build to CI introducing
No references to the MacOSX platform SDK outside of /Applications/Xcode_12.1.1.app. That's good, as according to https://github.com/actions/virtual-environments/blob/main/images/macos/macos-10.15-Readme.md#installed-sdks the Xcode 12.1.1 install should only have the 10.15 SDK. I confirmed this by pushing a minimal GitHub Action to inspect the filesystem:
But, I pushed a separate custom GitHub Action job that effectively does
(This is partial output because Here are just the MacOSX SDKs with that function declaration:
We also see a reference to the symbol in the Xcode 12.1.1 / SDK 10.15 libSystem.tbd:
I ran If you look at So as far as I can tell Apple shipped buggy MacOSX 10.15 SDKs by failing to add API availability annotations to So it appears the 10.15 SDK is (forever?) buggy and isn't safe for some use cases, including PyOxidizer's. I'll have to think about next steps here. I really want to have weakly referenced symbols in the Python distributions because Python and other dependencies can take advantage of modern Apple APIs (this can have performance implications). But I recognize not everybody has a modern Xcode / SDK sitting around and this could make PyOxidizer harder to use. There's also the issue where @gi0baro couldn't get things to link against the 11.x SDK. But I'm optimistic that's either due to the bugged SDK (I'm unsure if the API availability guards make it into mach-o or what) or not passing the right linker arguments to actually pull in the 11.x SDK frameworks. More investigation is still needed. |
@gi0baro could you please run the following and let me know if you have any success with any of them or if the output looks interesting:
While I'm here, weak references in mach-o files appear to only have their associated library defined in a two-level namespace image indicated by MH_TWOLEVEL in the mach-o header. Concretely, it appears the Now, if you can tell Clang which library each weak symbol is in without Clang insisting on finding it itself (perhaps you can do this with a linker script?), PyOxidizer could parse |
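For anyone wanting to check the `MH_TWOLEVEL` flag mentioned above, here is a minimal sketch that reads it from the header of a thin, little-endian, 64-bit Mach-O file. The constants come from `<mach-o/loader.h>`; the function name is made up for illustration, and universal (fat) binaries and 32-bit images are deliberately not handled.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF  # magic number of a 64-bit Mach-O header
MH_TWOLEVEL = 0x80        # header flag: image uses two-level namespace bindings


def uses_two_level_namespace(data: bytes) -> bool:
    """Return True if the mach_header_64 at the start of `data` sets MH_TWOLEVEL."""
    if len(data) < 32:
        raise ValueError("too short to contain a mach_header_64")
    # mach_header_64 is eight 32-bit fields:
    # magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags, reserved
    fields = struct.unpack_from("<8I", data)
    magic, flags = fields[0], fields[6]
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O file")
    return bool(flags & MH_TWOLEVEL)
```

Usage would be along the lines of `uses_two_level_namespace(open(path, "rb").read(32))` on a built binary.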
Another potential idea is to generate a custom library defining all the weak symbols, have Rust link against that to appease the linker symbol requirements, then re-write the resulting mach-o file so the weak references refer to the appropriate library. (If the binary already has |
See the inline comment and indygreg/PyOxidizer#373 for more context. The goal of this change is to get rid of the reference to __darwin_check_fd_set_overflow to restore working Python distributions for PyOxidizer. This is far from a long-term solution as we need to handle weak symbols more robustly for Python 3.9+. But it seems appropriate for Python 3.8 builds, as the introduction of this symbol was a "regression" when we transitioned to GitHub Actions and apparently introduced the buggy SDK.
can't use
while
|
I just realized
And can you also try throwing |
@indygreg it's just the same, the only difference is that for several other packages I get the error
for several files in the SDK. It seems pretty clear that SDK 11.1 is not compatible with MacOS 10.14 (which kinda makes sense to me). |
Yeah, it looks like the older linker isn't able to read the Relatedly, I wonder what the compatibility story is for Clang versions and Anyway, I updated python-build-standalone to build Python 3.8 with an older, non-bugged version of the 10.15 SDK. The |
Linking _ctypes also requires the macOS 11 SDK if you want to run ctypes on macOS 11. MacOS 11 introduced a shared library "cache" that contains all system shared libraries, _ctypes uses a function introduced in macOS 11 to check if a library is present in that cache. This affects both arm64 and x86_64. |
OK, having researched this a bit more, Apple SDKs use Symbol exports are often just the string name of the symbol. But they can also include special syntax to indicate a weakly referenced symbol and the minimal SDK version that is supported. e.g. The YAML-based The main implication for PyOxidizer is that end-user machines will only be able to link Python distributions if their Clang/linker is able to read We probably want to start annotating SDK version dependencies in the Python distributions. We almost certainly want to be validating minimum SDK / toolchain requirements in PyOxidizer when building so end-users avoid cryptic linker errors. There's still an open question on what the minimum version requirements should be. I'd really like to keep the Python distributions modern. But would requiring a 10.15 or 11.0 SDK / Xcode / Developer command line tools effectively prevent users on older macOS versions from using PyOxidizer? (I haven't kept up on the state of what it is like to run an older macOS version. I know people do it. But I have no clue if Apple supports newer versions of developer tools on older macOS releases.) |
Oh, interestingly the definition of the |
I have committed a I have also committed a Did I need to write the A potential mad scientist idea we could explore is generating custom |
indygreg/PyOxidizer#373 has revealed issues relinking the Python distributions on older Apple SDKs and Clang toolchains. In order to help downstream consumers validate they are using a sufficiently modern SDK and toolchain, this commit adds metadata to the Python distribution to advertise information about the Apple SDK used to build and its targeting settings. As part of this, we teach the Rust validation code to ensure fields are present on Apple triples.
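A downstream check for that metadata could look roughly like the sketch below. The key names are hypothetical placeholders, since the commit message does not list the actual fields, and the "is this an Apple triple" test is a simple substring match chosen for illustration.

```python
# Hypothetical key names: the thread does not say what the real fields are called.
REQUIRED_APPLE_KEYS = (
    "apple_sdk_platform",
    "apple_sdk_version",
    "apple_sdk_deployment_target",
)


def missing_apple_sdk_fields(metadata: dict, target_triple: str) -> list:
    """Return the required Apple SDK keys that are absent or empty.

    Non-Apple triples have no such requirement, so they always yield [].
    """
    if "-apple-" not in target_triple:
        return []
    return [key for key in REQUIRED_APPLE_KEYS if not metadata.get(key)]
```

An empty result means the distribution advertises everything this sketch asks for; anything else is a list of fields to report in an error message.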
The Python distributions now advertise Apple SDK metadata. I have some commits queued up to teach PyOxidizer to use this metadata to automatically locate, validate, and use a compatible Apple SDK. This means we'll be imposing the requirement of the 10.15 SDK for Python 3.8 and the 11.0 SDK for Python 3.9. And the modern SDKs are incompatible with sufficiently old Clang versions. So this effectively bumps the minimum required Clang on Apple platforms. (Although it is unclear which Apple Clang versions support the newer TBD file formats - I may have to download a bunch of old Xcode Commandline Tools packages to find out.) PyOxidizer will respect the I think this solution strikes the appropriate balance between build correctness, build success rate, and end-user experience. I anticipate the only people negatively impacted by this will be those running older macOS versions. While sympathetic to that population, that is a vanishingly small segment of the developer user segment. We can accommodate affected users through documentation and actionable error messages, which I'll try to do. |
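To make the requirement concrete, here is a small sketch of picking a usable SDK from a list of installed SDK directory names. It is not PyOxidizer's actual logic: the minimum versions are the ones stated in the comment above, the `MacOSX<version>.sdk` naming is the usual Xcode layout, and the function name, the "prefer the newest" policy, and the error text are assumptions for illustration.

```python
import re

# Minimum SDK per Python version, as stated in the comment above.
MINIMUM_SDK = {"3.8": (10, 15), "3.9": (11, 0)}

_SDK_NAME = re.compile(r"^MacOSX(\d+)(?:\.(\d+))?\.sdk$")


def pick_sdk(sdk_dir_names, python_version):
    """Return the newest SDK directory name that meets the minimum version."""
    minimum = MINIMUM_SDK[python_version]
    candidates = []
    for name in sdk_dir_names:
        match = _SDK_NAME.match(name)
        if not match:
            continue  # e.g. the unversioned "MacOSX.sdk" symlink
        version = (int(match.group(1)), int(match.group(2) or 0))
        if version >= minimum:
            candidates.append((version, name))
    if not candidates:
        wanted = ".".join(str(part) for part in minimum)
        raise RuntimeError(
            f"no macOS SDK >= {wanted} found for Python {python_version}; "
            "install a newer Xcode or Command Line Tools"
        )
    return max(candidates)[1]
```

Preferring the newest SDK is an arbitrary choice here; picking the oldest one that satisfies the minimum would be just as defensible.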
These patches were added to avoid introducing symbols from modern macOS SDKs because of linking issues when using an older SDK. With the investigation in indygreg/PyOxidizer#373 and the conclusion to require a minimum SDK version to relink, it should be safe to introduce these symbols back into the binaries. I'm pretty certain that the symbol mentioned inline is already being pulled in by CPython 3.9 anyway. So these patches were arguably already out of date before the aforementioned policy change.
@indygreg just a heads up, the one place this might impact people is those who run PyOxidizer on CircleCI (or any CI service), as they don't yet support macOS images with 11.0, so we're stuck on 10.15. We want to keep building with Python 3.9, so for now we're working around this by locking the PyOxidizer version to |
Are you still having trouble with the latest release (0.13.2)? I thought I had sorted out all these macOS issues. The CI for PyOxidizer itself is working just fine on GitHub Actions on 10.15. The only hurdle should be ensuring you have access to a new-enough SDK to build against. But if needed, I can teach PyOxidizer to search for the SDK in non-standard directories when certain CI environment variables are defined to make it more turnkey. Perhaps a new issue should be opened documenting your exact problems on 0.13.2? |
@indygreg ah, I didn't realize it was possible to just pull in the 11.0 SDK on macOS 10.15. Makes sense in hindsight. Are there docs on this somewhere? No worries if not, I'll figure it out! Thanks for the pointer :) |
Usually CI platforms have ~all versions of Apple SDKs available somewhere. e.g. GitHub Actions: https://github.com/actions/virtual-environments/blob/main/images/macos/macos-10.15-Readme.md#xcode PyOxidizer's docs for Apple SDK targeting are at https://pyoxidizer.readthedocs.io/en/stable/pyoxidizer_distributing_macos.html#pyoxidizer-distributing-macos-build-machine-requirements |
The bane of weak symbols on macOS has come back to haunt us. (See indygreg/PyOxidizer#373 for previous battles.)

In #122 we tracked down a runtime failure to the fact that CPython 3.8 didn't properly backport weak symbol handling support. So, if you build with a modern SDK targeting an older SDK (which we do as of 63f13fb), the linker will insert a weak symbol. However, CPython doesn't have the runtime guards and will attempt to dereference it, causing a crash.

Up to this point, our strategy for handling this mess was to stop using symbols on all Python versions when we found one to be causing an issue. This was crude, but effective.

In recent commits, we implemented support for leveraging the macOS SDK .tbd files for validating symbol presence. We can now cross reference undefined symbols in our binaries against what the SDKs tell us is present and screen for missing symbols. This helps us detect strong symbols that aren't present on targeted SDK versions.

For weak symbols, I'm not sure if we can statically analyze the Mach-O to determine if a symbol is guarded. I _think_ the guard is a compiler built-in and gets converted to a function call, or maybe inline assembly. We _might_ have to disassemble if we wanted to catch unguarded weakly referenced symbols. Yeah, no.

In this commit, we effectively change our strategy for weak symbol handling. Knowing that CPython 3.9+ should have guarded weak symbols everywhere, we only ban symbol use on CPython 3.8, specifically x86-64 3.8 since the aarch64 build targets macOS SDK 11, which has the symbols we need. We also remove the one-off validation check for 2 banned symbols. In its place we add validation that only a specific allow list of weak symbols is present on CPython 3.8 builds.

As part of developing this, I found yet more bugs in other programs. CPython had some pragmas forcing symbols to be weak but the pragmas weren't protected by an #if guard. This caused a compiler failure if we prevented the symbols from being defined. libffi was also using mkostemp without runtime guards. I'm unsure if Python would ever call into a function that would attempt to resolve this symbol. But if it does it would crash on 10.9. So we disable that symbol for builds targeting 10.9.
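The two checks described in this commit message boil down to set differences. The sketch below assumes the symbol names have already been extracted from the binaries and from the SDK `.tbd` files by some other tool; the function names are made up, and no real allow list is implied.

```python
def disallowed_weak_symbols(weak_symbols, allow_list):
    """Weakly referenced symbols that are not on the allow list."""
    return sorted(set(weak_symbols) - set(allow_list))


def missing_from_sdk(undefined_symbols, sdk_exports):
    """Undefined symbols in a binary that the targeted SDK does not export."""
    return sorted(set(undefined_symbols) - set(sdk_exports))
```

For instance, with a purely illustrative allow list of `{"_a", "_b"}`, a build that weakly references `_mkostemp` in addition to `_a` would fail validation with `["_mkostemp"]`.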
The bane of weak symbols on macOS has come back to haunt us. (See indygreg/PyOxidizer#373 for previous battles.) In #122 we tracked down a runtime failure to the fact that CPython 3.8 didn't properly backport weak symbol handling support. So, if you build with a modern SDK targeting an older SDK (which we do as of 63f13fb), the linker will insert a weak symbol. However, CPython doesn't have the runtime guards and will attempt to dereference it, causing a crash. Up to this point, our strategy for handling this mess was to stop using symbols on all Python versions when we found one to be causing an issue. This was crude, but effective. In recent commits, we implemented support for leveraging the macOS SDK .tbd files for validating symbol presence. We can now cross reference undefined symbols in our binaries against what the SDKs tell us is present and screen for missing symbols. This helps us detect strong symbols that aren't present on targeted SDK versions. For weak symbols, I'm not sure if we can statically analyze the Mach-O to determine if a symbol is guarded. I _think_ the guard is a compiler built-in and gets converted to a function call, or maybe inline assembly. We _might_ have to disassemble if we wanted to catch unguarded weakly referenced symbols. Yeah, no. In this commit, we effectively change our strategy for weak symbol handling. Knowing that CPython 3.9+ should have guarded weak symbols everywhere, we only ban symbol use on CPython 3.8, specifically x86-64 3.8 since the aarch64 build targets macOS SDK 11, which has the symbols we need. We also remove the one-off validation check for 2 banned symbols. In its place we add validation that only a specific allow list of weak symbols is present on CPython 3.8 builds. As part of developing this, I discovered that libffi was also using mkostemp without runtime guards. I'm unsure if Python would ever call into a function that would attempt to resolve this symbol. But if it does it would crash on 10.9. 
So we disable that symbol for builds targeting 10.9.
It seems that the latest release broke building targets on MacOS.
Context:
The build target succeeds with release 0.10.3.
Output (truncated):
I'm not able to test this on MacOS 11, so it might be working there.
Also, the same target works on Linux.