Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bootstrappable Builds #8929

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Bootstrappable Builds #8929

wants to merge 1 commit into from

Conversation

tobtoht
Copy link
Contributor

@tobtoht tobtoht commented Jul 2, 2023

This PR proposes to replace Gitian with Guix to achieve bootstrappable builds for release binaries.

What?

https://youtube.com/watch?v=I2iShmUTEl8

If you have 15 minutes, please consider watching this presentation by Carl Dong linked above as it covers the migration (Gitian → Guix) this PR proposes to make. It explains why reproducibility alone is not enough.

Other resources:

Why?

  • Greater build system security: Bootstrappability allows us to audit and reproduce our toolchain instead of blindly trusting binary downloads. Our build environment can be built from source, all the way down. It allows us to reduce our supply chain attack surface by only including the packages that we need, and nothing else.

  • Easier to set up: Guix runs on any Linux distribution and on most architectures (x86_64, aarch64, riscv64). To produce reproducible release binaries, you only need to install Guix and run the build script. This hopefully leads to more participants in the pre-release verified reproduction process.

  • More flexibility: Unlike Gitian, we are not limited to the package set of a particular Ubuntu version. Guix allows us to pick and choose our toolchains. This allows us to use the latest compilers while targeting older versions of glibc and would, for example, make it easier to add a modern Rust compiler to our build environment. Packages that are not available in Guix can easily be defined in the manifest or upstreamed.

  • Better developer UX: Guix allows us to modify any detail about our build environment with ease. Debugging build issues takes less time because we have shell access to the build environment. The monero source code is bind mounted into the container, so edits to package definitions can be tested incrementally.

  • Future proof: Guix is actively developed, Gitian is in maintenance mode (serious bug fixes only).

How to test?

Requirements:

  • any Linux distribution
  • 50 GB of free disk space
  • 4 or more cores recommended
  • 2 GB RAM per thread
# Install guix
$ apt install guix  # ubuntu 22.04, debian 11, or later 
or 
# use the install script: https://guix.gnu.org/manual/en/html_node/Binary-Installation.html

# Clone the repo
$ git clone https://github.com/tobtoht/monero
$ cd monero
$ git checkout guix

# Run the builds
$ ./contrib/guix/guix-build  # this will take several hours, uses all available cores
or
$ JOBS=N ./contrib/guix/guix-build # replace N with the desired maximum number of threads

# Obtain a list of hashes
$ uname --machine && find guix/guix-build-$(git rev-parse --short=12 HEAD)/output/ -type f -print0 | env LC_ALL=C sort -z | xargs -r0 sha256sum

For more information, see README.md

Notes

Environment Guix version (NEW) Gitian version (OLD)
Guix 53396a22af (28 Aug 2024) n/a
Glibc 2.27 2.27
Binutils 2.38 2.30
GCC 12.4.0 7.5.0
Clang 11.1.0 (darwin, freebsd) 9.0.0
Linux headers 6.1.106 (LTS)* 5.4.0

* this does not affect run-time kernel requirements, see glibc docs.

The guix.yml workflow does not use caching because it uses too much space and causes evictions for other caches (10 GB limit). We only need to run it when there is a change to contrib/{depends,guix}, which happens infrequently and doesn't need to complete quickly.

This PR removes native_clang because Guix provides us with a bootstrapped package and non-guix builds can use the system clang. This PR introduces backwards incompatible compiler flags, bumping the minimum Clang version to 10 (up from 9) for depends builds. Clang >=10 is available in the package managers of all supported distros (including Ubuntu 18.04).

The minimum glibc version for all Linux targets is now 2.27 (Ubuntu 18.04, Debian 10), for more details see #9171 (comment).

The source archive now includes all submodules.

To-do

  • All binaries tested

Q&A

What role does Guix serve in the build process?

Guix is used to set-up a reproducible, bootstrappable, containerized build environment. This replaces the Ubuntu 18.04 based environment we used for Gitian builds.

Monero dependencies are built using the depends build system inside the container provided by Guix. While Guix could eventually replace depends, it would be too drastic for this PR.

Can Guix run on any Linux distribution?

Yes, with few exceptions. Installing Guix on an immutable distro like Fedora SilverBlue might not be possible.

To install Guix, use the official install script: https://guix.gnu.org/manual/en/html_node/Binary-Installation.html

Can Guix run on macOS or Windows?

Not natively. You can run Guix builds on macOS inside a virtual machine (even on Apple Silicon). On Windows, it should be possible to install Guix on WSL2.

Can Guix run on aarch64 / riscv64 machines?

Yes. It even produces the same bit-identical release binaries as Guix run on a x86_64 machine. This isn't possible with Gitian.

Targets that don't currently build on non-x86_64: Android, FreeBSD.

Are Guix builds slower than Gitian?

Guix builds are only slower the first time, because it needs to build toolchains (or the entire package graph if --no-substitutes is used) from source. Subsequent runs just as fast (or faster) than Gitian due to caching.

Why Guix instead of Nix?

Nix packages are not bootstrappable.

What does bootstrappable mean in this context?

It means that the package graph for all packages included in our build environment is rooted in a single 357-byte heavily annotated program. We no longer have to trust binaries (downloaded from Ubuntu servers) for reproducible builds.

For more information: https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-building-from-source-all-the-way-down/

If a new version of Guix is released, does that mean our build environment is no longer reproducible?

No, builds remain reproducible indefinitely, assuming the source code for all packages is archived and remains available. The exact version of Guix, including all of its packages is pinned using time-machine.

More information:

How do we define which packages are included in the build environment?

In manifest.scm. In this file, we can choose packages that are already available in Guix or define packages ourselves.

When should we define packages in the manifest?

If a package, specific package version, or toolchain isn't available in Guix and it isn't possible to upstream or we can't we can't wait on that.

Is the build environment identical for all targets?

No, extra packages can be added for each target.

If we change the time-machine commit, does that mean our build environment changes?

Yes. The manifest generally does not specify which packages versions are included in the build environment. If a package was updated in Guix between the old and new commit, we will have the updated version in our build environment. Updating the time-machine can therefore cause breakage and should be done with caution.

How do I get shell access to the build environment for development purposes?

In contrib/guix/guix-build replace bash -c "cd /monero && bash contrib/guix/libexec/build.sh" with bash.

Then invoke with: FORCE_DIRTY_WORKTREE=1 ./contrib/guix/guix-build.

Where to start review?

contrib/guix/README.md

contrib/guix/guix-build

contrib/guix/manifest.scm

contrib/guix/libexec/build.sh

Troubleshooting

  • Might segfault during compilation on first generation Ryzen processors due to a hardware fault.
  • Ubuntu 24.04: If you run into mount "none" on "/tmp/guix-directory.": Permission denied you need to create an apparmor profile for Guix, see: Bootstrappable Builds #8929 (comment)

Status

This PR is ready for testing and review.

Target Builds Tested
x86_64-linux-gnu 🮱 🮱
arm-linux-gnueabihf 🮱
aarch64-linux-gnu 🮱 🮱
riscv64-linux-gnu 🮱 🮱
i686-linux-gnu 🮱 🮱
x86_64-w64-mingw32 🮱 🮱
x86_64-unknown-freebsd 🮱 🮱
x86_64-apple-darwin 🮱
aarch64-apple-darwin 🮱 🮱
arm-linux-androideabi 🮱
aarch64-linux-android 🮱 🮱

Follow-up PRs

@0xFFFC0000
Copy link
Collaborator

I cannot wait until we finalize and merge this. Thanks for your great work @tobtoht

@tobtoht
Copy link
Contributor Author

tobtoht commented Feb 25, 2024

I don't expect to make major changes to this PR, it's ready for testing. Reviews on any of the sub-PRs would help move this forward.

contrib/depends/Makefile Outdated Show resolved Hide resolved
#!/bin/bash
if [ "$1" != "-cc1" ]; then
- `dirname $0`/clang{version} {flags} "$@"
+ env -u C_INCLUDE_PATH -u CPLUS_INCLUDE_PATH -u OBJC_INCLUDE_PATH -u OBJCPLUS_INCLUDE_PATH -u CPATH -u LIBRARY_PATH `dirname $0`/clang{version} {flags} "$@"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to unset these environment variables here because they refer to Guix profile packages. We want clang to only include headers from the NDK. We can't unset them in build.sh because native_ package builds would fail to find the necessary include headers. Unsetting these variables does not affect non-Guix builds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch will be removed in #9456

@@ -0,0 +1,22 @@
Without ffile-prefix-map, the debug symbols will contain paths for the
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Building on aarch64 should produce bit-for-bit identical release binaries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It now does for linux, windows, and macos targets. FreeBSD and Android targets use pre-compiled x86_64 toolchains and libraries.

contrib/guix/manifest.scm Outdated Show resolved Hide resolved
contrib/guix/libexec/build.sh Show resolved Hide resolved
contrib/guix/libexec/build.sh Show resolved Hide resolved
contrib/gitian/docker/
contrib/gitian/sigs/
# guix
/guix
Copy link
Contributor Author

@tobtoht tobtoht Feb 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished builds are available in ./guix/guix-build-<COMMIT>/

├── guix-build-222ea1a30058
│   ├── build
│   │   └── distsrc-222ea1a30058-x86_64-linux-gnu  # build directory
│   ├── logs
│   │   └── x86_64-linux-gnu  # various logs for reproducibility debugging
│   │       ├── depends-hashes.txt
│   │       ├── depends-packages.txt
│   │       ├── guix-env.txt
│   │       ├── guix-hashes.txt
│   │       └── SHA256SUMS.part
│   ├── output
│   │   └── x86_64-linux-gnu
│   │       └── monero-x86_64-linux-gnu-222ea1a30058.tar.bz2  # binary tarball
│   └── var
│       ├── precious_dirs  # directories to keep when running `./contrib/guix/guix-clean`
│       └── profiles

@@ -267,4 +275,4 @@ $(foreach package,$(all_packages),$(eval $(call int_config_attach_build_config,$
$(foreach package,$(all_packages),$(eval $(call int_add_cmds,$(package))))

#special exception: if a toolchain package exists, all non-native packages depend on it
$(foreach package,$(packages),$(eval $($(package)_unpacked): |$($($(host_arch)_$(host_os)_native_toolchain)_cached) ))
$(foreach package,$(packages),$(eval $($(package)_extracted): |$($($(host_arch)_$(host_os)_native_toolchain)_cached) ))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_unpacked target doesn't exist.

contrib/guix/manifest.scm Outdated Show resolved Hide resolved
contrib/guix/libexec/build.sh Show resolved Hide resolved
Comment on lines +3 to +22
$(1)_cc=$$($$($(1)_type)_CC)
$(1)_cxx=$$($$($(1)_type)_CXX)
$(1)_objc=$$($$($(1)_type)_OBJC)
$(1)_objcxx=$$($$($(1)_type)_OBJCXX)
$(1)_ar=$$($$($(1)_type)_AR)
$(1)_ranlib=$$($$($(1)_type)_RANLIB)
$(1)_libtool=$$($$($(1)_type)_LIBTOOL)
$(1)_nm=$$($$($(1)_type)_NM)
$(1)_cflags=$$($$($(1)_type)_CFLAGS) \
$$($$($(1)_type)_$$(release_type)_CFLAGS)
$(1)_cxxflags=$$($$($(1)_type)_CXXFLAGS) \
$$($$($(1)_type)_$$(release_type)_CXXFLAGS)
$(1)_arflags=$$($$($(1)_type)_ARFLAGS) \
$$($$($(1)_type)_$(release_type)_ARFLAGS)
$(1)_ldflags=$$($$($(1)_type)_LDFLAGS) \
$$($$($(1)_type)_$$(release_type)_LDFLAGS) \
-L$$($($(1)_type)_prefix)/lib
$(1)_cppflags=$$($$($(1)_type)_CPPFLAGS) \
$$($$($(1)_type)_$$(release_type)_CPPFLAGS) \
-I$$($$($(1)_type)_prefix)/include
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delay expansion of package variables.

@@ -0,0 +1,63 @@
name: ci/gh-actions/guix

on: [push, pull_request]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed guix and depends package caching from the CI workflow because it uses too much space and causes evictions for other caches (10 GB limit). We only need to run it when there is a change to contrib/{depends,guix}, which happens infrequently and doesn't need to complete quickly.

on: [push, pull_request] is temporary for testing.

path: contrib/depends/sources
key: sources-${{ hashFiles('contrib/depends/packages/*') }}
- name: install dependencies
run: sudo apt update; sudo apt -y install guix git ca-certificates
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth investigating if using the installer script is faster.

https://guix.gnu.org/manual/en/html_node/Installation.html

(("-rpath=") "-rpath-link="))
#t))))))))

(define-public glibc-2.27
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The minimum glibc version for all Linux targets is now 2.27 (Ubuntu 18.04, Debian 10).

For more context, see: #9171 (comment)

contrib/guix/libexec/build.sh Outdated Show resolved Hide resolved
contrib/depends/packages/darwin_sdk.mk Show resolved Hide resolved
Comment on lines -81 to +84
SET(CMAKE_C_COMPILER @prefix@/native/bin/clang)
SET(CMAKE_C_COMPILER @CC@)
SET(CMAKE_C_COMPILER_TARGET ${CLANG_TARGET})
SET(CMAKE_C_FLAGS_INIT -B${_CMAKE_TOOLCHAIN_PREFIX})
SET(CMAKE_CXX_COMPILER @prefix@/native/bin/clang++ -stdlib=libc++)
SET(CMAKE_CXX_COMPILER @CXX@ -stdlib=libc++)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're now using the system / guix-provided clang for darwin builds.

Comment on lines +87 to +88
SET(CMAKE_ASM_COMPILER clang)
SET(CMAKE_ASM-ATT_COMPILER as)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CMake automatically reduces CMAKE_CXX_COMPILER to the first command to derive CMAKE_ASM_COMPILER. This happens to be env instead of clang, so we need to set it explicitly.

contrib/guix/libexec/build.sh Show resolved Hide resolved
Comment on lines -49 to +57
final_build_id_long+=$($(package)_build_id_long)
final_build_id_long+=:[recipe]:$(1)-$($(1)_version)-$($(1)_recipe_hash)-$(release_type):[deps]$(foreach dep,$($(1)_build_id_deps),$(shell echo ":$(dep)")):[$($(1)_type)_id]:$($($(1)_type)_id_string):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Increase verbosity of final_build_id_long to ease debugging reproducibility issues.

Comment on lines +103 to +104
build_id_string:=$(realpath $(GUIX_ENVIRONMENT))
$(host_arch)_$(host_os)_id_string:=$(realpath $(GUIX_ENVIRONMENT))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any change to manifest.scm invalidates the cache.

Copy link
Collaborator

@0xFFFC0000 0xFFFC0000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i686-linux-gnu Result:

Compiled with guix for HOST=i686-linux-gnu.
Unit tests ran without issue on i686-linux-gnu.

Copy link
Collaborator

@0xFFFC0000 0xFFFC0000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aarch64-linux-gnu Result:

Compiled with guix for HOST=aarch64-linux-gnu.
Unit tests ran without issue on aarch64-linux-gnu.

Copy link
Collaborator

@0xFFFC0000 0xFFFC0000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x86_64-unknown-freebsd Result:

Compiled with guix for HOST=x86_64-unknown-freebsd.
Unit tests ran without issue on x86_64-unknown-freebsd.

Copy link
Collaborator

@0xFFFC0000 0xFFFC0000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aarch64-apple-darwin Result:

Compiled with guix for HOST=aarch64-apple-darwin .
Unit tests ran without issue on aarch64-apple-darwin.

Copy link
Collaborator

@0xFFFC0000 0xFFFC0000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this.

build-macos [1] and build-windows [2] failures are unrelated and has been addressed in master.

  1. https://github.com/monero-project/monero/actions/runs/10662553939/job/29549974869?pr=8929
  2. https://github.com/monero-project/monero/actions/runs/10662553939/job/29549975090?pr=8929

@tobtoht
Copy link
Contributor Author

tobtoht commented Oct 23, 2024

@iamamyth Are you perchance capable and willing to give this PR a full review when time permits?

This PR is the foundation for #9440, which will allow us to ship FCMP++, and about a dozen other PRs. It is critical that build system changes are merged ahead of time to avoid unnecessary delays in the hardfork timeline.

@iamamyth
Copy link

@tobtoht I think I'll have time to review this within the next week (i.e. 7 days). I've already been fiddling with the build system on a local branch (have code ready to automate make job count, but don't want to PR it because there's a huge backlog as-is) and inspecting other build system PRs, so I should have enough context.

echo "Looking for build output SHA256SUMS fragments in ${LOGDIR_BASE}"

shopt -s nullglob
sha256sum_fragments=( "$LOGDIR_BASE"/*/SHA256SUMS.part ) # This expands to an array of directories...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about just using find here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sha256sum_fragments=($(find "$LOGDIR_BASE" -name "SHA256SUMS.part"))

would split on potential white spaces in the path. Is there a better way?

Copy link

@iamamyth iamamyth Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, didn't get a chance to finish my thoughts on this file earlier: I think you can structure the pipeline a bit differently by doing separate traversals for each file category (adds a dir scan to hopefully simplify the logic). Then, you can use the find -print0 | xargs -0 pattern, and your error handling becomes a check on whether the outputs are empty; partial example:

find "$LOGDIR_BASE" -path "$LOGDIR_BASE/*-codesigned" -prune -o -name "SHA256SUMS.part" -print0 | xargs -0 sort -u /dev/null | basenameify_SHA256SUMS | sort -k2

Because nocodesigned fragments are mandatory, you end up with a pretty nice overall program flow:

  1. Process/validate nocodesigned fragments (error if empty file).
  2. Process/validate codesigned fragments (info if empty file).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simplified the script based on your feedback. I also removed the {non,}codesigned logic, since I don't think we have plans to add codesigning for release binaries in the near future. Should that change, we can add the logic then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to adjust guix-verify based on these changes, will do that later.

@iamamyth
Copy link

Just an FYI, I'm ~70% done reviewing this PR. I'll post a summary comment when I complete my first pass review, but, so, far, everything looks good.

###########################

# CFLAGS
HOST_CFLAGS="-O2"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this would be clearer if you just had a * in the case statement and pushed these two lines into the case, where appropriate. Is the find command even helpful on non-linux/gnu hosts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cleaned this up. We have logic to set the optimization level for Release builds inside CMakeLists.txt, so setting it here is redundant/unnecessary.

contrib/guix/guix-build Outdated Show resolved Hide resolved
#
under_dir() {
local path_residue
path_residue="${2##"${1}"}"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is easy to get wrong and could use a test suite: For safety, you'd want to check that the child starts with "$parent/". Otherwise, as it stands, you get results such as:

$ under_dir /tmp /tmpx/config.py  
x/config.py

Also, while this function name is clear enough in the local context, if you end up making it a library function, I'd consider something such as path_contains, to emphasize that it's just testing path-based containment, not actual containment. As for testing, you could just put this function in its own file, then source it here and via the test (dumping it into one huge library has the disadvantage that, as the library grows, the set of false implicit dependencies grows, which makes scripts more bloated).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other thing: Normally it's helpful to, in addition to adding a "/" to parent, support the case where parent already contains one (pretty easy to do by e.g. dropping all trailing slashes and then adding one).

done
}

check_tools cat env readlink dirname basename git
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused as to the policy here: Other files also appear to check for a superset of these commands. It would make sense to me if the prelude checked for tools required by the prelude (realpath, cat, git), and each script checked for the tools it needs.

contrib/guix/guix-verify Outdated Show resolved Hide resolved
@iamamyth
Copy link

iamamyth commented Nov 3, 2024

Full review complete. The only concerning issue I saw was the under_dir logic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants