Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable CC toolchain to use in-tree builtin includes #23615

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

kellyma2
Copy link

If the toolchain is composed in our build tree and we use an internal (relative) path for the builtin_sysroot that we pass to create_cc_toolchain_config_info, and we pass a value like %sysroot%/usr/include in cxx_builtin_include_directories then we will fail with missing dependency declarations.

Instead of only using the absolute paths for builtin includes, we can use all of them and do an additional matching check after we relativized the paths to the individual include items.

@github-actions github-actions bot added team-Rules-CPP Issues for C++ rules awaiting-review PR is awaiting review from an assigned reviewer labels Sep 13, 2024
If the toolchain is composed in our build tree and we use an
internal (relative) path for the `builtin_sysroot` that we pass to
`create_cc_toolchain_config_info`, and we pass a value like
`%sysroot%/usr/include` in `cxx_builtin_include_directories` then we
will fail with missing dependency declarations.

Instead of only using the absolute paths for builtin includes, we can
use all of them and do an additional matching check after we
relativized the paths to the individual include items.
@kellyma2
Copy link
Author

kellyma2 commented Oct 4, 2024

@lberki or @oquenchil any opinions on this?

@Wyverald Wyverald requested review from comius, trybka and pzembrod and removed request for lberki and oquenchil October 10, 2024 20:08
@pzembrod
Copy link
Contributor

Do I understand correctly: The key effect of this change is that all CcToolchain built_in_include_directories entries would be treated the same, regardless of whether they are absolute or relative paths?

@kellyma2
Copy link
Author

Yes, I believe so. In my case, specifically, I'm using a toolchain that we're constructing as part of our build, so the built-in includes show up in bazel-out

@trybka
Copy link
Contributor

trybka commented Oct 29, 2024

Sorry for the naïve question, but do you have a concrete example or sample toolchain? It seems quite odd to me to have output directories like bazel-out be present in include directories -- having outputs like this appear as inputs but with paths constructed outside of bazel seems potentially problematic, but perhaps I'm missing something simple. (Is this a two stage build? How are you deriving these paths?)

I am trying to dig into the code in question to better understand why this requirement exists in the first place...

@trybka
Copy link
Contributor

trybka commented Oct 29, 2024

I'm also wondering if something other than %sysroot% as a prefix would work better (e.g. %workspace%)

More info on the include dir prefixes here: https://cs.opensource.google/bazel/bazel/+/master:src/main/starlark/builtins_bzl/common/cc/cc_toolchain_provider_helper.bzl;l=103-122;drc=f2318d2cbf38d249d620dc99a11570843306a6fa

@kellyma2
Copy link
Author

Sorry for the naïve question, but do you have a concrete example or sample toolchain? It seems quite odd to me to have output directories like bazel-out be present in include directories -- having outputs like this appear as inputs but with paths constructed outside of bazel seems potentially problematic, but perhaps I'm missing something simple. (Is this a two stage build? How are you deriving these paths?)

tl;dr we're driving an internal hermetic toolchain that's derived from the contents of RPMs. It'll be hard to distill down into some thing trivial, but let me attempt to at least describe the details.

Longer version:

  1. We have a module extension that consumes the RPMs and synthesizes an internal repository for them, generates filegroup(), cc_library(), cc_import(), etc rules using the RPM inputs

  2. We have a separate module extension that can be used to generate the toolchain:

  • we define a sysroot using the RPM-import repository generated rules as inputs in the form of:
ext.sysroot(
   name="foo_sysroot",
   srcs = [
      ... a bunch of filegroups that specified the files from the RPMs ...
   ],
)

which ultimately synthesizes a repository to capture the sysroot

  • we define a toolchain where two of the things specified are (1) the sysroot, (2) the list of builtin include paths relative to the sysroot in the following form:
ext.toolchain(
  name = "bar_toolchain",
  sysroot = "@foo_sysroot//:sysroot",
  builtin_includes = ["%sysroot%/usr/include"],
)

The builtin_includes value is just getting plumbed directly into cxx_builtin_include_directories for create_cc_toolchain_config (via some indirection) and the path to @foo_sysroot//:sysroot gets shovelled in builtin_sysroot (which is ultimately what populates %sysroot%)

The /usr/include part of the builtin_includes is derived from the fact that we know from our RPM inputs that we stuck the headers that we're interested in into usr/include because we literally just explode the RPM file(s) into a tree somewhere. The relative path part of things comes from @foo_sysroot//:sysroot.

I am trying to dig into the code in question to better understand why this requirement exists in the first place...
I'm not clear on this either. The commit log didn't render me any particular insights and I'm still gaining familiarity with the bazel codebase.

@kellyma2
Copy link
Author

I'm also wondering if something other than %sysroot% as a prefix would work better (e.g. %workspace%)

The issue appears to be that it doesn't like relative paths which %workspace% is also going to give us.

@comius
Copy link
Contributor

comius commented Oct 30, 2024

cc @armandomontanez @matts1

@armandomontanez
Copy link

Backing up a bit to the most likely culprit just to double check: have you passed -no-canonical-prefixes (gcc/clang) and -fno-canonical-system-headers (gcc only) as compile flags in your toolchain, also making sure the headers are properly added to all_files and compiler_files? Once you've done that, you usually only need to allowlist truly absolute paths (e.g. /usr/include).

@armandomontanez
Copy link

Separately, I need to stare at this change longer to determine whether or not it's correct but at first glance I think it is.
Before landing this, I'd like to stamp out a minimal reproducer that we can check against (should be pretty easy with the new rule based toolchains). It just happens that in the vast majority of cases the root cause is actually forgetting to use those flags. Some compilers do not support or respect those -no-canonical-prefixes flags, so you do need an escape hatch like this to work with in-tree paths.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-review PR is awaiting review from an assigned reviewer team-Rules-CPP Issues for C++ rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants