You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Add the amdgpu target to rustc that allows to generate code for AMD GPUs.
The LLVM backend has good support for this backend. The goal is to expose this to Rust, enabling Rust as another language on these GPUs. The main target is compute capabilities. This is in contrast to the rust-gpu project, which targets graphics capabilities through spir-v (the graphics variant). The base runtime to run compute programs on AMD GPUs is HSA (Heterogeneous System Architecture), which is implemented in ROCR-Runtime. Therefore, the Rust backend should target the amdhsa OS. To support the target in rustc, LLVM needs to be compiled with the amdgpu backend enabled.
There are two points in which the amdgpu target is different from other (mainstream/x86) targets.
Address spaces
Address spaces can be thought of as denoting different physical memory areas (this is a thought concept, it can be that way in hardware, but it does not need to be). In LLVM IR, each pointer has an address space, defaulting to addrspace(0) (this is implicit in textual IR, which is why you won’t see it there). Different address spaces have different properties, e.g. they can have a different pointer size, the nullptr can be different (e.g. 0 vs -1) and the machine instructions used to access them can be different.
The amdgpu LLVM backend makes heavy use of address spaces. This is also a problem for other targets that want to support Rust (though mostly more exotic ones). The use of address spaces leads to situations, where a pointer in one address space needs to be casted to a pointer in a different address space. In LLVM IR, bitcast is invalid for this case, addrspacecast needs to be used.
The changes to rustc code should be mostly about fixing problems in a rather contained way. (I don’t know for sure how contained it will be, but compiling core required surprisingly few changes as can be seen in rust-lang/rust#134740; disclaimer: I only tried running a very simple program with -Zbuild-std=core so far.) The changes will bring the rust llvm backend closer to how LLVM envisions address spaces, which should make it easier to support other future targets that use more address spaces (there was already some work for other targets, which in turn made it easier for amdgpu).
0 (generic/flat): This is a “catch all”. Loads and stores to addrspace(0) may go to any of the below address spaces. The hardware switches at runtime, depending on the pointer. This works for all pointers, but is the slowest.
1 (global): This is basically VRAM (i.e. the most “normal” memory).
3 (local/LDS): This is different memory in hardware, basically a software-defined cache. Loads and stores are a lot faster than to VRAM, but not globally visible to all other threads. (Known as groupshared in HLSL or shared in GLSL.)
4 (constant): Same memory region as global (1), but guaranteed to be constant throughout the program. This allows using different instructions that are faster by going through a different cache.
5 (private/scratch): This is used for the stack, all allocas need to be in addrspace(5). A thread can only access its own private memory.
Basic support for the amdgpu target means using address spaces 1 (incoming pointers), 5 (allocas) and 0 (if we don’t know which one it is). Support for groupshared memory in the language requires its own RFC (something similar to thread_local probably makes sense).
Many processors / target-cpus
Every generation of GPUs uses different machine code (to some extent). LLVM supports them as different “cpu”s or processors (-Ctarget-cpu= argument for rustc).
There are two challenges for this regarding Rust support
A single processor needs to be the default in the target description.
If Rust would distribute compiled libraries (e.g. core), it would need to be for all processors to be useful.
There is no obvious choice for the default processor. If some processor is the default and a user tries to use the amdgpu target without overwriting the target-cpu, it likely results in an unusable binary. We could use a non-existing “cpu” as the default, resulting in compiler errors, to make users aware of the need to set a target-cpu.
E.g "please specify -Ctarget-cpu" results in failing compilations and warnings:
'please specify -Ctarget-cpu' is not a recognized processor for this target (ignoring processor)
Alternatively, some choice can be made, like gfx900.
Regarding 2., there may be a generic backend for amdgpu in the future, relying on spir-v (the compute variant) which would solve both these issues, but that is a long way to go and unsure if it eventually happens. For the reason of binary size alone, it does not make sense for Rust to distribute pre-compiled code for the amdgpu backend. Users should instead specify their processor via -Ctarget-cpu= and compile core via -Zbuild-std=core or similar means.
A compiler team member or contributor who is knowledgeable in the area can second by writing @rustbot second.
Finding a "second" suffices for internal changes. If however, you are proposing a new public-facing feature, such as a -C flag, then full team check-off is required.
Compiler team members can initiate a check-off via @rfcbot fcp merge on either the MCP or the PR.
Once an MCP is seconded, the Final Comment Period begins. If no objections are raised after 10 days, the MCP is considered approved.
This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.
The text was updated successfully, but these errors were encountered:
This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.
Concerns or objections to the proposal should be discussed on Zulip and formally registered here by adding a comment with the following syntax:
@rfcbot concern reason-for-concern
<description of the concern>
Proposal
Add the
amdgpu
target to rustc that allows to generate code for AMD GPUs.The LLVM backend has good support for this backend. The goal is to expose this to Rust, enabling Rust as another language on these GPUs. The main target is compute capabilities. This is in contrast to the rust-gpu project, which targets graphics capabilities through spir-v (the graphics variant). The base runtime to run compute programs on AMD GPUs is HSA (Heterogeneous System Architecture), which is implemented in ROCR-Runtime. Therefore, the Rust backend should target the amdhsa OS. To support the target in rustc, LLVM needs to be compiled with the amdgpu backend enabled.
There are two points in which the amdgpu target is different from other (mainstream/x86) targets.
Address spaces
Address spaces can be thought of as denoting different physical memory areas (this is a thought concept, it can be that way in hardware, but it does not need to be). In LLVM IR, each pointer has an address space, defaulting to
addrspace(0)
(this is implicit in textual IR, which is why you won’t see it there). Different address spaces have different properties, e.g. they can have a different pointer size, the nullptr can be different (e.g.0
vs-1
) and the machine instructions used to access them can be different.The amdgpu LLVM backend makes heavy use of address spaces. This is also a problem for other targets that want to support Rust (though mostly more exotic ones). The use of address spaces leads to situations, where a pointer in one address space needs to be casted to a pointer in a different address space. In LLVM IR,
bitcast
is invalid for this case,addrspacecast
needs to be used.The changes to rustc code should be mostly about fixing problems in a rather contained way. (I don’t know for sure how contained it will be, but compiling
core
required surprisingly few changes as can be seen in rust-lang/rust#134740; disclaimer: I only tried running a very simple program with-Zbuild-std=core
so far.) The changes will bring the rust llvm backend closer to how LLVM envisions address spaces, which should make it easier to support other future targets that use more address spaces (there was already some work for other targets, which in turn made it easier for amdgpu).To get a feeling for what LLVM address spaces are used for, here is the list of the important amdgpu address spaces (from https://llvm.org/docs/AMDGPUUsage.html#address-spaces):
addrspace(0)
may go to any of the below address spaces. The hardware switches at runtime, depending on the pointer. This works for all pointers, but is the slowest.groupshared
in HLSL orshared
in GLSL.)alloca
s need to be inaddrspace(5)
. A thread can only access its own private memory.Basic support for the amdgpu target means using address spaces 1 (incoming pointers), 5 (allocas) and 0 (if we don’t know which one it is). Support for groupshared memory in the language requires its own RFC (something similar to
thread_local
probably makes sense).Many processors / target-cpus
Every generation of GPUs uses different machine code (to some extent). LLVM supports them as different “cpu”s or processors (
-Ctarget-cpu=
argument for rustc).There are two challenges for this regarding Rust support
core
), it would need to be for all processors to be useful.There is no obvious choice for the default processor. If some processor is the default and a user tries to use the amdgpu target without overwriting the
target-cpu
, it likely results in an unusable binary. We could use a non-existing “cpu” as the default, resulting in compiler errors, to make users aware of the need to set atarget-cpu
.E.g
"please specify -Ctarget-cpu"
results in failing compilations and warnings:Alternatively, some choice can be made, like
gfx900
.Regarding 2., there may be a generic backend for amdgpu in the future, relying on spir-v (the compute variant) which would solve both these issues, but that is a long way to go and unsure if it eventually happens. For the reason of binary size alone, it does not make sense for Rust to distribute pre-compiled code for the amdgpu backend. Users should instead specify their processor via
-Ctarget-cpu=
and compile core via-Zbuild-std=core
or similar means.The list of processors supported by LLVM is here: https://llvm.org/docs/AMDGPUUsage.html#processors
Related issues
Mentors or Reviewers
None yet, this is my first Rust contribution :)
Process
The main points of the Major Change Process are as follows:
@rustbot second
.-C flag
, then full team check-off is required.@rfcbot fcp merge
on either the MCP or the PR.You can read more about Major Change Proposals on forge.
Comments
This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.
The text was updated successfully, but these errors were encountered: