CLOC generates GCN instructions, HSAILasm cannot assemble them #15

Open
syifan opened this issue May 10, 2016 · 7 comments

Comments

@syifan

syifan commented May 10, 2016

When compiling a program with snackhsail.sh (I do not know when it changed to snackhsail; the old snack.sh does not work properly. Is AMD giving up on HSAIL?), I noticed that some generated instructions are not compatible with the HSA toolchain. I got the following errors:

>       gcn_min_f32     $s5, $s9, $s5;
>              ^
input(124,9): Undefined instruction

and

>       gcn_divrelaxed_ftz_f32  $s4, $s5, $s4;
>                         ^
input(353,20): Undefined instruction

How does gcn_min differ from the min instruction in HSAIL, and how does gcn_divrelaxed differ from the div instruction in HSAIL? Should CLOC compile to portable HSAIL, or does it only compile for AMD GCN devices? How can another vendor use CLOC, or even HSAIL, if it generates GCN-specific instructions?

@dpreobrazhensky

HSAIL Tools have been augmented with support for vendor-specific extensions. To compile the code in question, you have to specify the extension name and use the amd_gcn prefix instead of gcn. For example:

module &module:1:0:$full:$large:$default;
extension "amd:gcn";
function &TestFunc()()
{
    amd_gcn_min_f32     $s5, $s9, $s5;
};

GCN instructions differ from the standard HSAIL instructions in that they are not IEEE-compliant.
Regarding your other questions: I do not know.
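For HSAIL text that has already been generated with the old gcn_ names, the renaming described above can be sketched as a small text rewrite. This is only an illustration, not part of CLOC or HSAIL Tools: the function name `add_amd_prefix` is hypothetical, and placing the extension line directly after the `module` line is an assumption taken from the example above.

```python
import re

# Matches a bare "gcn_" opcode prefix, but not one that is already
# written as "amd_gcn_" (the preceding "_" blocks the match).
_GCN_OPCODE = re.compile(r"(?<![A-Za-z0-9_])gcn_(?=[a-z])")

# Extension directive as shown in the example above.
EXTENSION_LINE = 'extension "amd:gcn";'


def add_amd_prefix(hsail_text: str) -> str:
    """Rename gcn_* opcodes to amd_gcn_* and declare the extension.

    Hypothetical helper: it works on plain text and does not parse HSAIL,
    so a "gcn_" token inside a comment would be renamed as well.
    """
    out = []
    renamed = False
    for line in hsail_text.splitlines():
        new_line = _GCN_OPCODE.sub("amd_gcn_", line)
        renamed = renamed or (new_line != line)
        out.append(new_line)

    already_declared = any(l.strip() == EXTENSION_LINE for l in out)
    if renamed and not already_declared:
        # Assumption: the directive belongs right after the module line.
        for i, line in enumerate(out):
            if line.lstrip().startswith("module "):
                out.insert(i + 1, EXTENSION_LINE)
                break

    return "\n".join(out) + "\n"
```

Running the function on its own output leaves the text unchanged, so it is safe to apply to files that are already partly converted. It does not help when the old names are regenerated on every compile from OpenCL; in that case the code generator and the assembler versions have to match.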

@syifan
Author

syifan commented May 10, 2016

@dpreobrazhensky Thanks for your reply. How can I change that instruction name if I am compiling from OpenCL code?

@dpreobrazhensky

Do you have the latest toolchain? I'm not sure, but I believe it should have been updated accordingly. This feature (vendor-specific extensions) was committed half a year ago.

@syifan
Author

syifan commented May 10, 2016

@dpreobrazhensky We have been using CLOC for a very long time. We have only had this problem since the most recent version (after the apt-get package became available). I have known for a long time that gcn instructions are generated, but the compiled programs worked fine on Kaveri machines. After we updated CLOC, our programs can no longer be compiled.

@gregrodgers

Sun,
I am very sorry. This is my fault. The problem occurred when I switched cloc.sh to use a newer version of HSAILasm. Since cloc.sh uses the old HLC 3.2 code generator for the -hsail and -brig options, you do not get the amd prefix. I don't have a plan to move to a newer HLC for HSAIL generation. Most of my focus has been on the generation of HSA code objects, which uses a different front end and back end. I just promoted fixes to both the amdcloc package and the hlc3.2 package to fix this problem by going back to the old HSAILasm in HLC3.2. A few other fixes have gone into amdcloc1.0-11. The rocm package server will have these new packages very soon. Please "apt-get upgrade" when you see them. I will upload them to the git repo if it takes too long to get them through the package server.

@gregrodgers

ok, just the two debs you need are on the git repo in packages/ubuntu. Please use these only if you are generating hsail or brig. The code object path needs an update to amdllvm which is too big to put on the git repo. That will come later in the week from the package server.

@harald-lang

Hi Greg,

the latest CLOC version marks the -brig and -hsail options as '(soon to be deprecated)'. Is AMD giving up on HSAIL?
