No perfcounters in the generated csv #13

Open
axeldavy opened this issue Jun 26, 2018 · 4 comments


@axeldavy

Hi,

I use the legacy opencl driver.
As CodeXL 2.5 features an old version of rcprof that hangs with the latest legacy opencl driver, I've followed the instructions here to compile the latest version of rcprof.

This worked out nicely, except that the debug info doesn't include any perfcounters.
My card is an RX480.

rcprof -l gives me:
`The list of valid counters for Graphics IP v6 based graphics cards:
Wavefronts, VALUInsts, SALUInsts, VFetchInsts, SFetchInsts,
VWriteInsts, LDSInsts, GDSInsts, VALUUtilization, VALUBusy,
SALUBusy, FetchSize, WriteSize, CacheHit, MemUnitBusy,
MemUnitStalled, WriteUnitStalled, LDSBankConflict

The list of valid counters for Graphics IP v7 based graphics cards:
Wavefronts, VALUInsts, SALUInsts, VFetchInsts, SFetchInsts,
VWriteInsts, FlatVMemInsts, LDSInsts, FlatLDSInsts, GDSInsts,
VALUUtilization, VALUBusy, SALUBusy, FetchSize, WriteSize,
CacheHit, MemUnitBusy, MemUnitStalled, WriteUnitStalled, LDSBankConflict

The list of valid counters for Graphics IP v8 based graphics cards:
Wavefronts, VALUInsts, SALUInsts, VFetchInsts, SFetchInsts,
VWriteInsts, FlatVMemInsts, LDSInsts, FlatLDSInsts, GDSInsts,
VALUUtilization, VALUBusy, SALUBusy, FetchSize, WriteSize,
CacheHit, MemUnitBusy, MemUnitStalled, WriteUnitStalled, LDSBankConflict

The list of valid counters for Vega based graphics cards:
Wavefronts, VALUInsts, SALUInsts, VFetchInsts, SFetchInsts,
VWriteInsts, FlatVMemInsts, LDSInsts, FlatLDSInsts, GDSInsts,
VALUUtilization, VALUBusy, SALUBusy, FetchSize, WriteSize,
L1CacheHit, L2CacheHit, MemUnitBusy, MemUnitStalled, WriteUnitStalled,
LDSBankConflict`

However, rcprof --listactive doesn't return anything.

I found out I could fix CodeXL 2.5 by just replacing libRCPCLProfileAgent, but then the perf counters are missing as well. I guess this indicates the issue is related to this library.

@axeldavy
Author

Using old driver 17.50 (18.10 won't do), the perfcounters appear. I hope this info helps.

@chesik-amd
Collaborator

If you get a chance, can you please try this with RCP v5.6, which was released last week, and let me know if this issue still reproduces? A few issues with using legacy OpenCL in amdgpu-pro were fixed.

Thanks

@axeldavy
Author

I have issues building it. I had the same issues with previous builds, but I don't remember how to fix them.
First I have to remove all -Werror (new gccs add new warnings all the time).
Second sprofile fails to build with the error
ParseCmdLine.cpp:(.text.startup+0x2): undefined reference to `boost::system::generic_category()'
/usr/bin/ld: ParseCmdLine.cpp:(.text.startup+0x7): undefined reference to `boost::system::generic_category()'
/usr/bin/ld: ParseCmdLine.cpp:(.text.startup+0xc): undefined reference to `boost::system::system_category()'
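
For the first step (dropping -Werror), here is a minimal Python sketch of one way to strip the flag from makefile text instead of editing each file by hand. The function name `strip_werror` is made up for illustration, and the sketch assumes the flag appears as a plain whitespace-separated token; it is not part of the RCP build system.

```python
import re

# Matches a standalone -Werror or -Werror=<warning> token plus trailing blanks.
# Tokens such as -Wno-error are deliberately left alone.
_WERROR = re.compile(r"(?<!\S)-Werror(?:=\S+)?(?!\S)[ \t]*")

def strip_werror(makefile_text):
    """Return makefile_text with every standalone -Werror flag removed."""
    return _WERROR.sub("", makefile_text)

if __name__ == "__main__":
    sample = "CFLAGS = -Wall -Werror -O2\nCXXFLAGS += -Werror=format-security -g\n"
    print(strip_werror(sample), end="")
```

To apply it to a real tree, one would read each makefile, pass its contents through `strip_werror`, and write the result back, ideally after checking the diff.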

@biergaizi

biergaizi commented Sep 8, 2023

I have issues building it. I had the same issues with previous builds, but I don't remember how to fix them.
First I have to remove all -Werror (new gccs add new warnings all the time).
Second sprofile fails to build with the error:

ParseCmdLine.cpp:(.text.startup+0x2): undefined reference to `boost::system::generic_category()'
/usr/bin/ld: ParseCmdLine.cpp:(.text.startup+0x7): undefined reference to `boost::system::generic_category()'
/usr/bin/ld: ParseCmdLine.cpp:(.text.startup+0xc): undefined reference to `boost::system::system_category()'

I encountered the same problem. This software package no longer appears to be maintained: there are many outdated identifiers, missing library includes, and other problems. Most of these are relatively easy to fix, but the final Boost undefined-reference link error completely puzzled me, and it took me several hours to find the answer. It turned out to be a C++ ABI compatibility problem: rcprof was being compiled against the bundled Boost 1.59 headers, but at link time I was using the system Boost library binaries. Mixing headers and binaries from different Boost versions is not guaranteed to work. There are two ways to fix it.

  1. The proper way. Change BOOST_DIR = $(COMMON_LIB_EXT)/Boost/boost_1_59_0 to point at a Boost source tree whose version matches your system's Boost, for example BOOST_DIR = /home/user/boost_1_83_0/. If your system doesn't provide static .a files for Boost, compile Boost from source and set BOOST_LIB_DIR to the build output of that same Boost version, such as BOOST_LIB_DIR = /home/user/boost_1_83_0/stage/lib. Otherwise, there will be undefined references to boost::filesystem::path_traits.

  2. The coincidence. My experiment showed that if Boost 1.76 is built with --std=c++11, linking against it works even though the source is compiled with the Boost 1.59 headers. But this is entirely a coincidence: if Boost is built with a higher C++ standard, there will be undefined references to boost::system::generic_category().
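
A quick way to spot the header/binary mismatch described above is to compare the version encoded in the headers with the version suffix on the shared libraries. The sketch below is an illustration only: the helper names are invented, it relies on the `BOOST_VERSION` macro from `boost/version.hpp` (major * 100000 + minor * 100 + patch), and it assumes Linux-style shared-library names such as `libboost_system.so.1.83.0`. Static `.a` archives usually carry no version in their name, so they are ignored here, and a difference in the C++ standard used to build Boost would not be detected at all.

```python
import re

def header_version(version_hpp_text):
    """Decode (major, minor, patch) from the BOOST_VERSION macro."""
    m = re.search(r"^\s*#\s*define\s+BOOST_VERSION\s+(\d+)",
                  version_hpp_text, re.MULTILINE)
    if m is None:
        raise ValueError("BOOST_VERSION not found")
    v = int(m.group(1))
    return (v // 100000, v // 100 % 1000, v % 100)

def library_versions(filenames):
    """Collect versions from names like libboost_system.so.1.83.0."""
    found = set()
    for name in filenames:
        m = re.match(r"libboost_\w+\.so\.(\d+)\.(\d+)\.(\d+)$", name)
        if m:
            found.add(tuple(int(g) for g in m.groups()))
    return found

def mismatches(version_hpp_text, filenames):
    """Return library versions that differ from the header version."""
    hdr = header_version(version_hpp_text)
    return sorted(v for v in library_versions(filenames) if v != hdr)

if __name__ == "__main__":
    headers = "#define BOOST_VERSION 105900\n"           # bundled 1.59 headers
    libs = ["libboost_system.so.1.83.0", "libboost_filesystem.so.1.83.0"]
    print("header:", header_version(headers))
    print("mismatched libraries:", mismatches(headers, libs))
```

In practice the header text would come from `boost/version.hpp` under BOOST_DIR, and the file names from a listing of BOOST_LIB_DIR or the system library directory.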

Unfortunately, upon further investigation, even the "fixed" build still has problems: only HSA profiling works, not OpenCL or occupancy profiling. It turns out that rcprof is just a wrapper around various underlying modules, such as HSAFdnTrace, HSAFdnPMC, CLOccupancyAgent, CLProfileAgent, and CLTraceAgent. All rcprof does is register these modules and write a configuration file before the user application runs; it is up to the modules themselves to do the actual work. This is why rcprof only reports "Failed to generate profile result" at the very end: it just reads an output file and has no idea how the modules work internally. In this case, they are never executed.
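
The wrapper behaviour described above can be pictured with a small Python sketch. This is not the profiler's actual code: the file names, the `DEMO_AGENT_CONFIG` variable, and the `run_profiled` helper are all hypothetical. It only shows why a front end that writes a configuration, launches the application, and then looks for an output file cannot say more than "Failed to generate profile result" when the module that should write the file never runs.

```python
import os
import subprocess
import sys
import tempfile

def run_profiled(app_cmd, workdir):
    """Hypothetical launcher: write a config, run the app, read the result file."""
    config_path = os.path.join(workdir, "agent.conf")    # hypothetical name
    output_path = os.path.join(workdir, "profile.csv")   # hypothetical name
    with open(config_path, "w") as f:
        f.write("output=" + output_path + "\n")
    env = dict(os.environ, DEMO_AGENT_CONFIG=config_path)  # hypothetical variable
    subprocess.run(app_cmd, env=env, check=False)
    # The launcher only sees whether the file exists, not why it might be missing.
    if not os.path.exists(output_path):
        return "Failed to generate profile result"
    with open(output_path) as f:
        return f.read()

# A stand-in application in which the "agent" does get loaded and writes output.
FAKE_AGENT_APP = (
    "import os\n"
    "with open(os.environ['DEMO_AGENT_CONFIG']) as f:\n"
    "    cfg = f.read().strip()\n"
    "with open(cfg.split('=', 1)[1], 'w') as out:\n"
    "    out.write('Wavefronts\\n')\n"
)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Agent never runs: the launcher can only report a generic failure.
        print(run_profiled([sys.executable, "-c", "pass"], d))
    with tempfile.TemporaryDirectory() as d:
        # Agent runs and writes the file the launcher is waiting for.
        print(run_profiled([sys.executable, "-c", FAKE_AGENT_APP], d), end="")
```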

HSA and OpenCL use different loading mechanisms. For OpenCL and occupancy profiling, a method similar to LD_PRELOAD is used, replacing the OpenCL functions with instrumented versions. However, for unknown reasons (possibly an incompatible SDK change), the entry point is no longer invoked. As a result, CLOccupancyAgent, CLProfileAgent, and CLTraceAgent are never invoked anywhere: their entry point, cl_int CL_CALLBACK clAgent_OnLoad(cl_agent* agent), never executes.
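
As a rough analogy for the interposition described above, the following Python sketch models a runtime's table of API entry points and an agent whose load hook swaps each entry for an instrumented wrapper. Everything here is hypothetical (`make_dispatch_table`, `agent_on_load`, `enqueue_kernel`); it is not the OpenCL agent API. It is only meant to show that if the load hook is never called, the application still runs normally but nothing is recorded.

```python
import time

def make_dispatch_table():
    """Stand-in for a runtime's table of API entry points (hypothetical)."""
    def enqueue_kernel(name):
        return "ran " + name
    return {"enqueue_kernel": enqueue_kernel}

def agent_on_load(table, records):
    """Stand-in for an agent entry point: wrap every entry with a timing wrapper."""
    for api_name, original in list(table.items()):
        # Default arguments bind the current original/name for each wrapper.
        def instrumented(*args, _orig=original, _name=api_name):
            start = time.perf_counter()
            result = _orig(*args)
            records.append((_name, time.perf_counter() - start))
            return result
        table[api_name] = instrumented

def run_app(table):
    """Stand-in for the user application calling through the table."""
    return table["enqueue_kernel"]("vector_add")

if __name__ == "__main__":
    records = []
    table = make_dispatch_table()
    run_app(table)                  # load hook never called: nothing recorded
    print(len(records))
    agent_on_load(table, records)   # load hook called: calls are now recorded
    run_app(table)
    print([name for name, _ in records])
```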

Without someone who has a deeper understanding of how the library preloading actually works, I'm afraid this problem cannot be fixed. If anyone is interested and wishes to continue the investigation, reply to this thread and I can send a list of patches to replicate my build.

3 participants