Tuner stuck in 'dead lock' and never completes #546

Open
diverger opened this issue Jul 5, 2024 · 3 comments

@diverger
diverger commented Jul 5, 2024

Hi,
When running 'make alltuners' on a Mali GPU, some tuners run for hours. Eventually the process gets stuck and never returns. Is there any way to speed this up?

@CNugteren
Owner

There are a few ways.

First of all, you could modify the tuner's source file, e.g. CLBlast/src/tuning/kernels/xgemm.hpp, and reduce the number of values listed in settings.parameters in multiple places, e.g. change {16, 32, 64} into {16, 32}.

Secondly, you could change the --fraction command-line argument (of e.g. clblast_tuner_xgemm) to something below 1.0 so that only part of the search space is tested.

Thirdly, you could tune only for the precision you need, e.g. single-precision float (32) only, and skip the other tuners. make alltuners first compiles everything and then runs all the tuners (e.g. ./clblast_tuner_xgemm --precision 32) for all precisions one after another.

Lastly, for GEMM specifically there are 4 parts being tuned (from CLBlast/src/tuning/kernels/xgemm.cpp):

    printf("* (1/4) Tuning main GEMM kernel (GEMMK == 0) for fixed set of parameters\n\n");
    StartVariation<1>(argc, argv);
    printf("* (2/4) Tuning main GEMM kernel (GEMMK == 0) for random parameters out of larger set\n\n");
    StartVariation<2>(argc, argv);
    printf("* (3/4) Tuning secondary GEMM kernel (GEMMK == 1) for fixed set of parameters\n\n");
    StartVariation<11>(argc, argv);
    printf("* (4/4) Tuning secondary GEMM kernel (GEMMK == 1) for random parameters out of larger set\n\n");
    StartVariation<12>(argc, argv);

You could skip steps 2/4 and 4/4 to save time.

@diverger
Author

diverger commented Jul 12, 2024

Can I achieve this by modifying CMakeLists.txt?

@CNugteren
Owner

No, I don't think so.
