The time run with CPU are much faster than GPU? #5849

Closed
zengjie617789 opened this issue Dec 25, 2024 · 6 comments

@zengjie617789

Detail

I ported the benchmark code to a mobile phone and got results with my own model, but the CPU turns out to be much faster than the GPU. I don't know whether the GPU path actually ran successfully, because of the output below:
image

cpu cost:
image

gpu cost:
image

@wzyforgit
Contributor

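1. Some operators are not implemented on the GPU, so they fall back to the CPU, which adds several extra copy operations.
2. The GPU is too weak and the CPU is too strong.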
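
The first point can be made concrete with a toy model. The sketch below is not ncnn code: `count_transfers` and the `has_gpu_impl` flags are hypothetical, and it only assumes that a layer without a GPU implementation runs on the CPU and that data must be copied whenever two consecutive layers run on different devices.

```cpp
#include <cstddef>
#include <vector>

// Toy model, not ncnn code: a layer runs on the GPU only if a GPU
// implementation exists; otherwise it falls back to the CPU.
enum class Device { CPU, GPU };

// Count how many times data crosses the CPU/GPU boundary when the layers
// run in order. Assumption: the input starts on the CPU and the final
// result is read back on the CPU.
inline std::size_t count_transfers(const std::vector<bool>& has_gpu_impl)
{
    Device current = Device::CPU;
    std::size_t transfers = 0;
    for (bool gpu_ok : has_gpu_impl)
    {
        const Device target = gpu_ok ? Device::GPU : Device::CPU;
        if (target != current)
        {
            ++transfers; // upload or download between two layers
            current = target;
        }
    }
    if (current != Device::CPU)
        ++transfers; // final download of the result
    return transfers;
}
```

In this model a fully GPU-supported three-layer network needs two copies (one upload, one download), while a single unsupported layer in the middle doubles that to four. On a phone, those extra copies can plausibly outweigh whatever the GPU saves on the remaining layers.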

@zengjie617789
Author

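Thanks for the quick reply. A couple of follow-up questions:

  1. How can I locate the operators that are not implemented? I did not see any errors at runtime.
  2. This one can be ruled out: a comparable example, ncnn_android_squeezenet, shows a clear speed difference between the two modes. Why are the results so different for the same squeezenet? Does that mean the model in ncnn_android_squeezenet has been operator-optimized? The model it ships is a binary file, so I cannot study it any further.
    Thanks for any help.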

@wzyforgit
Contributor

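1. There are two ways: print the time each operator takes at runtime, or go through the source files one by one.
2. There is a visualization tool for models: https://netron.app/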

@zengjie617789
Author

Thanks. Two more questions:

  1. How do I "print the time each operator takes at runtime"? During inference I only fetch the result:
ex.input(0, input_mat);
ncnn::Mat out;
ex.extract(499, out);

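There is no way to print the time for each operator from here, is there?
2. Netron can show a model's structure, but the squeezenet params file in the example is binary, so its structure cannot be previewed. Is there any other way?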
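
For the first question, one rough option from the calling side is to time several `ex.extract(...)` calls on intermediate blobs and compare the numbers. The helper below is a minimal sketch in plain C++ using `std::chrono`; `time_ms`, `StageTiming` and `time_stages` are hypothetical names, not part of ncnn, and the stage callables stand in for real ncnn calls. As far as I know, ncnn's CMake build also offers an `NCNN_BENCHMARK` option that prints per-layer timing from inside the library.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of ncnn: run `fn` once and return the
// elapsed wall-clock time in milliseconds.
inline double time_ms(const std::function<void()>& fn)
{
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

struct StageTiming
{
    std::string name; // label chosen by the caller, e.g. a blob name
    double ms;        // elapsed time for this stage
};

// Run each named stage in order and record how long it took.
inline std::vector<StageTiming> time_stages(
    const std::vector<std::pair<std::string, std::function<void()>>>& stages)
{
    std::vector<StageTiming> results;
    results.reserve(stages.size());
    for (const auto& stage : stages)
        results.push_back({stage.first, time_ms(stage.second)});
    return results;
}
```

With ncnn, each stage would be a lambda that calls `ex.extract(...)` for one intermediate blob. The timings are only a rough guide: the first run typically includes one-time setup, and on the GPU path an extract into an `ncnn::Mat` presumably also includes copying the blob back to the CPU.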

@wzyforgit
Contributor

image

image

@zengjie617789
Author

Thanks.
