YOLOS model extremely slow #533
Comments
Can you try using the unquantized version? You can do this by specifying: const pipe = await pipeline('task', 'model', { quantized: false });
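Concretely, for the model in this issue that would look something like the following (the `Xenova/yolos-tiny` model ID is taken from the conversion mentioned in the description below):

```js
import { pipeline } from '@xenova/transformers';

// Unquantized object-detection pipeline for YOLOS-tiny.
const pipe = await pipeline('object-detection', 'Xenova/yolos-tiny', { quantized: false });
```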
It's slightly faster. Maybe it's the image encoding step; I will try to measure each step.
I tried this:
and that gives:
Here's the full session recorded with Firefox's profiler: https://share.firefox.dev/497JdCi. The function that is slow is
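For anyone who wants to reproduce the per-step measurement, a minimal sketch might look like the following. This is an assumption-laden example, not the snippet measured above: it assumes transformers.js's `AutoProcessor` / `AutoModelForObjectDetection` classes and a hypothetical input image path.

```js
import { AutoProcessor, AutoModelForObjectDetection, RawImage } from '@xenova/transformers';

const processor = await AutoProcessor.from_pretrained('Xenova/yolos-tiny');
const model = await AutoModelForObjectDetection.from_pretrained('Xenova/yolos-tiny');

const image = await RawImage.read('image.jpg'); // hypothetical test image

// Time the preprocessing step (resize + normalization into pixel_values).
let t = performance.now();
const inputs = await processor(image);
console.log(`preprocess: ${(performance.now() - t).toFixed(1)} ms`);

// Time the ONNX Runtime forward pass separately.
t = performance.now();
const outputs = await model(inputs);
console.log(`forward: ${(performance.now() - t).toFixed(1)} ms`);
```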
System Info
latest wasm version
Environment/Platform
Description
I am trying to run https://huggingface.co/hustvl/yolos-tiny using a quantized version (similar to Xenova/yolos-tiny). It works with the object-detection pipeline, but it is extremely slow: inferring a single image with this model in transformers.js takes around 15 seconds on my M1, while the same model in Python transformers takes 190 ms.
I tried the browser dev tools, and the culprit is in the ONNX runtime at wasm-function[10863] @ ort-wasm-simd.wasm:0x801bfa, but I don't have the debug symbols, so it's kind of useless...
Is there a way to force transformers.js to run with a debug version of the ort runtime?
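I'm not aware of published debug builds of ort-web, but transformers.js does let you override where the onnxruntime-web .wasm binaries are loaded from, so a locally built debug ort-wasm.wasm could be swapped in. A sketch, assuming you have built the debug artifacts yourself and serve them from a hypothetical `/dist/debug/` path:

```js
import { env } from '@xenova/transformers';

// Point onnxruntime-web at a directory containing your own (e.g. debug) .wasm builds.
env.backends.onnx.wasm.wasmPaths = '/dist/debug/';
```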
Reproduction
Run the object detection demo at https://xenova.github.io/transformers.js/ and swap the detr-resnet model with yolos-tiny; a programmatic sketch follows below.
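Equivalently, the comparison can be reproduced outside the demo. This is a sketch: the DETR model ID and the image path are assumptions, not taken from the demo's source.

```js
import { pipeline } from '@xenova/transformers';

const url = 'image.jpg'; // hypothetical test image

// Baseline: a DETR-ResNet model like the one used by the demo.
const detr = await pipeline('object-detection', 'Xenova/detr-resnet-50');
console.time('detr');
await detr(url);
console.timeEnd('detr');

// Swap in YOLOS-tiny: this is the call that takes ~15 s here.
const yolos = await pipeline('object-detection', 'Xenova/yolos-tiny');
console.time('yolos');
await yolos(url);
console.timeEnd('yolos');
```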