I'm performing inference in C++ using OpenVINO 2023.3. I currently have an f32-precision model, and I compile it with f32 inference precision and ExecutionMode::PERFORMANCE. Using my GPU, I see a good performance boost, roughly a 60% runtime reduction over CPU.
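For reference, the GPU configuration looks roughly like this (a sketch, not my exact code; `core_` is my `ov::Core` instance and `"model.xml"` is a placeholder path):

```cpp
#include <openvino/openvino.hpp>

// Roughly my current f32 / PERFORMANCE configuration on GPU.
// "model.xml" is a placeholder; core_ mirrors the snippet further down.
std::shared_ptr<ov::Model> model = core_.read_model("model.xml");
ov::CompiledModel compiled_model = core_.compile_model(model, "GPU",
    ov::hint::execution_mode(ov::hint::ExecutionMode::PERFORMANCE),
    ov::hint::inference_precision(ov::element::f32));
```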
I'd like to further optimize runtime, so I've produced a comparable model using f16 precision. I've made three observations:
Am I implementing something wrong?
In my code, I'm adjusting these settings:
```cpp
compiled_model = core_.compile_model(model, "CPU",
    ov::hint::execution_mode(ov::hint::ExecutionMode::ACCURACY),
    ov::hint::inference_precision(ov::element::f32));
```
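For the f16 runs, I've been assuming the only change needed is the precision hint (plus PERFORMANCE mode and the GPU device, as above). A sketch of that assumption:

```cpp
// Assumed f16 variant: same call, but request f16 inference precision
// and the speed-oriented execution mode on GPU.
compiled_model = core_.compile_model(model, "GPU",
    ov::hint::execution_mode(ov::hint::ExecutionMode::PERFORMANCE),
    ov::hint::inference_precision(ov::element::f16));
```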
Thank you!