
don't check max_batch_size for cpu #298

Merged
merged 1 commit into from
Apr 23, 2024
Conversation

yufenglee
Member

No description provided.

@wangyems
Contributor

For CPU models, we don't expect users to call params.try_use_cuda_graph_with_max_batch_size(1). I think we can add a check in model-qa.py to conditionally call try_use_cuda_graph...().
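The conditional call suggested above might look like the sketch below. The helper name and the `provider` variable are illustrative assumptions, not part of the real model-qa.py script:

```python
# Hypothetical guard for model-qa.py: only request a CUDA graph when the
# model runs on the CUDA execution provider. The helper name and the
# "provider" string are illustrative, not the actual script's API.
def should_try_cuda_graph(provider: str) -> bool:
    return provider.lower() == "cuda"

# Sketch of the conditional call (params is an onnxruntime-genai
# GeneratorParams instance in the real script):
# if should_try_cuda_graph(provider):
#     params.try_use_cuda_graph_with_max_batch_size(1)
```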

@yufenglee
Member Author

> For CPU models, we don't expect users to call params.try_use_cuda_graph_with_max_batch_size(1). I think we can add a check in model-qa.py to conditionally call try_use_cuda_graph...().

If the user calls it, it should not crash. I think we can add a warning later instead of quitting.

@baijumeswani
Collaborator

> If the user calls it, it should not crash. I think we can add a warning later instead of quitting.

It would still crash (throw an exception), since you're checking whether the cuda graph is enabled in the CPU EP, no?

@yufenglee
Member Author

> It would still crash (throw an exception), since you're checking whether the cuda graph is enabled in the CPU EP, no?

The PR prevents the quit when the user calls params.try_use_cuda_graph_with_max_batch_size(1). It will still crash if the user enables cuda_graph for CPU.

@baijumeswani
Collaborator

How is that any different from what it was earlier? It appears that max_batch_size_ is only set when try_use_cuda_graph_with_max_batch_size is called.

@yufenglee
Member Author

> How is that any different from what it was earlier? It appears that max_batch_size_ is only set when try_use_cuda_graph_with_max_batch_size is called.

With this PR, it will not crash if the user calls try_use_cuda_graph_with_max_batch_size on a CPU device.
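A minimal self-contained mock of the behavior change being discussed (the class and check below are illustrative stand-ins, not the real onnxruntime-genai implementation): the max_batch_size check runs only on the CUDA path, so a CPU model that called the setter no longer quits.

```python
# Illustrative mock of the check discussed in this PR; not the real
# onnxruntime-genai code.
class MockGeneratorParams:
    def __init__(self, device: str):
        self.device = device          # "cpu" or "cuda"
        self.max_batch_size = None    # set only via the setter below

    def try_use_cuda_graph_with_max_batch_size(self, n: int):
        # After the PR, calling this on a CPU model is a harmless
        # request rather than a fatal error.
        self.max_batch_size = n

    def validate(self):
        # The max_batch_size check is applied only on the CUDA path,
        # mirroring the thread: CPU no longer crashes, CUDA still can.
        if self.device == "cuda" and self.max_batch_size is None:
            raise ValueError("max_batch_size is required to use a CUDA graph")

# CPU model: calling the setter and then validating does not raise.
cpu = MockGeneratorParams("cpu")
cpu.try_use_cuda_graph_with_max_batch_size(1)
cpu.validate()
```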

@yufenglee yufenglee merged commit b1180a6 into main Apr 23, 2024
11 of 12 checks passed
@yufenglee yufenglee deleted the yufeng/cuda_graph branch April 23, 2024 17:39
@baijumeswani
Collaborator

Sorry, I didn't check the definition of IsCudaGraphEnabled. PR looks good.
