
don't check max_batch_size for cpu #298

Merged
merged 1 commit into from
Apr 23, 2024
Conversation

yufenglee
Member

No description provided.

@wangyems
Contributor

For CPU models, we don't expect users to call params.try_use_cuda_graph_with_max_batch_size(1). I think we can add a check in model-qa.py to conditionally call try_use_cuda_graph...().
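The conditional call suggested above might look like the sketch below. The helper name and the `provider` variable are illustrative assumptions, not part of the real model-qa.py script:

```python
# Hypothetical guard for model-qa.py: only request a CUDA graph when the
# model runs on the CUDA execution provider. The helper name and the
# "provider" string are illustrative, not the actual script's API.
def should_try_cuda_graph(provider: str) -> bool:
    return provider.lower() == "cuda"

# Sketch of the conditional call (params is an onnxruntime-genai
# GeneratorParams instance in the real script):
# if should_try_cuda_graph(provider):
#     params.try_use_cuda_graph_with_max_batch_size(1)
```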

@yufenglee
Member Author

> For CPU models, we don't expect users to call params.try_use_cuda_graph_with_max_batch_size(1). I think we can add a check in model-qa.py to conditionally call try_use_cuda_graph...().

If the user calls it, it should not crash. I think we can add a warning later instead of quitting.

@baijumeswani
Collaborator

> If the user calls it, it should not crash. I think we can add a warning later instead of quitting.

It would still crash (throw an exception), since you're checking whether the cuda graph is enabled in the CPU EP, no?

@yufenglee
Member Author

> It would still crash (throw an exception), since you're checking whether the cuda graph is enabled in the CPU EP, no?

The PR prevents the quit when the user calls params.try_use_cuda_graph_with_max_batch_size(1). It will still crash if the user enables cuda_graph for CPU.

@baijumeswani
Collaborator

How is that any different from what it was earlier? It appears that max_batch_size_ is only set when try_use_cuda_graph_with_max_batch_size is called.

@yufenglee
Member Author

> How is that any different from what it was earlier? It appears that max_batch_size_ is only set when try_use_cuda_graph_with_max_batch_size is called.

With this PR, it will not crash if the user calls try_use_cuda_graph_with_max_batch_size on a CPU device.
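A minimal self-contained mock of the behavior change being discussed (the class and check below are illustrative stand-ins, not the real onnxruntime-genai implementation): the max_batch_size check runs only on the CUDA path, so a CPU model that called the setter no longer quits.

```python
# Illustrative mock of the check discussed in this PR; not the real
# onnxruntime-genai code.
class MockGeneratorParams:
    def __init__(self, device: str):
        self.device = device          # "cpu" or "cuda"
        self.max_batch_size = None    # set only via the setter below

    def try_use_cuda_graph_with_max_batch_size(self, n: int):
        # After the PR, calling this on a CPU model is a harmless
        # request rather than a fatal error.
        self.max_batch_size = n

    def validate(self):
        # The max_batch_size check is applied only on the CUDA path,
        # mirroring the thread: CPU no longer crashes, CUDA still can.
        if self.device == "cuda" and self.max_batch_size is None:
            raise ValueError("max_batch_size is required to use a CUDA graph")

# CPU model: calling the setter and then validating does not raise.
cpu = MockGeneratorParams("cpu")
cpu.try_use_cuda_graph_with_max_batch_size(1)
cpu.validate()
```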

@yufenglee yufenglee merged commit b1180a6 into main Apr 23, 2024
11 of 12 checks passed
@yufenglee yufenglee deleted the yufeng/cuda_graph branch April 23, 2024 17:39
@baijumeswani
Collaborator

Sorry, I didn't check the definition of IsCudaGraphEnabled. PR looks good.
