With the new TensorRT-LLM 0.9.0, ModelConfig in tensorrt_llm.runtime.generation now has new args max_batch_size, max_beam_width - how do we set these? #1505
Unanswered · digitalmonkey asked this question in Q&A
The new version of tensorrt_llm introduced new arguments to ModelConfig: max_batch_size and max_beam_width.
How do we set these? Specifically, this change breaks the trt_llama_api.py script used to build the RAG demo on Windows. I'm trying to run that script on Ubuntu and want to stay current with tensorrt_llm's latest release.
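One way to keep a script like trt_llama_api.py working across both the old and the new signature is to inspect ModelConfig at runtime and pass the new arguments only when they are accepted. The sketch below is an assumption, not confirmed TensorRT-LLM API usage: the stand-in dataclasses only mimic the two signatures described above, and the field names and placeholder values (8, 1) are illustrative.

```python
import inspect
from dataclasses import dataclass


def build_model_config(model_config_cls, base_kwargs, max_batch_size, max_beam_width):
    """Construct a config, passing the new args only if the class accepts them.

    Works with both the pre-0.9.0 and the 0.9.0-style signature, so one
    script can run against either installed version.
    """
    params = inspect.signature(model_config_cls).parameters
    kwargs = dict(base_kwargs)
    if "max_batch_size" in params:
        kwargs["max_batch_size"] = max_batch_size
    if "max_beam_width" in params:
        kwargs["max_beam_width"] = max_beam_width
    return model_config_cls(**kwargs)


# Hypothetical stand-ins for the two ModelConfig signatures (for illustration only).
@dataclass
class OldConfig:  # pre-0.9.0: no max_batch_size / max_beam_width
    vocab_size: int


@dataclass
class NewConfig:  # 0.9.0-style: requires the two new arguments
    vocab_size: int
    max_batch_size: int
    max_beam_width: int


old = build_model_config(OldConfig, {"vocab_size": 32000}, max_batch_size=8, max_beam_width=1)
new = build_model_config(NewConfig, {"vocab_size": 32000}, max_batch_size=8, max_beam_width=1)
```

With this shim, the call site in the script stays the same regardless of which tensorrt_llm version is installed; only the helper decides which keyword arguments to forward.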