TensorRT-LLM 0.5.0 Releases #55
juney-nvidia announced in Announcements
Hi TensorRT-LLM users,
We are very pleased to announce the first public release of TensorRT-LLM. Creating this project has been an intense effort, and we hope it will enable you to easily deploy GPU-based inference for state-of-the-art LLMs and to run those LLMs very fast.
Currently, there are two key branches in the project:
The release/0.5.0 branch contains what we'd call the stable code of the first release of TensorRT-LLM (version 0.5.0). It has been QA-ed and carefully tested.
The main branch contains what we'd call the dev code. It is more experimental.
We plan to update the main branch regularly with new features, bug fixes and performance optimizations. The stable branch(es) will be updated less frequently. The exact frequencies will depend on your feedback.
Thanks,
The TensorRT-LLM Engineering Team