
Inference optimization #57

Open
ngoiyaeric opened this issue Dec 27, 2023 · 0 comments

Comments

@ngoiyaeric

Hey, your models aren't as fast as https://labs.perplexity.ai/. Have you used https://github.com/NVIDIA/TensorRT-LLM for your inference optimization?
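
For reference, a minimal sketch of what using TensorRT-LLM could look like via its high-level Python LLM API (this is an assumption about how it might be wired in, not the project's actual setup; the checkpoint name and sampling settings are placeholders):

```python
# Sketch only: compiling and serving a model with TensorRT-LLM's LLM API.
# The model name below is an illustrative placeholder, not this project's model.
from tensorrt_llm import LLM, SamplingParams

# Builds a TensorRT engine from the Hugging Face checkpoint on first use.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

sampling = SamplingParams(temperature=0.8, max_tokens=64)

# Generation runs on the compiled engine, which is where the speedup comes from.
outputs = llm.generate(["What is inference optimization?"], sampling)
for out in outputs:
    print(out.outputs[0].text)
```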
