Guardrails is very slow. How to improve speed ? #587

NirnayK · 2024-06-28T09:37:32Z

Everything here was tested against Llama 3 8B Instruct served with vLLM on an A100 80GB.
I was running Guardrails on a C3 machine with 8 CPUs and 16 GB of RAM.
I have run some tests, and here is what I found:
The base model generally responds in hundreds of milliseconds (even when I supply the vector DB data myself and ask it in a query).
NeMo Guardrails (bare bones: no KB, no Colang, nothing else) takes around 3.5 s.
NeMo Guardrails (Qdrant vector DB, no Colang, no input, output, or dialog rails) takes around 10 to 11 s.

Now my issue is: why does it take so much time to answer a query with a vector DB? The total latency of all calls (LLM, vector DB) shouldn't be more than 500 ms, so there shouldn't be such a difference.

Is there something that I am missing?
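
To make the comparison above concrete, here is a minimal sketch of how the per-call latencies and the end-to-end latency could be measured side by side. It uses only the Python standard library. `StageTimer`, `fake_vector_search`, `fake_llm_call`, and `handle_request` are hypothetical names made up for illustration, and the `time.sleep` calls are stand-ins for the real vector DB query, LLM call, and framework overhead. Nothing here is NeMo Guardrails or Qdrant API.

```python
import time
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage."""

    def __init__(self):
        self.durations = {}

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.durations[name] = self.durations.get(name, 0.0) + elapsed


# Hypothetical stand-ins: replace with the real vector DB query and LLM call.
def fake_vector_search(query):
    time.sleep(0.02)
    return ["retrieved chunk"]


def fake_llm_call(prompt):
    time.sleep(0.05)
    return "answer"


def handle_request(query, timer):
    with timer.stage("vector_db"):
        chunks = fake_vector_search(query)
    with timer.stage("llm"):
        answer = fake_llm_call(query + "\n" + "\n".join(chunks))
    # Stand-in for any work that is neither an LLM nor a vector DB call.
    time.sleep(0.03)
    return answer


timer = StageTimer()
start = time.perf_counter()
answer = handle_request("example question", timer)
end_to_end = time.perf_counter() - start

# Time spent outside the measured calls is the gap in question.
accounted = sum(timer.durations.values())
unaccounted = end_to_end - accounted

for name, seconds in timer.durations.items():
    print(f"{name}: {seconds * 1000:.0f} ms")
print(f"end to end: {end_to_end * 1000:.0f} ms")
print(f"unaccounted: {unaccounted * 1000:.0f} ms")
```

If the "unaccounted" figure dominates in a real run, the extra seconds are being spent somewhere other than the individual LLM and vector DB calls, for example in additional calls that were not wrapped in a stage.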
