What would you like to be added?
In GPTCache/gptcache/adapter/adapter.py, after the vector database is searched, a for loop (line 379) calls get_scalar_data and the evaluation method for each result to compute its rank. However, some rerank models support batch inference, which would allow all results to be evaluated in a single call. Is there a way to perform the similarity evaluations as one batch instead of executing them serially? A rough sketch of what this could look like is included after the form below.
Why is this needed?
With batch inference the rerank model is called only once for all candidates instead of once per candidate, which improves performance.
Anything else?
No response
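A minimal sketch of what batched evaluation could look like, not the actual GPTCache implementation. It assumes a hypothetical `evaluation_batch(src_dict, cache_dicts)` method on the similarity-evaluation object that scores all candidates in one model call; the exact fields passed in `cache_dicts` are illustrative only, and `get_scalar_data` is still called per item to load the cached payloads.

```python
def rank_candidates(chat_cache, src_dict, search_data_list, session=None):
    """Sketch: score all searched candidates with one batched rerank call."""
    # Load scalar payloads first (still per item), skipping cache misses.
    candidates = []
    for search_data in search_data_list:
        cache_data = chat_cache.data_manager.get_scalar_data(search_data, session=session)
        if cache_data is not None:
            candidates.append((search_data, cache_data))

    if not candidates:
        return []

    # Illustrative candidate dicts; real field names depend on the evaluation backend.
    cache_dicts = [
        {"question": cache_data.question, "search_result": search_data}
        for search_data, cache_data in candidates
    ]

    # One batched model call instead of len(candidates) serial calls.
    # `evaluation_batch` is hypothetical and would need to be added to the
    # SimilarityEvaluation interface for rerank models that support batching.
    ranks = chat_cache.similarity_evaluation.evaluation_batch(src_dict, cache_dicts)
    return list(zip(ranks, candidates))
```

Evaluations that do not support batching could keep the current serial path as a fallback, so only backends implementing the batched method would change behavior.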
Laarryliu changed the title from "[Enhancement]: Batch evaluate similarit for all searched data from vector database" to "[Enhancement]: Batch evaluate similarity for all searched data from vector database" on Sep 3, 2024.