
Tool learning for LLM #4

Open
QuangTQV opened this issue Aug 1, 2024 · 5 comments

QuangTQV commented Aug 1, 2024

I am currently working on a tool-reranking problem (retrieving the appropriate tool for an LLM), but my cross-encoder models are not converging.
Here is an example:
query: give me btc price
tool: get token price
Is your model feasible for this task?
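
For concreteness, this is roughly the scoring setup I mean: a minimal sketch using sentence-transformers and a public MS MARCO cross-encoder checkpoint as stand-ins for my actual model and tool set.

```python
# Minimal sketch: scoring a query against tool descriptions with an
# off-the-shelf cross-encoder. The checkpoint and tool descriptions are
# illustrative stand-ins, not my real setup.
from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "give me btc price"
tools = [
    "get token price: returns the current price of a crypto token",
    "play music: plays a song by name",
    "summarize news: summarizes today's news",
]

# One (query, tool description) pair per tool; a higher score means more relevant.
scores = model.predict([(query, tool) for tool in tools])
for tool, score in sorted(zip(tools, scores), key=lambda x: x[1], reverse=True):
    print(f"{score:.3f}  {tool}")
```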

fschlatt (Member) commented Aug 2, 2024

Could you provide a few extra details?

  1. What exactly is a tool?
  2. When you say a cross-encoder is not converging, do you mean you are fine-tuning a cross-encoder on your dataset and it's not learning correctly?

QuangTQV (Author) commented Aug 2, 2024

> Could you provide a few extra details?
>
> 1. What exactly is a tool?
> 2. When you say a cross-encoder is not converging, do you mean you are fine-tuning a cross-encoder on your dataset and it's not learning correctly?

Current LLMs are being directed towards building agent systems. An agent is a system built on an LLM, where descriptions of the available tools are provided in the prompt so the LLM can invoke them.
For example:
prompt = """You are my assistant. You are allowed to use the following tools to complete tasks:
Tool 1: name: Play Music, description: Used to play a song based on its name.
Tool 2: name: Summarize News, description: Summarizes today's news.
...
Tool n"""
When the number n increases significantly, including all tool descriptions in the prompt for the LLM leads to context explosion, reduces accuracy in tool invocation, and incurs substantial costs. Therefore, it is necessary to filter out redundant tools before inputting them into the LLM.
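
To illustrate the filtering step, here is a rough sketch of selecting the top-k tools for a query with a bi-encoder before building the prompt (the checkpoint and tool list are illustrative stand-ins, not my real setup):

```python
# Rough sketch: pre-filter tools with a bi-encoder before prompting the LLM.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

tools = [
    "Play Music: used to play a song based on its name.",
    "Summarize News: summarizes today's news.",
    "Get Token Price: returns the current price of a crypto token.",
    # ... potentially hundreds more tools
]
# Tool embeddings can be precomputed once and reused for every query.
tool_embeddings = encoder.encode(tools, convert_to_tensor=True)

def top_k_tools(query: str, k: int = 5) -> list[str]:
    query_embedding = encoder.encode(query, convert_to_tensor=True)
    scores = util.cos_sim(query_embedding, tool_embeddings)[0]
    best = scores.topk(min(k, len(tools))).indices
    return [tools[int(i)] for i in best]

# Only the filtered tool descriptions go into the LLM prompt, keeping the context small.
print(top_k_tools("give me btc price", k=2))
```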

My approach filters tools in two steps:

Step 1: Use a bi-encoder (quite effective).
Step 2: Use a cross-encoder to re-rank (very poor performance, even after fine-tuning).
The data used for fine-tuning is structured as follows:

"query": The query text.
"pos": A list of useful tool descriptions that can help solve the query.
"neg": A list of tool descriptions that are not needed.

fschlatt (Member) commented Aug 2, 2024

Thanks for the additional context. If I understand correctly, you want to rank tool descriptions based on a query specifying the need for a tool.

I'm surprised that a bi-encoder is more effective than a cross-encoder on this task and would assume that given enough high-quality training data, a cross-encoder will be substantially more effective.

That being said, the Set-Encoder most likely will not give you a substantial boost over a standard cross-encoder's effectiveness. The Set-Encoder excels when interactions between the items to be ranked are necessary. In this case, the tools can most likely be ranked independently from one another.
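
To make the distinction concrete, here is a toy sketch of the two ranking interfaces (illustrative pseudocode only, not the actual Set-Encoder API):

```python
# Toy illustration of pointwise vs. listwise ranking; not the Set-Encoder API.
from typing import Callable, List

# Pointwise cross-encoder: each tool is scored against the query in isolation,
# so the score of one tool cannot depend on which other tools are candidates.
def pointwise_rank(query: str, tools: List[str],
                   score: Callable[[str, str], float]) -> List[str]:
    return sorted(tools, key=lambda tool: score(query, tool), reverse=True)

# Listwise model (the idea behind the Set-Encoder): the whole candidate set is
# scored jointly, so scores can reflect interactions between tools, e.g.
# redundancy or a query that needs several tools to cooperate.
def listwise_rank(query: str, tools: List[str],
                  score_all: Callable[[str, List[str]], List[float]]) -> List[str]:
    scores = score_all(query, tools)  # one joint pass over all candidates
    return [tool for _, tool in sorted(zip(scores, tools), reverse=True)]
```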

QuangTQV (Author) commented Aug 5, 2024

> Thanks for the additional context. If I understand correctly, you want to rank tool descriptions based on a query specifying the need for a tool.
>
> I'm surprised that a bi-encoder is more effective than a cross-encoder on this task and would assume that given enough high-quality training data, a cross-encoder will be substantially more effective.
>
> That being said, the Set-Encoder most likely will not give you a substantial boost over a standard cross-encoder's effectiveness. The Set-Encoder excels when interactions between the items to be ranked are necessary. In this case, the tools can most likely be ranked independently from one another.

I think the Set-Encoder can still be useful when a query needs the cooperation of many tools to be completed.

fschlatt (Member) commented Aug 5, 2024

Yes, that is a good point. In those cases, the Set-Encoder is likely to be more effective than a standard cross-encoder.
