[Bug]: Regression in AgentController broke AgentDelegationAction #6162
Comments
First, I didn't read the bug report you filed; sorry! Hopefully someone else could take a peek. I'd like to comment on two things:
The history is a bit complicated here. We had integration tests for agent delegation at some point (they were introduced because a refactoring PR broke this functionality). Those tests were later removed for two reasons: 1) non-determinism introduced by the LLM-based editing, and 2) the daily pipelines that run a mini evaluation (I am not sure whether those suites include delegation).
On HEAD, CodeActAgent doesn't delegate to BrowsingAgent. It handles browsing by itself.
Yeah, I agree; that's what I wanted to achieve with #6049. FWIW, we used to run integration tests with mocked prompts and responses from LLMs, which was not very dev-friendly whenever someone made even a small change to the prompt.
@li-boxuan do we still delegate actions to other agents? I thought we moved all the needed actions into the CodeAct agent. 🤔
Yeah, CodeActAgent doesn't use delegation anymore, which is probably also why we didn't notice that this broke. I think the core use case of OpenHands doesn't need this, but if other users need it, it should probably be fixed (as long as it doesn't cause too much maintenance burden).
"I think the core use case of OpenHands doesn't need this, but if other users need it it should probably be fixed (as long as it doesn't cause too much maintenance burden)." I think this is a core use case for OpenHands as a usable "multi" agent framework, even if CodeAct is a non-delegating agent. Is OpenHands a multi-agent framework, or is it just a UI on top of CodeAct?
As @li-boxuan said, we have deeply changed the integration tests, and we haven't yet given the new tests enough love. Sorry about that. Specifically, the new integration tests only run CodeAct, which means none of the other agents have been tested on the full execution flow for a while now. That includes DelegatorAgent, and with it, the core functionality of delegation. I intended to add tests for the other agents. I added a couple for DelegatorAgent in the linked PR, to see how the PR changes work.
This is a fair question for more reasons than one, and a good discussion to have. To be clear, my fixing of this issue is separate from that discussion. It bothers me that we have core functionality that hasn't been tested for a couple of months, and now it broke, as it obviously would have someday! On a related note, may I ask how you are using delegation? Are you using the micro-agents we have now, just Delegator with your own agents, or not even Delegator? How are you finding the delegation feature; was it working for you?
We built our own domain-specific agent. There are 3+ different roles implemented using AgentDelegateAction, and it is more reliable at staying on track than the CodeAct agent when tasks take more than 5 minutes of iteration. The CodeAct agent is used as one of the sub-agents that gets coding tasks delegated to it, but we also have non-coding agents that "focus" on other parts of the workflow. We are not using the DelegatorAgent (https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/agenthub/delegator_agent), but it served as an example for us to learn how agent delegation should be implemented in the OpenHands framework. I suspect that if integration tests are introduced to ensure that DelegatorAgent always executes correctly, they would cover our architecture. Thanks for the bug fix.
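The setup described in the comment above can be pictured with a small sketch. Everything in it is hypothetical (the role names, `SubAgent`, `Coordinator`, and `delegate`); it does not use the OpenHands API and only illustrates a parent agent routing tasks to role-specific sub-agents, one of which handles coding.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class SubAgent:
    """Hypothetical sub-agent: a role name plus a handler for delegated tasks."""
    role: str
    handle: Callable[[str], str]


@dataclass
class Coordinator:
    """Hypothetical parent agent that delegates each task to a sub-agent by role."""
    agents: Dict[str, SubAgent] = field(default_factory=dict)
    log: List[Tuple[str, str]] = field(default_factory=list)

    def register(self, agent: SubAgent) -> None:
        self.agents[agent.role] = agent

    def delegate(self, role: str, task: str) -> str:
        if role not in self.agents:
            raise KeyError(f"no sub-agent registered for role {role!r}")
        self.log.append((role, task))  # record who was asked to do what
        return self.agents[role].handle(task)


def build_coordinator() -> Coordinator:
    coordinator = Coordinator()
    # Illustrative roles only; the thread does not name the real ones.
    coordinator.register(SubAgent("coder", lambda task: f"coder finished: {task}"))
    coordinator.register(SubAgent("reviewer", lambda task: f"reviewer finished: {task}"))
    coordinator.register(SubAgent("planner", lambda task: f"planner finished: {task}"))
    return coordinator
```

Keeping a delegation log like this makes it easy for a test to assert which sub-agent received which task.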
Is there an existing issue for the same bug?
Describe the bug and reproduction steps
#5868 seems to have broken agent delegation and multi-agent workflows.
The should_step check is too restrictive and doesn't allow any agent delegation.
I think there should be tests introduced to catch future regressions with AgentDelegation.
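A regression test of this kind could be sketched roughly as follows. This is a toy model, not OpenHands code: `ToyController`, `DelegateAction`, `MessageAction`, and the `allow_delegation` flag are hypothetical names, used only to show the shape of the check, namely that a step gate which rejects delegation events never starts the delegate.

```python
from dataclasses import dataclass, field


@dataclass
class MessageAction:
    """Hypothetical stand-in for a user message event."""
    content: str


@dataclass
class DelegateAction:
    """Hypothetical stand-in for a delegation request event."""
    agent: str
    task: str


@dataclass
class ToyController:
    """Toy controller: acts only on events accepted by should_step()."""
    allow_delegation: bool = True
    started_delegates: list = field(default_factory=list)
    steps: int = 0

    def should_step(self, event) -> bool:
        # A gate that accepts only messages is "too restrictive":
        # delegation events are dropped and the delegate never runs.
        if isinstance(event, MessageAction):
            return True
        if isinstance(event, DelegateAction):
            return self.allow_delegation
        return False

    def on_event(self, event) -> None:
        if not self.should_step(event):
            return
        self.steps += 1
        if isinstance(event, DelegateAction):
            self.started_delegates.append(event.agent)


def run_delegation_scenario(allow_delegation: bool) -> list:
    """Feed a message, then a delegation request; return the started delegates."""
    controller = ToyController(allow_delegation=allow_delegation)
    controller.on_event(MessageAction("browse the docs"))
    controller.on_event(DelegateAction(agent="BrowsingAgent", task="browse the docs"))
    return controller.started_delegates


def test_delegation_is_not_blocked():
    # Regression check: the delegate must start when delegation is requested.
    assert run_delegation_scenario(allow_delegation=True) == ["BrowsingAgent"]


def test_restrictive_gate_blocks_delegation():
    # Documents the failure mode: a restrictive gate starts no delegate.
    assert run_delegation_scenario(allow_delegation=False) == []
```

A real test would drive the actual controller and a delegating agent end to end; the point of the sketch is only that the assertion targets whether the delegate was started, which does not depend on LLM output.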
I'm curious how the CodeActAgent -> BrowsingAgent delegation is working right now on HEAD, because the AgentDelegateAction doesn't step() according to the code.
OpenHands Installation
Docker command in README
OpenHands Version
No response
Operating System
None
Logs, Errors, Screenshots, and Additional Context
No response