Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Regression in AgentController broke AgentDelegationAction #6162

Closed
1 task done
diwu-sf opened this issue Jan 9, 2025 · 9 comments · Fixed by #6165
Closed
1 task done

[Bug]: Regression in AgentController broke AgentDelegationAction #6162

diwu-sf opened this issue Jan 9, 2025 · 9 comments · Fixed by #6165
Labels
bug Something isn't working severity:medium Affecting multiple users

Comments

@diwu-sf
Copy link
Contributor

diwu-sf commented Jan 9, 2025

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Describe the bug and reproduction steps

#5868 seem to have broken agent delegation and multi agent.

    def should_step(self, event: Event) -> bool:
        if isinstance(event, Action):
            if isinstance(event, MessageAction) and event.source == EventSource.USER:
                return True
            return False
        if isinstance(event, Observation):
            if isinstance(event, NullObservation) or isinstance(
                event, AgentStateChangedObservation
            ):
                return False
            return True
        return False

The should step is too restrictive and doesn't allow any agent delegation.

I think there should be tests introduced to catch future regressions with AgentDelegation.

I'm curious how the CodeActAgent -> BrowsingAgent delegation is working right now on HEAD because the AgentDelegateAction doesn't step() according to the code

OpenHands Installation

Docker command in README

OpenHands Version

No response

Operating System

None

Logs, Errors, Screenshots, and Additional Context

No response

@diwu-sf diwu-sf added the bug Something isn't working label Jan 9, 2025
@li-boxuan
Copy link
Collaborator

li-boxuan commented Jan 9, 2025

First, I didn't read the bug report you filed; sorry! Hopefully someone else could take a peek. I'd like to comment on two things:

I think there should be tests introduced to catch future regressions with AgentDelegation.

History is a bit complicated here. We had integration tests for agent delegation at some point (which was introduced because a refactoring PR broke this functionality). Those tests were then removed at some point due to 1) non-determinism introduced by the LLM-based editing, 2) the daily pipelines that run mini evaluation (I am not sure if those suites include delegation).

I'm curious how the CodeActAgent -> BrowsingAgent delegation is working right now on HEAD

On HEAD, CodeActAgent doesn't delegate to BrowsingAgent. It handles browsing by itself.

@diwu-sf
Copy link
Contributor Author

diwu-sf commented Jan 9, 2025

The DelegatorAgent on HEAD should also be equally broken at the moment.
Without delegation working, all multi-agent configurations are broken. I think we should bring back some level of simple agent delegation with deterministic simple tasks back into the daily tests.

@li-boxuan
Copy link
Collaborator

li-boxuan commented Jan 9, 2025

I think we should bring back some level of simple agent delegation with deterministic simple tasks back into the daily tests.

Yeah I agree, that's what I wanted to achieve with #6049

FWIW we used to run integration tests with mocked prompts & responses from LLMs, which was not very dev-friendly whenever they do even just a little change to the prompt.

@yufansong
Copy link
Collaborator

I think we should bring back some level of simple agent delegation with deterministic simple tasks back into the daily tests.

Yeah I agree, that's what I wanted to achieve with #6049

FWIW we used to run integration tests with mocked prompts & responses from LLMs, which was not very dev-friendly whenever they do even just a little change to the prompt.

@li-boxuan do we still delegate action to other agents? I though we move all needed action to codeact agent. 🤔

@neubig
Copy link
Contributor

neubig commented Jan 9, 2025

Yeah, CodeActAgent doesn't use delegation anymore, which is also probably why we didn't notice that this broke. I think the core use case of OpenHands doesn't need this, but if other users need it it should probably be fixed (as long as it doesn't cause too much maintenance burden).

@diwu-sf
Copy link
Contributor Author

diwu-sf commented Jan 9, 2025

"I think the core use case of OpenHands doesn't need this, but if other users need it it should probably be fixed (as long as it doesn't cause too much maintenance burden)."

I think this is core usecase to OpenHands as a usable "multi" agent framework, even if CodeAct is a non delegating agent.

Is OpenHands a multi agent framework or is it just a UI on top of CodeAct?

@enyst enyst mentioned this issue Jan 9, 2025
3 tasks
@enyst
Copy link
Collaborator

enyst commented Jan 9, 2025

I think there should be tests introduced to catch future regressions with AgentDelegation.

As @li-boxuan said, we have deeply changed the integration tests, and we haven't yet given the new tests enough love. Sorry about that. Specifically, the new integration tests only run CodeAct, which means all other agents are not tested on the full execution flow for a while now. That includes DelegatorAgent, and with it, the core functionality of delegation.

I intended to add tests for the other agents. Added in the linked PR a couple for DelegatorAgent, to see how the PR changes work.

@enyst
Copy link
Collaborator

enyst commented Jan 9, 2025

Is OpenHands a multi agent framework or is it just a UI on top of CodeAct?

This is a fair question for more reasons than one, and a good discussion to have.

To be clear, my fixing of this issue is separate from that discussion. It bothers me that we have core functionality that hasn't been tested anymore for a couple of months, and now it broke, as it obviously would have someday!

On the related note, may I ask how are you using delegation, are you using these micro-agents we have now, or just Delegator with your own, or not even Delegator? How are you finding the delegation feature, was it working for you?

@diwu-sf
Copy link
Contributor Author

diwu-sf commented Jan 9, 2025

We built our own domain specific agent, there's 3+ different roles implemented using AgentDelegateAction and it's more reliable at staying on track than CodeAct agent when the tasks take more than 5 mins of iteration. CodeAct agent is used as one of the sub-agent that get coding tasks delegated to, but we also have non-coding agents to "focus" on other parts of the workflows.

We are not using the DelegatorAgent (https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/agenthub/delegator_agent) but that one served as an example for us to learn how agent delegation should be implemented in OpenHands framework.

I suspect if integration tests are introduced to ensure that DelegatorAgent always executed correctly, then it would cover our architecture.

Thanks for the bug fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working severity:medium Affecting multiple users
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants