Browser management. #172
Comments
It would be great! At every loop step, such a tool might create a PDF screenshot and pass it to a multimodal model to recognize.
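The loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Agent-Zero code: `Action`, `run_browser_loop`, and the `capture`, `decide`, and `act` callables are hypothetical names. The caller would supply a real screenshot function, a real multimodal model call, and a real browser driver.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One step chosen by the model (hypothetical action vocabulary)."""
    kind: str         # e.g. "click", "type", or "done"
    target: str = ""  # free-form description of the page element
    text: str = ""    # text to type, if any


def run_browser_loop(
    goal: str,
    capture: Callable[[], bytes],                        # returns a page snapshot (PNG/PDF bytes)
    decide: Callable[[str, bytes, List[Action]], Action],  # multimodal model call (assumed)
    act: Callable[[Action], None],                       # performs the action in the browser
    max_steps: int = 10,
) -> List[Action]:
    """Snapshot the page, ask the model for the next action, perform it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        snapshot = capture()
        action = decide(goal, snapshot, history)
        history.append(action)
        if action.kind == "done":
            break  # the model reports that the goal is reached
        act(action)
    return history
```

The `max_steps` cap is there so a confused model cannot loop forever. Passing the three callables in keeps the loop independent of any particular browser driver or model provider.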
Yes, exactly. Imagine configuring your Agent-Zero for a trading website and letting it run automatically, without stress. Insane.
I mentioned this in the Discord a couple of weeks or so ago: basically some OSS Agent-Q-type functionality, or possibly an integration of Agent-Q that could be called like a tool. It's on my to-do list. I've just been super busy, but I will work on something like this when I get a chance.
Great, I'll be waiting to collaborate if that's possible.
Please add this!
This would be great! +1 👍
Could Agent-Zero pilot other frameworks, for example Open Interpreter or Anthropic's computer use, to achieve this? That would mean the Docker machine needs to be a full "computer" with the ability to launch a browser.
Agent-Zero could reach its full potential if it were feasible to
integrate Google browser management in a visual way, for example
asking it to open YouTube and having it do so. Say you want to
automate interaction with a website: instead of writing a full
bot, you would let Agent Zero run automatically, given only
prompts. To do that, it should be able to read or snapshot the
website to understand what to do there.
Right now, this is not possible.