This package provides browsergym.visualwebarena, which is an unofficial port of the VisualWebArena benchmark for BrowserGym.
Note: the original VisualWebArena codebase has been slightly adapted to ensure compatibility.
- Install the package
pip install browsergym-visualwebarena
- Download tokenizer resources
python -c "import nltk; nltk.download('punkt')"
- Set up the web servers (follow the visualwebarena README).
- Set up the URLs as environment variables (note the VWA_ prefix)
export VWA_CLASSIFIEDS="$BASE_URL:9001/"
export VWA_CLASSIFIEDS_RESET_TOKEN="4b61655535e7ed388f0d40a93600254c" # Default reset token for the classifieds site; change it if you edited its docker-compose.yml
export VWA_SHOPPING="$BASE_URL:7770/"
export VWA_REDDIT="$BASE_URL:9999"
export VWA_WIKIPEDIA="$BASE_URL:8888/wikipedia_en_all_maxi_2022-05/A/User:The_other_Kiwix_guy/Landing"
export VWA_HOMEPAGE="$BASE_URL:4399"
- Set up an OpenAI API key
export OPENAI_API_KEY=...
NOTE: be mindful of costs, as VisualWebArena calls GPT-4 for certain evaluations (llm_fuzzy_match).
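
Before launching any task, it can help to confirm that the environment variables from the steps above are actually set in the current shell. The following is a minimal sketch and not part of the package: the helper name missing_vars and the REQUIRED_VARS list are hypothetical, and the variable names are simply copied from the export lines above. It only checks that each variable is non-empty, not that the servers are reachable.

```python
import os

# Variable names copied from the export lines in this README.
# This list is an assumption about what must be set, not an official API.
REQUIRED_VARS = [
    "VWA_CLASSIFIEDS",
    "VWA_CLASSIFIEDS_RESET_TOKEN",
    "VWA_SHOPPING",
    "VWA_REDDIT",
    "VWA_WIKIPEDIA",
    "VWA_HOMEPAGE",
    "OPENAI_API_KEY",
]


def missing_vars(env=None):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All VisualWebArena environment variables are set.")
```

Running the script in the same shell where the export commands were issued lists any variable that was skipped or left empty.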