This package provides browsergym.webarena, which is an unofficial port of the WebArena benchmark for BrowserGym.
Note: the original WebArena codebase has been slightly adapted to ensure compatibility.
- Install the package
pip install browsergym-webarena
- Download tokenizer resources
python -c "import nltk; nltk.download('punkt')"
- Set up the web servers (follow the WebArena README).
BASE_URL=<YOUR_SERVER_URL_HERE>
- Set up the URLs as environment variables (note the WA_ prefix)
export WA_SHOPPING="$BASE_URL:7770/"
export WA_SHOPPING_ADMIN="$BASE_URL:7780/admin"
export WA_REDDIT="$BASE_URL:9999"
export WA_GITLAB="$BASE_URL:8023"
export WA_WIKIPEDIA="$BASE_URL:8888/wikipedia_en_all_maxi_2022-05/A/User:The_other_Kiwix_guy/Landing"
export WA_MAP="$BASE_URL:3000"
export WA_HOMEPAGE="$BASE_URL:4399"
- Set up an OpenAI API key
export OPENAI_API_KEY=...
NOTE: be mindful of costs, as WebArena calls GPT-4 for certain evaluations (llm_fuzzy_match).
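Before launching an experiment, it can help to confirm that the environment variables from the steps above are actually set in the current shell. The following is a minimal sketch, not part of browsergym-webarena: the helper name missing_webarena_vars and the REQUIRED_VARS list are hypothetical, and the variable names are simply copied from the export lines in this README.

```python
import os

# Variable names copied from the setup steps in this README.
REQUIRED_VARS = [
    "WA_SHOPPING",
    "WA_SHOPPING_ADMIN",
    "WA_REDDIT",
    "WA_GITLAB",
    "WA_WIKIPEDIA",
    "WA_MAP",
    "WA_HOMEPAGE",
    "OPENAI_API_KEY",
]


def missing_webarena_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `environ` defaults to os.environ
    and can be replaced by a plain dict for testing.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_webarena_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All WebArena environment variables are set.")
```

This only checks that the variables are present and non-empty; it does not verify that the servers are reachable or that the API key is valid.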