Experiment code for the paper "Mobile-Env: Building Qualified Evaluation Benchmarks for LLM-GUI Interaction".
launch.sh is the experiment launcher for text-only LLMs, and launch_mm.sh is the launcher for multimodal LLMs (MLMs). Before launching the program, set up the Mobile-Env environment v4.0b1. The experiments use the WikiHow task set v1.3.
We use vision-ui for Set-of-Marks. The model weights used by vision-ui can be downloaded by following the instructions at https://github.com/Meituan-Dianping/vision-ui/blob/master/resources/vision_infer.md. After downloading, place them under the meituan_weights folder.
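Before starting a launcher, it can help to confirm that the weights are actually in place. The snippet below is a minimal sketch and not part of this repository: the helper name weights_ready is hypothetical, and it assumes only that the weights sit somewhere under a meituan_weights folder relative to the current working directory. It does not check file names or integrity.

```python
from pathlib import Path

# Folder name taken from the setup instructions; the helper itself is hypothetical.
WEIGHTS_DIR = Path("meituan_weights")


def weights_ready(folder: Path = WEIGHTS_DIR) -> bool:
    """Return True if `folder` exists and contains at least one regular file."""
    if not folder.is_dir():
        return False
    # Search recursively, since the weights may be stored in subfolders.
    return any(p.is_file() for p in folder.rglob("*"))


if __name__ == "__main__":
    if weights_ready():
        print("vision-ui weights found; launch.sh or launch_mm.sh can be started.")
    else:
        print(f"No files found under {WEIGHTS_DIR}/; download the vision-ui weights first.")
```

Run it from the directory that contains meituan_weights. It only reports whether any file is present, so a partially downloaded set of weights would still pass.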
@article{DanyangZhang2023_MobileEnv,
title = {{Mobile-Env}: Building Qualified Evaluation Benchmarks for LLM-GUI Interaction},
author = {Danyang Zhang and
Zhennan Shen and
Rui Xie and
Situo Zhang and
Tianbao Xie and
Zihan Zhao and
Siyuan Chen and
Lu Chen and
Hongshen Xu and
Ruisheng Cao and
Kai Yu},
journal = {CoRR},
volume = {abs/2305.08144},
year = {2023},
url = {https://arxiv.org/abs/2305.08144},
eprinttype = {arXiv},
eprint = {2305.08144},
}