Commit

update data download script
JamesZhutheThird committed Mar 4, 2024
1 parent e781345 commit 589d5f3
Showing 4 changed files with 27 additions and 43 deletions.
12 changes: 6 additions & 6 deletions README.md
@@ -44,17 +44,17 @@ For more details, please visit our [leaderboard]() (Coming Soon).
 
 ## ⏬ Download
 
-You can download the dataset from the [HuggingFace Page](https://huggingface.co/datasets/OpenDFM/MULTI-Benchmark). Current [version](https://huggingface.co/datasets/OpenDFM/MULTI-Benchmark/blob/main/MULTI_v1.2.2_20240212_release.zip) is `v1.2.2`. Unzip the files and put them under `data`.
+You can simply download data using the following command:
 
-```
-wget https://huggingface.co/datasets/OpenDFM/MULTI-Benchmark/resolve/main/MULTI_v1.2.2_20240212_release.zip
-unzip MULTI_v1.2.2_20240212_release.zip -d ./data/
+```shell
+cd eval
+python download_data.py
 ```
 
-The structure of `data` should be something like:
+The structure of `./data` should be something like:
 
 ```
-data
+./data
 ├── images # folder containing images
 ├── problem_v1.2.2_20240212_release.json # MULTI
 ├── knowledge_v1.2.2_20240212_release.json # MULTI-Extend
12 changes: 6 additions & 6 deletions README_zh.md
@@ -44,17 +44,17 @@
 
 ## ⏬ 下载
 
-您可以从[HuggingFace页面](https://huggingface.co/datasets/OpenDFM/MULTI-Benchmark)下载数据集。最新[版本](https://huggingface.co/datasets/OpenDFM/MULTI-Benchmark/blob/main/MULTI_v1.2.2_20240212_release.zip)`v1.2.2`。解压文件并将它们放置在`data`下。
+您只需使用以下命令即可下载数据:
 
-```
-wget https://huggingface.co/datasets/OpenDFM/MULTI-Benchmark/resolve/main/MULTI_v1.2.2_20240212_release.zip
-unzip MULTI_v1.2.2_20240212_release.zip -d ./data/
+```shell
+cd eval
+python download_data.py
 ```
 
-`data` 的结构应该如下所示:
+`./data` 的结构应该如下所示:
 
 ```
-data
+./data
 ├── images # 包含图片的文件夹
 ├── problem_v1.2.2_20240212_release.json # MULTI
 ├── knowledge_v1.2.2_20240212_release.json # MULTI-Extend
31 changes: 0 additions & 31 deletions data/README.md

This file was deleted.

15 changes: 15 additions & 0 deletions eval/download_data.py
@@ -0,0 +1,15 @@
+from datasets import load_dataset
+import os
+import shutil
+
+if os.path.exists("../cache"):
+    shutil.rmtree("../cache")
+os.makedirs("../cache")
+
+load_dataset("OpenDFM/MULTI-Benchmark", cache_dir="../cache")
+
+random_string = os.listdir("../cache/downloads/extracted")[0]
+
+shutil.copytree(f"../cache/downloads/extracted/{random_string}/", "../data/", dirs_exist_ok=True)
+
+shutil.rmtree("../cache")
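
The new script copies only the first entry that `os.listdir` returns under `../cache/downloads/extracted`, and `os.listdir` makes no ordering guarantee. A minimal sketch of a more defensive copy step is given here, assuming the cache layout the script relies on (`downloads/extracted/<hash>/`). The helper name `copy_extracted` is hypothetical and not part of the repository, and the commit does not say whether the dataset ever extracts to more than one directory.

```python
import shutil
from pathlib import Path


def copy_extracted(cache_dir, data_dir):
    """Copy every extracted archive under cache_dir into data_dir.

    Hypothetical helper: assumes the layout <cache_dir>/downloads/extracted/<hash>/
    that the committed script reads from. Returns the sorted names of the
    extracted directories that were copied.
    """
    extracted_root = Path(cache_dir) / "downloads" / "extracted"
    if not extracted_root.is_dir():
        raise FileNotFoundError(f"no extracted archives under {extracted_root}")

    copied = []
    # Iterate in sorted order so the result does not depend on listing order.
    for entry in sorted(extracted_root.iterdir()):
        if entry.is_dir():
            # dirs_exist_ok=True (Python 3.8+) merges into an existing data_dir.
            shutil.copytree(entry, data_dir, dirs_exist_ok=True)
            copied.append(entry.name)
    return copied
```

In `download_data.py` this could be called as `copy_extracted("../cache", "../data")` once `load_dataset(...)` has returned. Copying every extracted directory instead of a single one is a choice made for this sketch, not the behavior of the commit.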
