Due to the large size of the ImageNet dataset, I am using the MiniImageNet dataset. I modified the YAML file accordingly.
datasets:
  _target_: flava.definitions.TrainingDatasetsInfo
  selected:
    - image
    - vl
    - text
  image:
    _target_: flava.definitions.TrainingSingleDatasetInfo
    train:
      - _target_: flava.definitions.HFDatasetInfo
        key: mini_train
        subset: default
        data_dir: >-
          /home/liumaofu/hyy/multimodal/examples/flava/mini/ok/train/
    val:
      - _target_: flava.definitions.HFDatasetInfo
        key: mini_val
        subset: default
        data_dir: >-
          /home/liumaofu/hyy/multimodal/examples/flava/mini/ok/val/
I also modified the examples/flava/data/utils.py file so that it loads each dataset from its local directory:
def build_datasets_from_info(dataset_infos: List[HFDatasetInfo], split: str = "train"):
    dataset_list = []
    for dataset_info in dataset_infos:
        print(f"Loading dataset from {dataset_info.data_dir}")
        current_dataset = load_from_disk(dataset_info.data_dir)
        if dataset_info.remove_columns is not None:
            current_dataset = current_dataset.remove_columns(dataset_info.remove_columns)
        if dataset_info.rename_columns is not None:
            for rename in dataset_info.rename_columns:
                current_dataset = current_dataset.rename_column(rename[0], rename[1])
        dataset_list.append(current_dataset)
    return concatenate_datasets(dataset_list)
However, when I run the command
python -m flava.train config=flava/configs/pretraining/debug.yaml
the following error is reported:
Directory /home/liumaofu/hyy/multimodal/examples/flava/mini/ok/train/ is neither a dataset directory nor a dataset dict directory.
The structure of my MiniImageNet dataset is as follows:
miniImagenet
|-- train
| |-- class1
| | |-- image1.jpg
| | |-- image2.jpg
| | |-- ...
| |-- class2
| | |-- image1.jpg
| | |-- image2.jpg
| | |-- ...
| |-- ...
|-- val
| |-- class1
| | |-- image1.jpg
| | |-- image2.jpg
| | |-- ...
| |-- class2
| | |-- image1.jpg
| | |-- image2.jpg
| | |-- ...
| |-- ...
|-- test
| |-- class1
| | |-- image1.jpg
| | |-- image2.jpg
| | |-- ...
| |-- class2
| | |-- image1.jpg
| | |-- image2.jpg
| | |-- ...
| |-- ...
I have checked that the storage paths are correct. Why is this error reported, and what should I do?
Hi @HeYiyang2, apologies for the delayed response. How did you download the local dataset? I think load_from_disk should only be used when the directory was created by a call to save_to_disk. See e.g. this comment
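To illustrate the point about load_from_disk, here is a minimal sketch rather than a tested fix for the FLAVA example. It assumes that save_to_disk marks a directory with dataset_info.json plus state.json (for a single dataset) or with dataset_dict.json (for a dataset dict), and that a plain class-per-folder image tree can instead be read with the generic "imagefolder" builder of the datasets library. The helper names is_saved_dataset_dir and load_local_dataset are hypothetical and not part of the FLAVA code.

```python
from pathlib import Path

# Marker files that `save_to_disk` is assumed to write (names taken from the
# `datasets` library's on-disk layout; treat them as an assumption).
DATASET_MARKERS = ("dataset_info.json", "state.json")
DATASET_DICT_MARKER = "dataset_dict.json"


def is_saved_dataset_dir(data_dir: str) -> bool:
    """Return True if data_dir looks like the output of `save_to_disk`."""
    root = Path(data_dir)
    is_dataset = all((root / name).is_file() for name in DATASET_MARKERS)
    is_dataset_dict = (root / DATASET_DICT_MARKER).is_file()
    return is_dataset or is_dataset_dict


def load_local_dataset(data_dir: str):
    """Hypothetical helper: pick a loader based on the directory contents."""
    # Imported lazily so the directory check above works without `datasets`.
    from datasets import load_dataset, load_from_disk

    if is_saved_dataset_dir(data_dir):
        # Directory was produced by `save_to_disk`.
        return load_from_disk(data_dir)
    # Raw image tree (class1/image1.jpg, ...): the "imagefolder" builder reads
    # the images and derives labels from the class directory names.
    return load_dataset("imagefolder", data_dir=data_dir, split="train")
```

With the layout shown in the question, is_saved_dataset_dir would return False for the train/ directory, which matches the reported error. In build_datasets_from_info, the load_from_disk call could be swapped for something like load_local_dataset(dataset_info.data_dir). The resulting column names may still need to be adjusted through rename_columns to match what the FLAVA transforms expect.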