[Feature] Support saving and loading npz file in offline evaluation mode. #201
Open
LeoXing1996 wants to merge 6 commits into open-mmlab:master from LeoXing1996:save_npz
7dff6c6 add file dataset (LeoXing1996)
c0c306f add unit test for meta_keys is None (LeoXing1996)
1c58c02 support npz saving and loading in offline evaluation (LeoXing1996)
3c6d9e0 fix some comment (LeoXing1996)
0a33613 fix bug when loading variables shape like () (LeoXing1996)
7ca9b49 Merge branch 'master' of github.com:open-mmlab/mmgeneration into save… (LeoXing1996)
@@ -0,0 +1,126 @@
# Copyright (c) OpenMMLab. All rights reserved.
import mmcv
import numpy as np
from torch.utils.data import Dataset

from .builder import DATASETS
from .pipelines import Compose


@DATASETS.register_module()
class FileDataset(Dataset):
    """Unconditional file dataset.

    This dataset loads data information from files for training GANs. Given
    the path of a file, we will load all information in the file. The
    transformation on data is defined by the pipeline. Please ensure that
    ``LoadImageFromFile`` is not in your pipeline configs because we directly
    get images in ``np.ndarray`` from the given file.

    Args:
        file_path (str): Path of the file.
        img_keys (str): Key of the images in the npz file.
        pipeline (list[dict | callable]): A sequence of data transforms.
        test_mode (bool, optional): If True, the dataset will work in test
            mode. Otherwise, in train mode. Defaults to False.
        npz_keys (str | list[str], optional): Key of the images to load in the
            npz file. Only valid when the input file is an npz file.
    """

    _VALID_FILE_SUFFIX = ('.npz', )

    def __init__(self, file_path, pipeline, test_mode=False):
        super().__init__()
        assert any([
            file_path.endswith(suffix) for suffix in self._VALID_FILE_SUFFIX
        ]), (f'We only support \'{self._VALID_FILE_SUFFIX}\' in this dataset, '
             f'but receive {file_path}.')

        self.file_path = file_path
        self.pipeline = Compose(pipeline)
        self.test_mode = test_mode
        self.load_annotations()

        # print basic dataset information to check the validity
        mmcv.print_log(repr(self), 'mmgen')

    def load_annotations(self):
        """Load annotations."""
        if self.file_path.endswith('.npz'):
            data_info, data_length = self._load_annotations_from_npz()
            data_fetch_fn = self._npz_data_fetch_fn

        self.data_infos = data_info
        self.data_fetch_fn = data_fetch_fn
        self.data_length = data_length

    def _load_annotations_from_npz(self):
        """Load annotations from the npz file and check that the number of
        samples is consistent among all items.

        Returns:
            tuple: dict and int
        """
        npz_file = np.load(self.file_path, mmap_mode='r')
        data_info_dict = dict()
        npz_keys = list(npz_file.keys())

        # load every item stored in the file
        num_samples = None
        for k in npz_keys:
            data_info_dict[k] = npz_file[k]
            # check number of samples
            if num_samples is None:
                num_samples = npz_file[k].shape[0]
            else:
                assert num_samples == npz_file[k].shape[0]
        return data_info_dict, num_samples

    @staticmethod
    def _npz_data_fetch_fn(data_infos, idx):
        """Fetch data from npz file by idx and package them to a dict.

        Args:
            data_infos (array, tuple, dict): Data infos in the npz file.
            idx (int): Index of the sample to fetch.

        Returns:
            dict: Data infos of the given idx.
        """
        data_dict = dict()
        for k in data_infos.keys():
            if data_infos[k][idx].shape == ():
                v = np.array([data_infos[k][idx]])
            else:
                v = data_infos[k][idx]
            data_dict[k] = v
        return data_dict

    def prepare_data(self, idx, data_fetch_fn=None):
        """Prepare data.

        Args:
            idx (int): Index of the sample to prepare.
            data_fetch_fn (callable): Function to fetch data.

        Returns:
            dict: Prepared training data batch.
        """
        if data_fetch_fn is None:
            data = self.data_infos[idx]
        else:
            data = data_fetch_fn(self.data_infos, idx)
        return self.pipeline(data)

    def __len__(self):
        return self.data_length

    def __getitem__(self, idx):
        return self.prepare_data(idx, self.data_fetch_fn)

    def __repr__(self):
        dataset_name = self.__class__
        file_path = self.file_path
        num_imgs = len(self)
        return (f'dataset_name: {dataset_name}, total {num_imgs} images in '
                f'file_path: {file_path}')
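For readers who want to try the loading logic outside of mmgeneration, here is a minimal sketch that uses only NumPy. It mirrors the two steps in the diff above: read every array from the npz file while checking that all arrays share the same first dimension, then fetch one sample by index and wrap scalar entries (shape `()`) into one-element arrays. The helper names `load_npz_arrays` and `fetch_sample` and the keys `real_imgs` and `label` are hypothetical and chosen for illustration only; they are not part of this PR.

```python
import os
import tempfile

import numpy as np


def load_npz_arrays(path):
    """Read all arrays from an npz file and check their sample counts match."""
    with np.load(path) as npz_file:
        arrays = {key: npz_file[key] for key in npz_file.files}
    if not arrays:
        raise ValueError(f'No arrays found in {path}.')
    lengths = {key: arr.shape[0] for key, arr in arrays.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f'Inconsistent number of samples: {lengths}')
    num_samples = next(iter(lengths.values()))
    return arrays, num_samples


def fetch_sample(arrays, idx):
    """Pick sample ``idx`` from every array and package the result in a dict."""
    sample = {}
    for key, arr in arrays.items():
        value = arr[idx]
        # Indexing a 1-D array yields a scalar with shape (); wrap it so that
        # every entry of the sample is an array with at least one dimension.
        sample[key] = np.array([value]) if value.shape == () else value
    return sample


# Hypothetical toy file: four 8x8 RGB images and one integer label per image.
tmp_dir = tempfile.mkdtemp()
npz_path = os.path.join(tmp_dir, 'toy.npz')
np.savez(npz_path,
         real_imgs=np.zeros((4, 8, 8, 3), dtype=np.uint8),
         label=np.arange(4))

arrays, num_samples = load_npz_arrays(npz_path)
sample = fetch_sample(arrays, 2)
```

The sketch raises `ValueError` instead of using a bare `assert` so that the mismatching keys show up in the error message; that is a design choice of this example, not something the PR does.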
Binary file not shown.
Defaults to False