Dataset shape property and debuggers #35

Open
clegaard opened this issue Apr 13, 2020 · 0 comments

Currently, the shape property of a dataset is determined by loading a single sample from the dataset.
This has unintended effects when the dataset is inspected by a debugger such as the one in VS Code: the debugger evaluates the property, which may take several seconds if each sample is large.

@property
def shape(self) -> Sequence[Shape]:
    """Get the shape of a dataset item.
    Returns:
        Sequence[int] -- Item shapes
    """
    if len(self) == 0:
        return _DEFAULT_SHAPE
    item = self.__getitem__(0)
    if hasattr(item, "__getitem__"):
        item_shape = []
        for i in item:
            if hasattr(i, "shape"):  # numpy arrays
                item_shape.append(i.shape)
            elif hasattr(i, "size"):  # PIL.Image.Image
                item_shape.append(np.array(i).shape)
            else:
                item_shape.append(_DEFAULT_SHAPE)
        return tuple(item_shape)
    return _DEFAULT_SHAPE

This raises the question of whether properties should have side effects at all. In relation to the subsampling operator, evaluating the property interferes with its caching mechanism. If logging were enabled, it could also cause unexpected log messages to be printed.
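
To make the side effect concrete, here is a minimal sketch, not the library's actual code. `CountingDataset`, its `loads` counter, and the empty-tuple value of `_DEFAULT_SHAPE` are all assumptions made for illustration. It shows that every read of `shape`, which is what a debugger's variable view does, triggers another `__getitem__` call.

```python
from types import SimpleNamespace
from typing import Sequence, Tuple

# Assumption: the real value of _DEFAULT_SHAPE is not shown in this issue.
_DEFAULT_SHAPE: Tuple[int, ...] = tuple()


class CountingDataset:
    """Hypothetical stand-in for a dataset that counts item loads."""

    def __init__(self, items):
        self._items = list(items)
        self.loads = 0  # number of __getitem__ calls so far

    def __len__(self) -> int:
        return len(self._items)

    def __getitem__(self, idx):
        self.loads += 1  # in a real dataset this could be slow I/O or logging
        return self._items[idx]

    @property
    def shape(self) -> Sequence[Tuple[int, ...]]:
        if len(self) == 0:
            return _DEFAULT_SHAPE
        item = self[0]  # every property access loads a sample
        return tuple(getattr(i, "shape", _DEFAULT_SHAPE) for i in item)


# One item: an array-like object with a shape, plus a plain label.
ds = CountingDataset([(SimpleNamespace(shape=(2, 3)), 0)])
first = ds.shape
second = ds.shape  # e.g. the debugger refreshing its variable view
print(ds.loads)  # two property reads caused two loads
```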

A solution could be to cache the inferred shape, e.g. by saving it to a private attribute _shape and having the property return that value instead.
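
The caching idea could look roughly like the sketch below. It is an illustration under assumptions, not a proposed patch: `CachedShapeDataset`, `_infer_shape`, and the `loads` counter are hypothetical names, and `_DEFAULT_SHAPE` is assumed to be an empty tuple.

```python
from types import SimpleNamespace
from typing import Optional, Sequence, Tuple

# Assumption: the real value of _DEFAULT_SHAPE is not shown in this issue.
_DEFAULT_SHAPE: Tuple[int, ...] = tuple()


class CachedShapeDataset:
    """Hypothetical dataset that infers its shape once and then reuses it."""

    def __init__(self, items):
        self._items = list(items)
        self._shape: Optional[Sequence[Tuple[int, ...]]] = None  # not inferred yet
        self.loads = 0  # counts __getitem__ calls, for demonstration only

    def __len__(self) -> int:
        return len(self._items)

    def __getitem__(self, idx):
        self.loads += 1
        return self._items[idx]

    def _infer_shape(self) -> Sequence[Tuple[int, ...]]:
        # Same idea as the original property: inspect the first sample.
        if len(self) == 0:
            return _DEFAULT_SHAPE
        item = self[0]
        return tuple(getattr(i, "shape", _DEFAULT_SHAPE) for i in item)

    @property
    def shape(self) -> Sequence[Tuple[int, ...]]:
        if self._shape is None:  # only the first access loads a sample
            self._shape = self._infer_shape()
        return self._shape


cached = CachedShapeDataset([(SimpleNamespace(shape=(2, 3)), 0)])
shape_a = cached.shape
shape_b = cached.shape  # served from _shape, no additional load
print(cached.loads)
```

Note that the first access still loads a sample, so a debugger would still pay that cost once. The cached value would also need to be reset by any operation that changes the item shape. On Python 3.8+, `functools.cached_property` offers a similar pattern without the explicit `_shape` attribute.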

@clegaard clegaard added the behaviour Should this behaviour be changed? label Apr 13, 2020
@LukasHedegaard LukasHedegaard added this to the 0.1.0 milestone Apr 15, 2020