Pretty printing catalog (#3990)
* Implemented basic __repr__

Signed-off-by: Elena Khaustova <[email protected]>

* Updated __repr__

Signed-off-by: Elena Khaustova <[email protected]>

* Removed __str__

Signed-off-by: Elena Khaustova <[email protected]>

* Updated _describe() for CachedDataset

Signed-off-by: Elena Khaustova <[email protected]>

* Made pretty_repr protected

Signed-off-by: Elena Khaustova <[email protected]>

* Reverted width parameter to default

Signed-off-by: Elena Khaustova <[email protected]>

* Implemented repr for catalog

Signed-off-by: Elena Khaustova <[email protected]>

* Disable sorting

Signed-off-by: Elena Khaustova <[email protected]>

* Replace set with dict to keep original datasets order when printing

Signed-off-by: Elena Khaustova <[email protected]>

* Updated printing params

Signed-off-by: Elena Khaustova <[email protected]>

* Updated printing width

Signed-off-by: Elena Khaustova <[email protected]>

* Removed params_repr

Signed-off-by: Elena Khaustova <[email protected]>

* Disable sorting

Signed-off-by: Elena Khaustova <[email protected]>

* Disable sorting

Signed-off-by: Elena Khaustova <[email protected]>

* Disabled compact

Signed-off-by: Elena Khaustova <[email protected]>

* Updated test_str_representation

Signed-off-by: Elena Khaustova <[email protected]>

* Updated cached dataset tests

Signed-off-by: Elena Khaustova <[email protected]>

* Updated data catalog tests

Signed-off-by: Elena Khaustova <[email protected]>

* Updated core tests

Signed-off-by: Elena Khaustova <[email protected]>

* Updated versioned dataset tests

Signed-off-by: Elena Khaustova <[email protected]>

* Updated tests for lambda dataset

Signed-off-by: Elena Khaustova <[email protected]>

* Updated tests for memory dataset

Signed-off-by: Elena Khaustova <[email protected]>

* Updated release notes

Signed-off-by: Elena Khaustova <[email protected]>

* Set width to maxsize

Signed-off-by: Elena Khaustova <[email protected]>

* Removed top-level keys sorting

Signed-off-by: Elena Khaustova <[email protected]>

* Updated tests

Signed-off-by: Elena Khaustova <[email protected]>

* Updated release notes

Signed-off-by: Elena Khaustova <[email protected]>

* Decoupled describe from pretty printing

Signed-off-by: Elena Khaustova <[email protected]>

* Returned old __str__ to avoid a breaking change

Signed-off-by: Elena Khaustova <[email protected]>

* Updated tests

Signed-off-by: Elena Khaustova <[email protected]>

* Replaced deprecation comment with TODO

Signed-off-by: Elena Khaustova <[email protected]>

---------

Signed-off-by: Elena Khaustova <[email protected]>
ElenaKhaustova authored Jul 18, 2024
1 parent f54f463 commit e2b20a4
Showing 2 changed files with 17 additions and 3 deletions.
1 change: 1 addition & 0 deletions RELEASE.md
@@ -10,6 +10,7 @@
 * Implemented key completion support for accessing datasets in the `DataCatalog`.
 * Made [kedro-telemetry](https://github.com/kedro-org/kedro-plugins/tree/main/kedro-telemetry) a core dependency.
 * Implemented dataset pretty printing.
+* Implemented `DataCatalog` pretty printing.

 ## Breaking changes to the API

19 changes: 16 additions & 3 deletions kedro/io/data_catalog.py
@@ -9,6 +9,7 @@
 import copy
 import difflib
 import logging
+import pprint
 import re
 from typing import Any, Dict

@@ -106,7 +107,7 @@ def __init__(
         """Return a _FrozenDatasets instance from some datasets collections.
         Each collection could either be another _FrozenDatasets or a dictionary.
         """
-        self._original_names: set[str] = set()
+        self._original_names: dict[str, str] = {}
         for collection in datasets_collections:
             if isinstance(collection, _FrozenDatasets):
                 self.__dict__.update(collection.__dict__)
@@ -116,7 +117,7 @@ def __init__(
                 # for easy access to transcoded/prefixed datasets.
                 for dataset_name, dataset in collection.items():
                     self.__dict__[_sub_nonword_chars(dataset_name)] = dataset
-                    self._original_names.add(dataset_name)
+                    self._original_names[dataset_name] = ""

     # Don't allow users to add/change attributes on the fly
     def __setattr__(self, key: str, value: Any) -> None:
@@ -131,11 +132,20 @@ def __setattr__(self, key: str, value: Any) -> None:
         raise AttributeError(msg)

     def _ipython_key_completions_(self) -> list[str]:
-        return list(self._original_names)
+        return list(self._original_names.keys())

     def __getitem__(self, key: str) -> Any:
         return self.__dict__[_sub_nonword_chars(key)]

+    def __repr__(self) -> str:
+        datasets_repr = {}
+        for ds_name in self._original_names.keys():
+            datasets_repr[ds_name] = self.__dict__[
+                _sub_nonword_chars(ds_name)
+            ].__repr__()
+
+        return pprint.pformat(datasets_repr, sort_dicts=False)
+


 class DataCatalog:
     """``DataCatalog`` stores instances of ``AbstractDataset`` implementations
@@ -207,6 +217,9 @@ def __init__(  # noqa: PLR0913
         if feed_dict:
             self.add_feed_dict(feed_dict)

+    def __repr__(self) -> str:
+        return self.datasets.__repr__()
+
     @property
     def _logger(self) -> logging.Logger:
         return logging.getLogger(__name__)
