-
Notifications
You must be signed in to change notification settings - Fork 903
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DataCatalog2.0]: Move pattern resolution logic to the separate component #4123
Conversation
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
for ds_name in datasets: | ||
is_param = ds_name.startswith("params:") or ds_name == "parameters" | ||
if ds_name in explicit_datasets or is_param: | ||
for ds_name in pipeline_datasets: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big fan of continue
since it harms the flow of thinking while reading the code. Could we do this instead:
for ds_name in (pipeline_datasets - explicit_datasets - filter(is_parameter, pipeline_datasets)):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will have to do this since explicit_datasets
is a dictionary:
for ds_name in (pipeline_datasets - set(explicit_datasets.keys()) - filter(is_parameter, pipeline_datasets)):
and for me, it looks way more complex to understand than
for ds_name in pipeline_datasets:
if ds_name in explicit_datasets or is_parameter(ds_name):
continue
It's also slightly worse from the complexity point of view though I agree both cost nothing.
Curious what other think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking really good @ElenaKhaustova ! I love the separation of dataset factoreis resolution logic from the catalog ⭐
I left some comments and questions, mostly around naming/positioning.
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing my comments @ElenaKhaustova, I don't have any blocking remarks anymore. Don't forget to add this change to the release notes!
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
…void breaking change Signed-off-by: Elena Khaustova <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @ElenaKhaustova ⭐ 👏🏾 💯
Left a small comment but looks good!
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that there are no changes in DataCatalog.from_config
, only in DataCatalog.__init__
(rarely used in interactive contexts) and so my understanding is that this has zero user impact 👍🏼
self._dataset_patterns = dataset_patterns or {} | ||
self._config_resolver = config_resolver or CatalogConfigResolver() | ||
|
||
# Kept to avoid breaking changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Description
Solves #4110
Relates to #3925
Please see the suggested order of work in this comment
Development notes
This PR includes the following:
CatalogConfigResolver
DataCatalog
to useCatalogConfigResolver
internallykedro run
command andkedro catalog
commands work after the catalog updatesDataCatalog
interface remains the same, so no breaking changes introducedDeveloper Certificate of Origin
We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a
Signed-off-by
line in the commit message. See our wiki for guidance.If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.
Checklist
RELEASE.md
file