Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename any mentions to data set #1572

Merged
merged 7 commits into from
Oct 16, 2023
Merged

Conversation

merelcht
Copy link
Member

@merelcht merelcht commented Oct 12, 2023

Description

Part of kedro-org/kedro#2129

Resolves #1560

Development notes

This started out as just renaming any DataSet mentions to Dataset, but then I discovered "tracking" datasets weren't displayed properly in Kedro-Viz anymore.

There's a lot of hardcoded references to datasets in kedro-viz which is problematic and the renaming of datasets has surfaced that again. The way the renaming was done is backwards compatible, but not if strings are hardcoded.

Updating all dataset mentions to the new ending of Dataset seems to have fixed the problem.

Note: I've left the "old" dataset names in place in the demo project to make sure that the old naming still works as well.

QA notes

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added new entries to the RELEASE.md file
  • Added tests to cover my changes

@ravi-kumar-pilla
Copy link
Contributor

Attaching kedro viz story created - This PR also resolves #1560

Copy link
Member

@tynandebold tynandebold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, so many changes. Thank you!

I still see a fair amount of DataSet mentions when I do a project-wide search:

image

Do we need to replace those? We should for at least the demo-project tracking datasets, otherwise the data doesn't show up in experiment tracking and instead is blank:

image

@merelcht
Copy link
Member Author

We should for at least the demo-project tracking datasets, otherwise the data doesn't show up in experiment tracking and instead is blank:

image

I deliberately didn't change these, because it should actually work without the rename.. When I run my branch locally on the project I see:

Screenshot 2023-10-13 at 13 19 49

@tynandebold
Copy link
Member

We should for at least the demo-project tracking datasets, otherwise the data doesn't show up in experiment tracking and instead is blank:
image

I deliberately didn't change these, because it should actually work without the rename.. When I run my branch locally on the project I see:

Screenshot 2023-10-13 at 13 19 49

Got it. I had to update kedro-datasets to the latest version. It didn't work on my previous one, which was version 1.5.1. I think that's the expected behavior?

@tynandebold tynandebold requested review from ravi-kumar-pilla and removed request for yetudada October 13, 2023 14:25
@ravi-kumar-pilla
Copy link
Contributor

Hi @merelcht , Thank you for this.

I see ImageDataSet has not changed its type to ImageDataset. Is there a list of left-over datasets that are yet to be changed ?
image

I see the warnings mention - CSVDataSet, ExcelDataSet, ParquetDataSet, PickleDataSet, PlotlyDataSet, JSONDataSet, MetricsDataSet (which might not be the entire list of changed datasets)

@merelcht
Copy link
Member Author

Got it. I had to update kedro-datasets to the latest version. It didn't work on my previous one, which was version 1.5.1. I think that's the expected behavior?

@tynandebold Yes that is the expected behaviour. We could also keep the references to both datasets with DataSet endings and Dataset ending, but kedro-datasets 1.7.1 is a non-breaking release so based on pins in our own requirements users should get this more up to date version. From the kedro/kedro-datasets side it doesn't matter, so I leave the choice up to you if you prefer to keep both references for now and only remove later.

@merelcht
Copy link
Member Author

I see ImageDataSet has not changed its type to ImageDataset. Is there a list of left-over datasets that are yet to be changed ? image

I see the warnings mention - CSVDataSet, ExcelDataSet, ParquetDataSet, PickleDataSet, PlotlyDataSet, JSONDataSet, MetricsDataSet (which might not be the entire list of changed datasets)

@ravi-kumar-pilla I haven't renamed anything inside the demo-project yet as a check that the old naming, when it's used in a Kedro project, still works with kedro-viz. If the team prefers it, I can rename that now as well or do it later when the old names are actually fully removed.

@ravi-kumar-pilla
Copy link
Contributor

Hi @merelcht , yes I understand that we haven't renamed any dataset types inside demo-project. But I was confused on why all the dataset types except ImageDataSet are being displayed with a lowercase [s] like below. So, I was asking if there are any pending datasets to be renamed in the kedro-datasets plugin -

image

@ravi-kumar-pilla
Copy link
Contributor

Also, there seems to be an issue with trackingDataset in experiment tracking when using kedro-datasets 1.7.1. I am yet to debug the issue but @merelcht do you think we should fix this in this PR or create a different issue for this ? Thank you

@merelcht
Copy link
Member Author

Also, there seems to be an issue with trackingDataset in experiment tracking when using kedro-datasets 1.7.1. I am yet to debug the issue but @merelcht do you think we should fix this in this PR or create a different issue for this ? Thank you

Yes this fixes it.

About the ImageDataSet, this is because this is a custom dataset in the demo project and I didn't want to rename anything in that project yet.

@merelcht merelcht merged commit cb62e7d into main Oct 16, 2023
20 checks passed
@merelcht merelcht deleted the rename-any-mentions-to-DataSet branch October 16, 2023 16:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for Kedro Datasets 1.7.1 and above
3 participants