Move kedro-catalog JSON schema to kedro-datasets (#4359)
* Move `kedro-catalog` JSON schema to `kedro-datasets` #4258

Signed-off-by: Chris Schopp <[email protected]>

* Add description of change to `RELEASE.md`

Signed-off-by: Chris Schopp <[email protected]>

* Update refs to `jsonschema` to use `kedro-plugins/kedro-datasets/`

Signed-off-by: GitHub <[email protected]>

* Update ignore-names.txt

Signed-off-by: Juan Luis Cano Rodríguez <[email protected]>

* Update ignore.txt

Signed-off-by: Juan Luis Cano Rodríguez <[email protected]>

* Keep jsonschemas for CachedDataset, MemoryDataset, and LambdaDataset

* These datasets remain in Kedro and were not moved to kedro-datasets

Signed-off-by: GitHub <[email protected]>

* Fix linter

Signed-off-by: Merel Theisen <[email protected]>

---------

Signed-off-by: Chris Schopp <[email protected]>
Signed-off-by: GitHub <[email protected]>
Signed-off-by: Juan Luis Cano Rodríguez <[email protected]>
Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: Merel Theisen <[email protected]>
Co-authored-by: Merel Theisen <[email protected]>
Co-authored-by: Juan Luis Cano Rodríguez <[email protected]>
Co-authored-by: Merel Theisen <[email protected]>
4 people authored Jan 22, 2025
1 parent 9ee181f commit 46259b9
Showing 12 changed files with 112 additions and 5,813 deletions.
1 change: 1 addition & 0 deletions .github/styles/Kedro/ignore-names.txt
@@ -94,6 +94,7 @@ Puneet
Rashida
Ravi
Richard
Schopp
Schwarzmann
Sorokin
Stichbury
1 change: 1 addition & 0 deletions .github/styles/Kedro/ignore.txt
@@ -1,5 +1,6 @@
Kedro
Kedro's
Kedroids
Kubeflow
Databricks
Conda
4 changes: 4 additions & 0 deletions RELEASE.md
@@ -14,13 +14,17 @@
* Safeguard hooks when user incorrectly registers a hook class in settings.py.
* Fixed parsing paths with query and fragment.
* Remove lowercase transformation in regex validation.
* Moved `kedro-catalog` JSON schema to `kedro-datasets`.
* Updated `Partitioned dataset lazy saving` docs page.


## Breaking changes to the API
## Documentation changes

## Community contributions
Many thanks to the following Kedroids for contributing PRs to this release:
* [Hendrik Scherner](https://github.com/SchernHe)
* [Chris Schopp](https://github.com/chrisschopp)

# Release 0.19.10

2 changes: 1 addition & 1 deletion docs/source/data/how_to_create_a_custom_dataset.md
@@ -603,6 +603,6 @@ kedro-plugins/kedro-datasets/kedro_datasets/image
There are two special considerations when contributing a dataset:
1. Add the dataset to `kedro_datasets.rst` so it shows up in the API documentation.
- 2. Add the dataset to `static/jsonschema/kedro-catalog-X.json` for IDE validation.
+ 2. Add the dataset to `kedro-plugins/kedro-datasets/static/jsonschema/kedro-catalog-X.json` for IDE validation.
```
4 changes: 2 additions & 2 deletions docs/source/development/set_up_pycharm.md
@@ -163,10 +163,10 @@ You can enable the Kedro catalog validation schema in your PyCharm IDE to enable

![](../meta/images/pycharm_edit_schema_mapping.png)

- Add a new mapping using the "+" button in the top left of the window and select the name you want for it. Enter this URL `https://raw.githubusercontent.com/kedro-org/kedro/develop/static/jsonschema/kedro-catalog-0.19.json` in the "Schema URL" field and select "JSON Schema Version 7" in the "Schema version" field.
+ Add a new mapping using the "+" button in the top left of the window and select the name you want for it. Enter this URL `https://raw.githubusercontent.com/kedro-org/kedro-plugins/main/kedro-datasets/static/jsonschema/kedro-catalog-0.19.json` in the "Schema URL" field and select "JSON Schema Version 7" in the "Schema version" field.

Add the following file path pattern to the mapping: `conf/**/*catalog*`.

![](../meta/images/pycharm_catalog_schema_mapping.png)

- > [Different schemas for different Kedro versions can be found in the Kedro repository](https://github.com/kedro-org/kedro/tree/main/static/jsonschema).
+ > [Different schemas for different Kedro versions can be found in the `kedro-datasets` repository](https://github.com/kedro-org/kedro-plugins/tree/main/kedro-datasets/static/jsonschema).
4 changes: 2 additions & 2 deletions docs/source/development/set_up_vscode.md
@@ -260,11 +260,11 @@ Enter the following in your `settings.json` file:
```json
{
"yaml.schemas": {
- "https://raw.githubusercontent.com/kedro-org/kedro/develop/static/jsonschema/kedro-catalog-0.19.json": "conf/**/*catalog*"
+ "https://raw.githubusercontent.com/kedro-org/kedro-plugins/main/kedro-datasets/static/jsonschema/kedro-catalog-0.19.json": "conf/**/*catalog*"
}
}
```

and start editing your `catalog` files.

- > [Different schemas for different Kedro versions can be found in the Kedro repository](https://github.com/kedro-org/kedro/tree/main/static/jsonschema).
+ > [Different schemas for different Kedro versions can be found in the Kedro repository](https://github.com/kedro-org/kedro-plugins/tree/main/kedro-datasets/static/jsonschema).
101 changes: 101 additions & 0 deletions static/img/kedro-catalog-0.19.json
@@ -0,0 +1,101 @@
{
"type": "object",
"patternProperties": {
"^[a-z0-9-_]+$": {
"required": [
"type"
],
"properties": {
"type": {
"type": "string",
"enum": [
"CachedDataset",
"MemoryDataset",
"LambdaDataset"
]
}
},
"allOf": [
{
"if": {
"properties": {
"type": {
"const": "CachedDataset"
}
}
},
"then": {
"required": [
"dataset"
],
"properties": {
"dataset": {
"pattern": ".*",
"description": "A Kedro Dataset object or a dictionary to cache."
},
"copy_mode": {
"type": "string",
"description": "The copy mode used to copy the data. Possible\nvalues are: \"deepcopy\", \"copy\" and \"assign\". If not\nprovided, it is inferred based on the data type."
}
}
}
},
{
"if": {
"properties": {
"type": {
"const": "MemoryDataset"
}
}
},
"then": {
"required": [],
"properties": {
"data": {
"pattern": ".*",
"description": "Python object containing the data."
},
"copy_mode": {
"type": "string",
"description": "The copy mode used to copy the data. Possible\nvalues are: \"deepcopy\", \"copy\" and \"assign\". If not\nprovided, it is inferred based on the data type."
}
}
}
},
{
"if": {
"properties": {
"type": {
"const": "LambdaDataset"
}
}
},
"then": {
"required": [
"load",
"save"
],
"properties": {
"load": {
"pattern": ".*",
"description": "Method to load data from a data set."
},
"save": {
"pattern": ".*",
"description": "Method to save data to a data set."
},
"exists": {
"pattern": ".*",
"description": "Method to check whether output data already exists."
},
"release": {
"pattern": ".*",
"description": "Method to release any cached information."
}
}
}
}
]
}
}
}
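
The conditional rules in the schema above can be read as a small lookup table: each allowed `type` value implies a set of required keys. The following is a minimal Python sketch of those rules, transcribed by hand from the schema rather than taken from Kedro. The names `check_catalog`, `NAME_PATTERN`, `REQUIRED_BY_TYPE`, and `TYPES_WITH_COPY_MODE` are hypothetical, and the sketch simplifies the schema: for example, an entry without `type` is reported once instead of being evaluated against every `then` branch. It is not a general JSON Schema validator.

```python
import re

# Same character class as the schema's "^[a-z0-9-_]+$" key pattern.
NAME_PATTERN = re.compile(r"[a-z0-9_-]+")

# Required keys per dataset type, copied from the "then" branches of the schema.
REQUIRED_BY_TYPE = {
    "CachedDataset": ["dataset"],
    "MemoryDataset": [],
    "LambdaDataset": ["load", "save"],
}

# Types whose "then" branch declares copy_mode as a string.
TYPES_WITH_COPY_MODE = {"CachedDataset", "MemoryDataset"}


def check_catalog(catalog: dict) -> list[str]:
    """Return human-readable problems; an empty list means no rule was violated."""
    problems = []
    for name, entry in catalog.items():
        # patternProperties only constrains keys that match the pattern,
        # and "required" only applies to objects, so everything else is skipped.
        if not NAME_PATTERN.fullmatch(name) or not isinstance(entry, dict):
            continue
        if "type" not in entry:
            problems.append(f"{name}: missing required key 'type'")
            continue
        dataset_type = entry["type"]
        if not isinstance(dataset_type, str) or dataset_type not in REQUIRED_BY_TYPE:
            allowed = ", ".join(sorted(REQUIRED_BY_TYPE))
            problems.append(f"{name}: type {dataset_type!r} is not one of: {allowed}")
            continue
        for key in REQUIRED_BY_TYPE[dataset_type]:
            if key not in entry:
                problems.append(f"{name}: {dataset_type} requires {key!r}")
        copy_mode = entry.get("copy_mode")
        if (
            dataset_type in TYPES_WITH_COPY_MODE
            and "copy_mode" in entry
            and not isinstance(copy_mode, str)
        ):
            problems.append(f"{name}: copy_mode must be a string")
    return problems
```

Under these assumptions, an entry such as `{"type": "CachedDataset"}` would be reported for lacking `dataset`, and any `type` outside the three datasets that remain in Kedro would be reported as not allowed by this trimmed schema.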