From 6d14837a1d58a070d9007c2310df7cea2530d3cf Mon Sep 17 00:00:00 2001 From: roll Date: Tue, 7 May 2024 09:02:19 +0100 Subject: [PATCH] Improved changelog structure (#60) * Present Table Schema changelog as a list * Move up headings and present Table Schema as headings * Update changelog structure for Table Schema (#59) * Present Table Schema changelog as a list * Move up headings and present Table Schema as headings * End language section with period * Use titlecase for author name * Correct tick for MUST NOT * Use h5 headers * Clarify and correct some Table Schema changes * Document Data Resource and Table Dialect changes * Correct description for givenName/familyName * Redocument Data Package * Order changes following the order they appear in the spec * Shorten titles * Bundle changes for contributors and contributors --------- Co-authored-by: Peter Desmet --- .../recipes/relationship-between-fields.md | 2 +- content/docs/specifications/data-package.md | 6 +- content/docs/specifications/data-resource.md | 2 +- content/docs/specifications/extensions.md | 2 +- content/docs/specifications/glossary.md | 4 +- content/docs/specifications/security.md | 2 +- content/docs/specifications/table-dialect.md | 2 +- content/docs/specifications/table-schema.md | 2 +- content/docs/standard/changelog.md | 153 +++++++----------- 9 files changed, 68 insertions(+), 107 deletions(-) diff --git a/content/docs/recipes/relationship-between-fields.md b/content/docs/recipes/relationship-between-fields.md index cfc854d1..41cbf22a 100644 --- a/content/docs/recipes/relationship-between-fields.md +++ b/content/docs/recipes/relationship-between-fields.md @@ -5,7 +5,7 @@ title: Relationship between Fields - +
Authors: -Philippe THOMY, Peter Desmet +Philippe Thomy, Peter Desmet
diff --git a/content/docs/specifications/data-package.md b/content/docs/specifications/data-package.md index d95e8b00..4a20d022 100644 --- a/content/docs/specifications/data-package.md +++ b/content/docs/specifications/data-package.md @@ -19,7 +19,7 @@ A simple container format for describing a coherent collection of data in a sing ## Language -The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) +The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). ## Introduction @@ -221,8 +221,8 @@ An Array of string keywords to assist users searching for the package in catalog The people or organizations who contributed to this Data Package. It `MUST` be an array. Each entry is a Contributor and `MUST` be an `object`. A Contributor `MUST` have at least one property. A Contributor is `RECOMMENDED` to have `title` property and `MAY` contain `givenName`, `familyName`, `path`, `email`, `roles`, and `organization` properties: - `title`: A string containing a name of the contributor. -- `givenName`: A string containing name a person has been given, if the contributor is a person. -- `familyName`: A string containing familial name that a person inherits, if the contributor is a person. +- `givenName`: A string containing the name a person has been given, if the contributor is a person. +- `familyName`: A string containing the familial name that a person inherits, if the contributor is a person. - `path`: A fully qualified URL pointing to a relevant location online for the contributor. - `email`: A string containing an email address. - `roles`: An array of strings describing the roles of the contributor. 
A role is `RECOMMENDED` to follow an established vocabulary, such as [DataCite Metadata Schema's contributorRole](https://support.datacite.org/docs/datacite-metadata-schema-v44-recommended-and-optional-properties#7a-contributortype) or [CreDIT](https://credit.niso.org/). Useful roles to indicate are: `creator`, `contact`, `rightsHolder`, and `dataCurator`. diff --git a/content/docs/specifications/data-resource.md b/content/docs/specifications/data-resource.md index 87086287..1fabd71d 100644 --- a/content/docs/specifications/data-resource.md +++ b/content/docs/specifications/data-resource.md @@ -19,7 +19,7 @@ A simple format to describe and package a single data resource such as a individ ## Language -The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) +The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). ## Descriptor diff --git a/content/docs/specifications/extensions.md b/content/docs/specifications/extensions.md index b85851fa..e2c50fe0 100644 --- a/content/docs/specifications/extensions.md +++ b/content/docs/specifications/extensions.md @@ -15,7 +15,7 @@ The Data Package Standard extensibility features for domain-specific needs. ## Language -The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) +The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). 
## Introduction diff --git a/content/docs/specifications/glossary.md b/content/docs/specifications/glossary.md index 92f07351..160ae192 100644 --- a/content/docs/specifications/glossary.md +++ b/content/docs/specifications/glossary.md @@ -15,7 +15,7 @@ A dictionary of special terms for the Data Package Standard. ## Language -The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) +The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). ## Definitions @@ -74,7 +74,7 @@ A `URL or Path` is a `string` with the following additional constraints: - `MUST` either be a URL or a POSIX path - [URLs](https://en.wikipedia.org/wiki/Uniform_Resource_Locator) `MUST` be fully qualified. `MUST` be using either http or https scheme. (Absence of a scheme indicates `MUST` be a POSIX path) -- [POSIX paths](https://en.wikipedia.org/wiki/Path_%28computing%29#POSIX_pathname_definition) (unix-style with `/` as separator) are supported for referencing local files, with the security restraint that they `MUST` be relative siblings or children of the descriptor. Absolute paths `/`, relative parent paths `../`, hidden folders starting from a dot `.hidden` `MUST` NOT be used. +- [POSIX paths](https://en.wikipedia.org/wiki/Path_%28computing%29#POSIX_pathname_definition) (unix-style with `/` as separator) are supported for referencing local files, with the security restraint that they `MUST` be relative siblings or children of the descriptor. Absolute paths `/`, relative parent paths `../`, hidden folders starting from a dot `.hidden` `MUST NOT` be used. 
Example of a fully qualified url: diff --git a/content/docs/specifications/security.md b/content/docs/specifications/security.md index 6f9a429a..6d6007aa 100644 --- a/content/docs/specifications/security.md +++ b/content/docs/specifications/security.md @@ -15,7 +15,7 @@ Security considerations around Data Packages and Data Resources. ## Language -The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) +The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). ## Usage Perspective diff --git a/content/docs/specifications/table-dialect.md b/content/docs/specifications/table-dialect.md index 20250ae7..f989680e 100644 --- a/content/docs/specifications/table-dialect.md +++ b/content/docs/specifications/table-dialect.md @@ -19,7 +19,7 @@ Table Dialect describes how tabular data is stored in a file. It supports delimi ## Language -The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) +The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). ## Introduction diff --git a/content/docs/specifications/table-schema.md b/content/docs/specifications/table-schema.md index 5cdec350..75f83dc0 100644 --- a/content/docs/specifications/table-schema.md +++ b/content/docs/specifications/table-schema.md @@ -19,7 +19,7 @@ A simple format to declare a schema for tabular data. 
The schema is designed to ## Language -The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) +The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). ## Introduction diff --git a/content/docs/standard/changelog.md b/content/docs/standard/changelog.md index 7273b749..cc976e26 100644 --- a/content/docs/standard/changelog.md +++ b/content/docs/standard/changelog.md @@ -4,7 +4,7 @@ sidebar: order: 10 --- -This document includes all meaningful changes made to the **specifications** consisting the Data Package Standard. It does not track changes made to other documents like recipes or guides. +This document includes all meaningful changes made to the Data Package Standard **specifications**. It does not cover changes made to other documents like Recipes or Guides. ## v2.0-draft @@ -14,152 +14,113 @@ This document includes all meaningful changes made to the **specifications** con The Data Package (v2) draft release includes a rich set of the specification improvements accepted by the Data Package Working Group during the active phase of the Data Package (v2) work. -### Changes +### Data Package -#### Specifications +##### `version` (updated) -##### Added `source.version` property +[`version`](../../specifications/data-package/#version) is now included in the specification, while in Data Package v1 it was erroneously only part of the documentation ([#3](https://github.com/frictionlessdata/datapackage/pull/3)). -This change adds a new property to make possible of providing information about source version. Please read more about [`source.version`](../../specifications/data-package/#sources) property. 
+##### `contributors` (updated) -> [Pull Request -- #10](https://github.com/frictionlessdata/datapackage/pull/10) +[`contributors`](../../specifications/data-package/#contributors) was updated: -##### Made `contributor/source.title` not required +- `contributor.title` is no longer required ([#7](https://github.com/frictionlessdata/datapackage/pull/7)). +- `contributor.givenName` and `contributor.familyName` are new properties to specify the given and family name of contributor, if it is a person ([#20](https://github.com/frictionlessdata/datapackage/pull/20)). +- `contributor.role` has been deprecated in favour of `contributor.roles`, see further ([#18](https://github.com/frictionlessdata/datapackage/pull/18)). +- `contributor.roles` is a new property that allows to specify multiple roles per contributor, rather than having to duplicate the contributor. It recommends following an established vocabulary and has suggested values that are different from the deprecated `contributor.role` ([#18](https://github.com/frictionlessdata/datapackage/pull/18)). -This change allows omitting `title` property for the `contributor` and `source` objects making it more flexible for data producers. +##### `sources` (updated) -> [Pull Request -- #7](https://github.com/frictionlessdata/datapackage/pull/7) +[`sources`](../../specifications/data-package/#sources) was updated: -#### Data Package +- `source.title` is no longer required ([#7](https://github.com/frictionlessdata/datapackage/pull/7)). +- `source.version` is a new property to specify which version of a source was used ([#10](https://github.com/frictionlessdata/datapackage/pull/10)). -##### Added `contributor.given/familyName` +### Data Resource -This change adds two new properties to the `contributor` object: `givenName` and `familyName`. Please read more about [`package.contributors`](../../specifications/data-package/#contributors) property. 
+##### `name` (updated) -> [Pull Request -- #20](https://github.com/frictionlessdata/datapackage/pull/20) +[name](../../specifications/data-resource/#name-required) now allows any string. It previously required the name to only consist of lowercase alphanumeric characters plus `.`, `-` and `_`. The property is still required and must be unique among resources ([#27](https://github.com/frictionlessdata/datapackage/pull/27)). -##### Added `contributor.roles` property +##### `path` (updated) -This change adds a new `contributors.roles` property that replaces `contributor.role`. Please read more about [`package.contributors`](../../specifications/data-package/#contributors) property. +[path](../../specifications/data-resource/#path-or-data-required) now explicitly forbids hidden folders (starting with dot `.`) ([#19](https://github.com/frictionlessdata/datapackage/pull/19)). -> [Pull Request -- #18](https://github.com/frictionlessdata/datapackage/pull/18) +##### `encoding` (updated) -##### Fixed `version` property in Data Package profile +[encoding](../../specifications/data-resource/#encoding)'s definition has been updated to support binary formats like Parquet ([#15](https://github.com/frictionlessdata/datapackage/pull/15)). -This change adds omitted `version` property to the Data Package profiles. +### Table Dialect -> [Pull Request -- #3](https://github.com/frictionlessdata/datapackage/pull/3) +[Table Dialect](../../specifications/table-dialect) is a new specification that supersedes and extends the CSV Dialect specification. It supports other formats like JSON or Excel ([#41](https://github.com/frictionlessdata/datapackage/pull/41)). -#### Data Resource +### Table Schema -##### Relaxed `resource.name` rules but keep it required and unique +#### Schema -This change relaxes requirements to `resource.name` allowing it to be any string. This property still needs to present and be unique among resources. 
Please read more about [`resource.name`](../../specifications/data-resource/#name-required) property. +##### `fieldsMatch` (new) -> [Pull Request -- #27](https://github.com/frictionlessdata/datapackage/pull/27) +[fieldsMatch](../../specifications/table-schema/#fieldsmatch) allows to specify how fields in a Table Schema match the fields in the data source. The default (`exact`) matches the Data Package v1 behaviour, but other values (e.g. `subset`, `superset`) allow to define fewer or more fields and match on field names. This new property extends and makes explicit the `schema_sync` option in Frictionless Framework ([#39](https://github.com/frictionlessdata/datapackage/pull/39)). -##### Clarified `resource.encoding` property +##### `primaryKey` (updated) -This change updates the `resource.encoding` property definition to properly support binary file formats like Parquet. Please read more about [`resource.encoding`](../../specifications/data-resource/#encoding) property. +[`primaryKey`](../../specifications/table-schema/#primarykey) should now always be an array of strings, not a string ([#28](https://github.com/frictionlessdata/datapackage/pull/28)). -> [Pull Request -- #15](https://github.com/frictionlessdata/datapackage/pull/15) +##### `uniqueKeys` (new) -##### Forbade hidden folders in paths +[`uniqueKeys`](../../specifications/table-schema/#uniquekeys) allows to specify which fields are required to have unique logical values. It is an alternative to `field.constraints.unique` and is modelled after the corresponding SQL feature ([#30](https://github.com/frictionlessdata/datapackage/pull/30)). -This change fixes definition in the Data Resource specification to explicitly forbid hidden folders. 
+##### `foreignKeys` (updated) -> [Pull Request -- #19](https://github.com/frictionlessdata/datapackage/pull/19) +[`foreignKeys`](../../specifications/table-schema/#foreignkeys) was updated: -#### Table Dialect +- It should now always be an array of strings, not a string ([#28](https://github.com/frictionlessdata/datapackage/pull/28)). +- `foreignKeys.reference.resource` can now be omitted for self-referencing foreign keys. Previously it required setting `resource` to an empty string ([#29](https://github.com/frictionlessdata/datapackage/pull/29)). -##### First version of the specification +#### Fields -This change adds a new specification Table Dialect that superseeds and extends the CSV Dialect specification to work with other formats like JSON or Excel. Please refer to the [Table Dialect](../../specifications/table-dialect) specification. +##### `missingValues` (new) -> [Pull Request -- #41](https://github.com/frictionlessdata/datapackage/pull/41) +[`missingValues`](../../specifications/table-schema/#missingvalues) allows to specify missing values per field, and overwrites `missingValues` specified at a resource level ([#24](https://github.com/frictionlessdata/datapackage/pull/24)). -#### Table Schema +#### Field Types -##### Added `schema.fieldsMatch` property +##### `integer` (updated) -This change clarifies the default field matching behaviour and adds new modes for matching data source and Table Schema fields. Please read more about [`schema.fieldsMatch`](../../specifications/table-schema/#fieldsmatch) property. +[`integer`](../../specifications/table-schema/#integer) now has a `groupChar` property. It was already available for `number` ([#6](https://github.com/frictionlessdata/datapackage/pull/6)). 
-> [Pull Request -- #39](https://github.com/frictionlessdata/datapackage/pull/39) +##### `list` (new) -##### Made `any` be a default field type +[`list`](../../specifications/table-schema/#list) allows to specify fields containing collections of primary values separated by a delimiter (e.g. `value1,value2`) ([#38](https://github.com/frictionlessdata/datapackage/pull/38)). -This change makes field type to be `any` by default and ensures that the field type is not inferred if not provided. Please read more about [`any`](../../specifications/table-schema/#any) type. +##### `datetime` (updated) -> [Pull Request -- #13](https://github.com/frictionlessdata/datapackage/pull/13) +[`datetime`](../../specifications/table-schema/#datetime)'s default `format` is now extended to allow optional milliseconds and timezone parts ([#23](https://github.com/frictionlessdata/datapackage/pull/23)). -##### Added `uniqueKeys` property +##### `geopoint` (updated) -This change adds `uniqueKeys` property directly modelled after corresponding SQL feature. Please read more about [`schema.uniqueKeys`](../../specifications/table-schema/#uniquekeys) property. +[`geopoint`](../../specifications/table-schema/#geopoint)'s definition now clarifies that floating point numbers can be used for coordinate definitions ([#14](https://github.com/frictionlessdata/datapackage/pull/14)). -> [Pull Request -- #30](https://github.com/frictionlessdata/datapackage/pull/30) +##### `any` (updated) -##### Added `field.missingValues` +[`any`](../../specifications/table-schema/#any) is now the default field type and clarifies that the field type should not be inferred if not provided ([#13](https://github.com/frictionlessdata/datapackage/pull/13)). -This change adds a property that allows to specify missing values individually per field. Please read more about [`field.missingValues`](../../specifications/table-schema/#missingvalues) property. 
+#### Field Constraints -> [Pull Request -- #24](https://github.com/frictionlessdata/datapackage/pull/24) +##### `minimum` and `maximum` (updated) -##### Added `list` field type +[`minimum`](../../specifications/table-schema/#minimum) and [`maximum`](../../specifications/table-schema/#maximum) are now extended to support the `duration` field type ([#8](https://github.com/frictionlessdata/datapackage/pull/8)). -This change adds a new field type `list` for typed collections, lexically delimiter-based. Please read more about [`list`](../../specifications/table-schema/#list) type. +##### `exclusiveMinimum` and `exclusiveMaximum` (new) -> [Pull Request -- #38](https://github.com/frictionlessdata/datapackage/pull/38) +[`exclusiveMinimum`](../../specifications/table-schema/#exclusiveminimum) and [`exclusiveMaximum`](../../specifications/table-schema/#exclusivemaximum) can be used to specify exclusive minimum and maximum values ([#11](https://github.com/frictionlessdata/datapackage/pull/11)). -##### Added `jsonSchema` constraint to object and array fields +##### `jsonschema` (new) -This change adds a new constraint for the `object` and `array` fields. Please read more about [`constraints.jsonSchema`](../../specifications/table-schema/#jsonschema) constraint. - -> [Pull Request -- #32](https://github.com/frictionlessdata/datapackage/pull/32) - -##### Support `groupChar` for integer field type - -This change adds support for providing integers with group chars. Please read more about [`field.groupChar`](../../specifications/table-schema/#integer) property. - -> [Pull Request -- #6](https://github.com/frictionlessdata/datapackage/pull/6) - -##### Extended `datetime` default format - -This change extends `default` format definition for the `datetime` field type allowing to provide optional milliseconds and timezone parts. 
- -> [Pull Request -- #23](https://github.com/frictionlessdata/datapackage/pull/23) - -##### Supported exclusive constraints - -This change adds new `exclusiveMinimum` and `exclusiveMaximum` constraints to the Table Schema specification. - -> [Pull Request -- #11](https://github.com/frictionlessdata/datapackage/pull/11) - -##### Simplified self-referencing in foreign keys - -This change allows omitting `foreignKey.resource.reference` in case of self-referencing. Previously it required setting resource to an empty string. - -> [Pull Request -- #29](https://github.com/frictionlessdata/datapackage/pull/29) - -##### Discouraged usage of unnecessary union types - -This change discourages usage of mixed types for `schema.primaryKeys` and `schema.foreignKeys.fields` properties. - -> [Pull Request -- #28](https://github.com/frictionlessdata/datapackage/pull/28) - -##### Clarified that `geopoint` is number-based - -This changes clarifies that `geopoint` field type can use floating point numbers for coordinate definitions. - -> [Pull Request -- #14](https://github.com/frictionlessdata/datapackage/pull/14) - -##### Fixed duration constraint - -This change fixes `minimum` and `maximum` constraint for the `duration` field type. - -> [Pull Request -- #8](https://github.com/frictionlessdata/datapackage/pull/8) +[`jsonSchema`](../../specifications/table-schema/#jsonschema) can be used for the `object` and `array` field types ([#32](https://github.com/frictionlessdata/datapackage/pull/32)). ## v1.0 > September 5, 2017 -Please refer to the the [Data Package (v1) website](https://specs.frictionlessdata.io/). +Please refer to the [Data Package (v1) website](https://specs.frictionlessdata.io/).