This repository has been archived by the owner on Oct 28, 2024. It is now read-only.

Make default field type to be any #13

Merged
merged 12 commits into from
Mar 28, 2024
53 changes: 46 additions & 7 deletions content/docs/specifications/table-schema.md
@@ -188,15 +188,14 @@ A data consumer `MUST`:

### Types and Formats

`type` and `format` properties are used to give The type of the field (string, number etc) - see below for
more detail. If type is not provided a consumer `SHOULD` assume a type of "string".
`type` and `format` properties are used to give the type of the field (string, number, etc.) - see below for more detail. If type is not provided a consumer `MUST` utilize the `any` type for the field instead of inferring it from the field's values.

A field's `type` property is a string indicating the type of this field.

A field's `format` property is a string, indicating a format for the field type.

Both `type` and `format` are optional: in a field descriptor, the absence of a
`type` property indicates that the field is of the type "string", and the
`type` property indicates that the field is of the type "any", and the
absence of a `format` property indicates that the field's type `format` is
"default".
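
The defaulting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `with_defaults` is a hypothetical helper name, not part of any Table Schema library, and a field descriptor is assumed to be a plain `dict`.

```python
def with_defaults(field):
    """Return a copy of a field descriptor with the default type/format filled in.

    Hypothetical helper: keys already present in `field` win over the defaults.
    """
    return {"type": "any", "format": "default", **field}


# A descriptor without `type` falls back to "any" rather than an inferred type.
print(with_defaults({"name": "id"}))
# An explicit `type` is left untouched; only the missing `format` is filled in.
print(with_defaults({"name": "born", "type": "date"}))
```

The input descriptor is not mutated, so the original document can still be serialized exactly as it was written.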

@@ -391,11 +390,48 @@ The field contains a JSON object according to GeoJSON or TopoJSON spec.

#### any

Any `type` or `format` is accepted. When converting from physical to logical representation, the behaviour `SHOULD` be similar to String field type.
The field contains values of an unspecified or mixed type. A data consumer `MUST NOT` perform any processing on this field's values and `MUST` interpret them as they are in the data source. This data type is directly modelled on the concept of the `any` type of strongly typed object-oriented languages like [TypeScript](https://www.typescriptlang.org/docs/handbook/2/everyday-types.html#any).

[strptime]: https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior
[iso8601-duration]: https://en.wikipedia.org/wiki/ISO_8601#Durations
[xsd-duration]: http://www.w3.org/TR/xmlschema-2/#duration
For example, having a Table Schema below:

```json
{
"fields": [
{ "name": "id", "type": "any" },
{ "name": "name", "type": "any" }
]
}
```

This CSV data file will have logical values as below:

```csv
id,name
1,apple
2,orange
```

```javascript
{id: "1", name: "apple"}
{id: "2", name: "orange"}
```

While this JSON data file will have logical values as below:

```json
[
["id", "name"],
[1, "apple"],
[2, "orange"]
]
```

```javascript
{id: 1, name: "apple"}
{id: 2, name: "orange"}
```

Note that for the CSV data source the `id` field is interpreted as a string because CSV supports only one data type, i.e. string, whereas for the JSON data source the `id` field is interpreted as an integer because JSON supports a numeric data type and the value was declared as an integer. Also, for the Table Schema above, the `type` property of each field can be omitted because `any` is the default field type.
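
The difference between the two data sources can be reproduced with Python's standard library. This is a minimal sketch rather than a reference implementation: the function names `read_csv_rows` and `read_json_rows` are hypothetical, and the inline strings stand in for the CSV and JSON files shown above.

```python
import csv
import io
import json

# Inline stand-ins for the two data files from the example above.
CSV_SOURCE = "id,name\n1,apple\n2,orange\n"
JSON_SOURCE = '[["id", "name"], [1, "apple"], [2, "orange"]]'


def read_csv_rows(text):
    """Return rows as dicts; the csv module yields every cell as a str."""
    return list(csv.DictReader(io.StringIO(text)))


def read_json_rows(text):
    """Return rows as dicts, keeping the native JSON types of each value."""
    header, *rows = json.loads(text)
    return [dict(zip(header, row)) for row in rows]


csv_rows = read_csv_rows(CSV_SOURCE)
json_rows = read_json_rows(JSON_SOURCE)

# With `any` fields, no casting is applied to either result.
print(type(csv_rows[0]["id"]).__name__)   # str
print(type(json_rows[0]["id"]).__name__)  # int
```

Neither reader converts values after parsing, which mirrors the requirement that a consumer leave `any` values as they appear in the data source.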

### Rich Types

@@ -817,3 +853,6 @@ Table Schema draws content and/or inspiration from, among others, the following
[dspl]: https://developers.google.com/public-data/docs/schema/dspl18
[html5 forms]: http://www.whatwg.org/specs/web-apps/current-work/#attr-input-typ
[elasticsearch]: http://www.elasticsearch.org/guide/reference/mapping/
[strptime]: https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior
[iso8601-duration]: https://en.wikipedia.org/wiki/ISO_8601#Durations
[xsd-duration]: http://www.w3.org/TR/xmlschema-2/#duration