feat: rename bbox parameters #22

Closed
wants to merge 5 commits into from
109 changes: 67 additions & 42 deletions README.md
@@ -9,7 +9,7 @@ Generate RDF out of a GeoPackage (for further processing)
Install using NPM locally `npm install --global @rdmr-eu/rdf-geopackage` as a command line tool.

Check if it's installed correctly with `rdf-geopackage --help`.
That should return the following help info.
That should return the following help info:

```man
Generate RDF from an OGC GeoPackage with rdf-geopackage
@@ -19,76 +19,101 @@ Options:
--version Show version number [boolean]
-i, --input GeoPackage file [string]
-o, --output Output quads file [string]
--bounding-box Limit features to bounding box [string]
--bounding-box-crs Coordinate Reference System code [string]
--only-layers Only output named feature layers and attribute ta
bles [array]
--base-iri Base IRI [string]
--format Override output format (default: nquads)
[choices: "nq", "nquads", "trig", "nt", "ntriples", "ttl", "turtle"]
--bbox Limit features to bounding box [string]
--bbox-crs Coordinate Reference System code [string]
--only-layers Only output named feature layers and attribute ta
bles [array]
--include-binary-values Output binary values [boolean]
--base-iri Base IRI [string]
--model Data meta model [choices: "facade-x"]
```

## Options

Limit **large GeoPackages** with `--bounding-box`.
Supply a space separated list of coordinates as a string to limit the Features returned.
Provide the bounding box as WGS84 (GeoJSON default) or supply a CRS code (lookup via EPSG.io) or Projection WKT with `--bounding-box-crs`.
Basic input and output serializations can be set with the following options:

- `--input`: the path to the input GeoPackage file (required). With `-`, it reads the GeoPackage from stdin, e.g., piping a file with curl
- `--output`: path to the file output. By default, `rdf-geopackage` outputs _nquads_ to stdout. Its extension sets the serialization format, optionally with `.gz` to GZip the output. E.g., `--output myfile.ttl.gz`
- `--format`: set the output format explicitly. Provide a file extension with `.gz` to GZip the output.

You can also **limit** which feature **layers** (or attribute tables) are output with `--only-layers`.
**NULL values** are never output and **binary values** are skipped, unless `--include-binary-values` is provided.
Binary values are Base64-encoded string values with a `xsd:base64Binary` datatype.
Work with large GeoPackages by limiting the output features, output tables and binary values:

By default, **output** is directed to stdout as N-Quads. Provide `--output` to save the triples or quads to a file.
The **serialization format** is recognized from the file extension but can be overridden with `--format`.
Add `.gz` after the extension (e.g. `mydata.ttls.gz`) to **GZip** the output.
- `--bounding-box` limits the output features to those in this area (default CRS: WGS84)
- `--bounding-box-crs` indicates the CRS for the aforementioned bounding box. Supply an EPSG code (web lookup with EPSG.io) or a projection WKT.
- `--only-layers` limits which feature layers (or attribute tables!) are output.
- `--include-binary-values` overrides the default of skipping binary values. These will be base64 encoded string values with a `^^xsd:base64Binary` data type. NULL values are never output.

Provide the path to the **input file** with `--input`.
You may also pipe in a file to rdf-geopackage.
Modify the model and types of the output triples or quads:

The generated quads follow a **data meta-model**, supplied by `--model` and by default `facade-x` with GeoSPARQL.
Override the **base IRI** with `--base-iri` to let subject-URLs not be derived from the present working directory.
- `--base-iri`: set the relative base for the output RDF data. By default, this value is derived from the present working directory.
- `--model`: the GeoPackage tables are not natively RDF data, so a module is programmed to generate triples according to a data meta-model. Included modules:
- default: [`facade-x`](#model-facade-x)

## Model: Facade-X
## RDF output

Facade-X is a data meta-model from the SPARQL-Anything project, that can represent tabular data easily.
The built-in data meta-model `facade-x` extends the tabular representation with [GeoSPARQL][geosparql] for geographical information from feature tables.
#### Model: Facade-X

Facade-X is a data meta-model from the SPARQL-Anything project, that can easily represent tabular data.
Facade-X uses RDF containers and blank nodes to represent tables and rows.
Features are `geo:Feature`s with a `geo:hasDefaultGeometry` that refers to a `geo:Geometry`.
That Geometry in turn has a `geo:asGeoJSON` and `geo:asWKT` representations of their geometry in WGS84 (GeoJSON-default).
Column metadata is currently very limited ([GH-24]) and many values are not typed properly.

[GH-24]: https://github.com/redmer/rdf-geopackage/issues/24

#### Features, geometries and CRS’s

Features and their geometries are represented using [GeoSPARQL][geosparql].
Only rows from feature tables are a `geo:Feature`.

A feature has zero or more geometries predicated with `geo:hasDefaultGeometry`.
There might be no geometry if the underlying library does not support the geometry type.
There may be multiple geometries if the feature is from a layer not in EPSG:4326.

That's because a GeoJSON serialization (`geo:asGeoJSON`) is always (reprojected) in EPSG:4326.
A `geo:Geometry` can be in only one CRS, meaning that when the feature is not originally in EPSG:4326, other serializations should also be reprojected.
That is undesirable, so in these cases, `rdf-geopackage` generates a second `geo:Geometry` for the WKT serialization (`geo:asWKT`).
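
The geometry rule described above can be sketched as a small decision function. This is an illustration only: `planGeometries`, `GeometryNode` and the `geometrySupported` flag are hypothetical names, not part of `rdf-geopackage`, and the sketch only decides how many `geo:Geometry` nodes a feature gets and which serialization each one carries.

```typescript
// Hypothetical sketch, not the tool's actual API.
const WGS84 = "EPSG:4326";

interface GeometryNode {
  crs: string; // CRS that every serialization on this node uses
  serializations: ("geo:asGeoJSON" | "geo:asWKT")[];
}

function planGeometries(
  layerCrs: string,
  geometrySupported: boolean = true,
): GeometryNode[] {
  // Unsupported geometry type: the feature gets no geometry at all.
  if (!geometrySupported) return [];

  // Layer already in EPSG:4326: one node can carry both serializations.
  if (layerCrs === WGS84)
    return [{ crs: WGS84, serializations: ["geo:asWKT", "geo:asGeoJSON"] }];

  // Otherwise GeoJSON is reprojected to EPSG:4326, and the WKT gets
  // its own node so that each node stays in a single CRS.
  return [
    { crs: WGS84, serializations: ["geo:asGeoJSON"] },
    { crs: layerCrs, serializations: ["geo:asWKT"] },
  ];
}
```

For example, `planGeometries("EPSG:28992")` yields two nodes, while `planGeometries("EPSG:4326")` yields one.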

[geosparql]: https://www.ogc.org/standard/geosparql/

#### Example RDF output

Column metadata is very limited and most values are not typed properly.
Example data abridged [from NGA][example.gpkg]:
the table `media`is a feature table, `nga_properties` is an attribute table.
the table `media` is a feature table, `nga_properties` is an attribute table.

[example.gpkg]: https://github.com/ngageoint/GeoPackage/blob/master/docs/examples/java/example.gpkg

```turtle
prefix fx: <http://sparql.xyz/facade-x/ns/>
prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix xyz: <http://sparql.xyz/facade-x/data/>

xyz:nga_properties { # representing a table
xyz:nga_properties a fx:root ; # representing a table
rdf:_1 [ # the first row
xyz:id 14;
xyz:property "subject";
xyz:value "Examples"
] .
}

```trig
xyz:media {
xyz:media a fx:root ;
rdf:_1 [
a geo:Feature ;
a geo:Feature ; # a row from a feature table
xyz:text "BIT Systems";
xyz:date "2023-01-23";
geo:hasDefaultGeometry [
geo:hasDefaultGeometry [ # single geometry as CRS is EPSG:4326
a geo:Geometry ;
geo:asWKT "POINT (-104.801918 39.720014)"^^geo:wktLiteral
geo:asWKT "POINT (-104.801918 39.720014)"^^geo:wktLiteral ;
geo:asGeoJSON "{\"coordinates\":[-104.801918,39.720014],\"type\":\"Point\"}"^^geo:geoJSONLiteral
]
] .
}

xyz:nga_properties {
xyz:nga_properties a fx:root ;
rdf:_1 [
xyz:id 14;
xyz:property "subject";
xyz:value "Examples"
] .
}
```

[geosparql]: https://www.ogc.org/standard/geosparql/
[example.gpkg]: https://github.com/ngageoint/GeoPackage/blob/master/docs/examples/java/example.gpkg

# Acknowledgements

This tool was developed for a project funded by the [_City Deal Openbare ruimte_][cdor],
38 changes: 27 additions & 11 deletions package-lock.json


5 changes: 3 additions & 2 deletions package.json
@@ -1,6 +1,6 @@
{
"name": "@rdmr-eu/rdf-geopackage",
"version": "1.2.1",
"version": "1.3.0",
"description": "Generate RDF out of a GeoPackage (for further processing)",
"repository": "https://github.com/redmer/rdf-geopackage.git",
"main": "dist/rdf-geopackage.js",
@@ -69,14 +69,15 @@
"dependencies": {
"@ngageoint/geopackage": "^4.2.4",
"better-sqlite3": "^8.7.0",
"geojson": "^0.5.0",
"json-stable-stringify": "^1.0.2",
"n3": "^1.17.1",
"node-fetch": "^3.3.2",
"proj4": "^2.9.0",
"rdf-data-factory": "^1.1.2",
"rdf-literal": "^1.3.1",
"reproject": "^1.2.7",
"supports-color": "^9.4.0",
"wkx": "^0.5.0",
"yargs": "^17.7.2"
},
"jest": {
61 changes: 51 additions & 10 deletions src/bounding-box.ts
@@ -42,20 +42,61 @@ export async function getWGS84Converter(
}
}

export function suppliedBoundingBox(
function spaceSepBbox(
bbstring: string,
srs: proj4.Converter | string,
): BoundingBox {
const [west, east, south, north] = bbstring
.split(" ", 4)
.map((c) => Number(c));
const bb = new BoundingBox(west, east, south, north);
return bb.projectBoundingBox(srs, WGS84_CODE);
}

function commaSepBbox(
bbstring: string,
inCRS: proj4.Converter | string,
) {
srs: proj4.Converter | string,
): BoundingBox {
const parts = bbstring.split(",");
let west: string,
east: string,
__1: string,
south: string,
north: string,
__2: string;
if (parts.length == 4) [west, east, south, north] = parts;
else [west, east, __1, south, north, __2] = parts;

const bb = new BoundingBox(
Number(west),
Number(east),
Number(south),
Number(north),
);
return bb.projectBoundingBox(srs, WGS84_CODE);
}

/**
* Convert a supplied bbox definition string to a {BoundingBox}.
*
* There are two types of bbox definition strings:
* 1. Four parts, space separated (deprecated)
* 2. Four or six parts, comma separated. (3rd axis ignored)
*
* @param bboxString Bounding box provided string
* @param srs The SRS in which to interpret this bboxstring
*/
export function suppliedBoundingBox(
bboxString: string,
srs: proj4.Converter | string,
): BoundingBox {
try {
const [west, east, south, north] = bbstring
.split(" ", 4)
.map((c) => Number(c));
const bb = new BoundingBox(west, east, south, north);
return bb.projectBoundingBox(inCRS, WGS84_CODE);
if (bboxString.includes(" ")) return spaceSepBbox(bboxString, srs);
return commaSepBbox(bboxString, srs);
} catch (e) {
Bye(
`Bounding box could not be parsed. Provide as a single space-separated string:`,
`"{min long (west)} {max long (east)} {min lat (south)} {max lat (north)}".`,
`Bounding box could not be parsed. Provide a single comma-separated string:`,
`"{min long (west)},{max long (east)},{min lat (south)},{max lat (north)}".`,
);
}
}
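
A note on the parsing above: `Number()` returns `NaN` for non-numeric input instead of throwing, so the `try`/`catch` will not by itself catch a malformed part. The following is a minimal stand-alone sketch of the same two formats with explicit validation. `parseBboxString` and the `Bbox` type are hypothetical and not part of this PR; the sketch skips the reprojection step and keeps the PR's axis order (with six parts, the 3rd and 6th are ignored).

```typescript
// Hypothetical helper, not part of the PR: parse and validate only.
type Bbox = { west: number; east: number; south: number; north: number };

function parseBboxString(bboxString: string): Bbox {
  // A string containing spaces is treated as the deprecated space-separated
  // form; anything else is split on commas.
  const parts = bboxString.includes(" ")
    ? bboxString.trim().split(/\s+/)
    : bboxString.split(",");

  if (parts.length !== 4 && parts.length !== 6)
    throw new Error(`Expected 4 or 6 bbox parts, got ${parts.length}`);

  // With six parts, drop the 3rd and 6th (the ignored third axis).
  const picked =
    parts.length === 6 ? parts.filter((_, i) => i !== 2 && i !== 5) : parts;

  // Number("") is 0, so treat empty parts as invalid explicitly.
  const nums = picked.map((p) =>
    p.trim() === "" ? NaN : Number(p),
  ) as [number, number, number, number];

  if (nums.some((n) => Number.isNaN(n)))
    throw new Error(`Bounding box has a non-numeric part: "${bboxString}"`);

  const [west, east, south, north] = nums;
  return { west, east, south, north };
}
```

Throwing on bad input means a surrounding `try`/`catch` such as the one in `suppliedBoundingBox` would actually reach its error message.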
15 changes: 15 additions & 0 deletions src/cli-error.ts
@@ -26,3 +26,18 @@ export function Warn(message: string, ...optionalParams: any[]): void {
);
else console.warn(`# Warning: ${message}`, ...optionalParams);
}

let WARNINGS: Record<string, number> = {};

/** Collect warnings and output with call counts with OutputWarnCounts() */
export function CountWarn(message: string): void {
const value = WARNINGS[message];
WARNINGS[message] = value === undefined ? 1 : value + 1;
}

/** Output collected warnings (CountWarn) with call counts */
export function OutputWarnCounts(): void {
for (const [message, count] of Object.entries(WARNINGS))
Warn(`${message}: ${count}`);
WARNINGS = {};
}
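
To show how the collect-then-flush pattern above behaves, here is a stand-alone sketch. It is not the PR's code: `countWarn` and `flushWarnCounts` are hypothetical names, a `Map` stands in for the `Record`, the lines are returned rather than passed to `Warn()` so the snippet runs in isolation, and the warning messages are invented for illustration.

```typescript
// Hypothetical stand-alone version of the counting pattern.
const warningCounts = new Map<string, number>();

/** Record one occurrence of a warning without printing it. */
function countWarn(message: string): void {
  warningCounts.set(message, (warningCounts.get(message) ?? 0) + 1);
}

/** Return one "message: count" line per distinct warning, then reset. */
function flushWarnCounts(): string[] {
  const lines = Array.from(
    warningCounts,
    ([message, count]) => `${message}: ${count}`,
  );
  warningCounts.clear();
  return lines;
}

// Usage: repeated warnings collapse into a single line with a count.
for (let i = 0; i < 3; i++) countWarn("Unsupported geometry type");
countWarn("Binary value skipped");
const summary = flushWarnCounts();
// summary is ["Unsupported geometry type: 3", "Binary value skipped: 1"]
```

Resetting the store after output, as the PR does with `WARNINGS = {}`, keeps a second flush from repeating earlier counts.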