Skip to content

Commit

Permalink
Merge pull request #529 from mapbox/files-left-open
Browse files Browse the repository at this point in the history
Close input files that were being left open after parallel reading
  • Loading branch information
e-n-f authored Feb 12, 2018
2 parents e2a3492 + 2d625d5 commit 03577cf
Show file tree
Hide file tree
Showing 15 changed files with 12,623 additions and 33 deletions.
8 changes: 8 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,11 @@
## 1.27.7

* Add an option to produce only a single tile
* Retain non-ASCII characters in layernames generated from filenames
* Remember to close input files after reading them
* Add --coalesce-fraction-as-needed and --coalesce-densest-as-needed
* Report distances in both feet and meters

## 1.27.6

* Fix opportunities for integer overflow and out-of-bounds references
Expand Down
4 changes: 2 additions & 2 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -87,7 +87,7 @@ test: tippecanoe tippecanoe-decode $(addsuffix .check,$(TESTS)) raw-tiles-test p

# Work around Makefile and filename punctuation limits: _ for space, @ for :, % for /
%.json.check:
./tippecanoe -aD -f -o $@.mbtiles $(subst @,:,$(subst %,/,$(subst _, ,$(patsubst %.json.check,%,$(word 4,$(subst /, ,$@)))))) $(wildcard $(subst $(SPACE),/,$(wordlist 1,2,$(subst /, ,$@)))/*.json) < /dev/null
./tippecanoe -a@ -f -o $@.mbtiles $(subst @,:,$(subst %,/,$(subst _, ,$(patsubst %.json.check,%,$(word 4,$(subst /, ,$@)))))) $(wildcard $(subst $(SPACE),/,$(wordlist 1,2,$(subst /, ,$@)))/*.json) < /dev/null
./tippecanoe-decode $@.mbtiles > $@.out
cmp $@.out $(patsubst %.check,%,$@)
rm $@.out $@.mbtiles
Expand All @@ -101,7 +101,7 @@ fewer-tests: tippecanoe tippecanoe-decode geobuf-test raw-tiles-test parallel-te
# XXX Use proper makefile rules instead of a for loop
%.json.checkbuf:
for i in $(wildcard $(subst $(SPACE),/,$(wordlist 1,2,$(subst /, ,$@)))/*.json); do ./tippecanoe-json-tool -w $$i | ./node_modules/geobuf/bin/json2geobuf > $$i.geobuf; done
./tippecanoe -aD -f -o $@.mbtiles $(subst @,:,$(subst %,/,$(subst _, ,$(patsubst %.json.checkbuf,%,$(word 4,$(subst /, ,$@)))))) $(addsuffix .geobuf,$(wildcard $(subst $(SPACE),/,$(wordlist 1,2,$(subst /, ,$@)))/*.json)) < /dev/null
./tippecanoe -a@ -f -o $@.mbtiles $(subst @,:,$(subst %,/,$(subst _, ,$(patsubst %.json.checkbuf,%,$(word 4,$(subst /, ,$@)))))) $(addsuffix .geobuf,$(wildcard $(subst $(SPACE),/,$(wordlist 1,2,$(subst /, ,$@)))/*.json)) < /dev/null
./tippecanoe-decode $@.mbtiles | sed 's/checkbuf/check/g' > $@.out
cmp $@.out $(patsubst %.checkbuf,%,$@)
rm $@.out $@.mbtiles
Expand Down
4 changes: 4 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -162,6 +162,8 @@ Parallel processing will also be automatic if the input file is in Geobuf format
* `-ae` or `--extend-zooms-if-still-dropping`: Increase the maxzoom if features are still being dropped at that zoom level.
The detail and simplification options that ordinarily apply only to the maximum zoom level will apply both to the originally
specified maximum zoom and to any levels added beyond that.
* `-R` _zoom_`/`_x_`/`_y_ or `--one-tile=`_zoom_`/`_x_`/`_y_: Set the minzoom and maxzoom to _zoom_ and produce only
the single specified tile at that zoom level.

### Tile resolution

Expand Down Expand Up @@ -212,6 +214,8 @@ tippecanoe -z5 -o filtered.mbtiles -j '{ "ne_10m_admin_0_countries": [ "all", [
* `-ad` or `--drop-fraction-as-needed`: Dynamically drop some fraction of features from each zoom level to keep large tiles under the 500K size limit. (This is like `-pd` but applies to the entire zoom level, not to each tile.)
* `-an` or `--drop-smallest-as-needed`: Dynamically drop the smallest features (physically smallest: the shortest lines or the smallest polygons) from each zoom level to keep large tiles under the 500K size limit. This option will not work for point features.
* `-aN` or `--coalesce-smallest-as-needed`: Dynamically combine the smallest features (physically smallest: the shortest lines or the smallest polygons) from each zoom level into other nearby features to keep large tiles under the 500K size limit. This option will not work for point features, and will probably not help very much with LineStrings. It is mostly intended for polygons, to maintain the full original area covered by polygons while still reducing the feature count somehow. The attributes of the small polygons are *not* preserved into the combined features, only their geometry.
* `-aD` or `--coalesce-densest-as-needed`: Dynamically combine the densest features from each zoom level into other nearby features to keep large tiles under the 500K size limit. (Again, mostly useful for polygons.)
* `-aS` or `--coalesce-fraction-as-needed`: Dynamically combine a fraction of features from each zoom level into other nearby features to keep large tiles under the 500K size limit. (Again, mostly useful for polygons.)
* `-pd` or `--force-feature-limit`: Dynamically drop some fraction of features from large tiles to keep them under the 500K size limit. It will probably look ugly at the tile boundaries. (This is like `-ad` but applies to each tile individually, not to the entire zoom level.) You probably don't want to use this.
* `-aC` or `--cluster-densest-as-needed`: If a tile is too large, try to reduce its size by increasing the minimum spacing between features, and leaving one placeholder feature from each group. The remaining feature will be given a `"cluster": true` attribute to indicate that it represents a cluster and a `"point_count"` attribute to indicate the number of features that were clustered into it.

Expand Down
76 changes: 70 additions & 6 deletions main.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,7 @@
#include "mvt.hpp"
#include "dirtiles.hpp"
#include "evaluator.hpp"
#include "text.hpp"

static int low_detail = 12;
static int full_detail = -1;
Expand All @@ -70,6 +71,7 @@ double simplification = 1;
size_t max_tile_size = 500000;
size_t max_tile_features = 200000;
int cluster_distance = 0;
long justx = -1, justy = -1;

int prevent[256];
int additional[256];
Expand Down Expand Up @@ -1169,16 +1171,20 @@ int read_input(std::vector<source> &sources, char *fname, int maxzoom, int minzo
}

// Trim out characters that can't be part of selector
std::string out;
std::string out;
for (size_t p = 0; p < trunc.size(); p++) {
if (isalpha(trunc[p]) || isdigit(trunc[p]) || trunc[p] == '_') {
if (isalpha(trunc[p]) || isdigit(trunc[p]) || trunc[p] == '_' || (trunc[p] & 0x80) != 0) {
out.append(trunc, p, 1);
}
}

sources[l].layer = out;
if (sources[l].layer.size() == 0 || check_utf8(out).size() != 0) {
sources[l].layer = "unknown" + std::to_string(l);
}

if (!quiet) {
fprintf(stderr, "For layer %d, using name \"%s\"\n", (int) l, out.c_str());
fprintf(stderr, "For layer %d, using name \"%s\"\n", (int) l, sources[l].layer.c_str());
}
}
}
Expand All @@ -1198,6 +1204,16 @@ int read_input(std::vector<source> &sources, char *fname, int maxzoom, int minzo
double dist_sum = 0;
size_t dist_count = 0;

int files_open_before_reading = open("/dev/null", O_RDONLY | O_CLOEXEC);
if (files_open_before_reading < 0) {
perror("open /dev/null");
exit(EXIT_FAILURE);
}
if (close(files_open_before_reading) != 0) {
perror("close");
exit(EXIT_FAILURE);
}

size_t nsources = sources.size();
for (size_t source = 0; source < nsources; source++) {
std::string reading;
Expand Down Expand Up @@ -1333,6 +1349,11 @@ int read_input(std::vector<source> &sources, char *fname, int maxzoom, int minzo

parse_geocsv(sst, sources[source].file, layer, sources[layer].layer);

if (close(fd) != 0) {
perror("close");
exit(EXIT_FAILURE);
}

overall_offset = layer_seq[0];
checkdisk(&readers);
continue;
Expand Down Expand Up @@ -1383,6 +1404,11 @@ int read_input(std::vector<source> &sources, char *fname, int maxzoom, int minzo
perror("munmap source file");
exit(EXIT_FAILURE);
}

if (close(fd) != 0) {
perror("close input file");
exit(EXIT_FAILURE);
}
} else {
FILE *fp = fdopen(fd, "r");
if (fp == NULL) {
Expand Down Expand Up @@ -1543,6 +1569,22 @@ int read_input(std::vector<source> &sources, char *fname, int maxzoom, int minzo
}
}

int files_open_after_reading = open("/dev/null", O_RDONLY | O_CLOEXEC);
if (files_open_after_reading < 0) {
perror("open /dev/null");
exit(EXIT_FAILURE);
}
if (close(files_open_after_reading) != 0) {
perror("close");
exit(EXIT_FAILURE);
}

if (files_open_after_reading > files_open_before_reading) {
fprintf(stderr, "Internal error: Files left open after reading input. (%d vs %d)\n",
files_open_before_reading, files_open_after_reading);
ret = EXIT_FAILURE;
}

if (!quiet) {
fprintf(stderr, " \r");
// (stderr, "Read 10000.00 million features\r", *progress_seq / 1000000.0);
Expand Down Expand Up @@ -1718,6 +1760,12 @@ int read_input(std::vector<source> &sources, char *fname, int maxzoom, int minzo
unsigned iz = 0, ix = 0, iy = 0;
choose_first_zoom(file_bbox, readers, &iz, &ix, &iy, minzoom, buffer);

if (justx >= 0) {
iz = minzoom;
ix = justx;
iy = justy;
}

long long geompos = 0;

/* initial tile is 0/0/0 */
Expand Down Expand Up @@ -1817,7 +1865,7 @@ int read_input(std::vector<source> &sources, char *fname, int maxzoom, int minzo
}

if (!quiet) {
fprintf(stderr, "Choosing a maxzoom of -z%d for features about %d feet apart\n", maxzoom, (int) ceil(dist_ft));
fprintf(stderr, "Choosing a maxzoom of -z%d for features about %d feet (%d meters) apart\n", maxzoom, (int) ceil(dist_ft), (int) ceil(dist_ft / 3.28084));
}
}

Expand All @@ -1834,7 +1882,7 @@ int read_input(std::vector<source> &sources, char *fname, int maxzoom, int minzo

if (mz > maxzoom || count <= 0) {
if (!quiet) {
fprintf(stderr, "Choosing a maxzoom of -z%d for resolution of about %d feet within features\n", mz, (int) exp(dist_sum / dist_count));
fprintf(stderr, "Choosing a maxzoom of -z%d for resolution of about %d feet (%d meters) within features\n", mz, (int) exp(dist_sum / dist_count), (int) (exp(dist_sum / dist_count) / 3.28084));
}
maxzoom = mz;
}
Expand Down Expand Up @@ -2253,6 +2301,7 @@ int main(int argc, char **argv) {
{"maximum-zoom", required_argument, 0, 'z'},
{"minimum-zoom", required_argument, 0, 'Z'},
{"extend-zooms-if-still-dropping", no_argument, &additional[A_EXTEND_ZOOMS], 1},
{"one-tile", required_argument, 0, 'R'},

{"Tile resolution", 0, 0, 0},
{"full-detail", required_argument, 0, 'd'},
Expand All @@ -2274,10 +2323,12 @@ int main(int argc, char **argv) {
{"drop-polygons", no_argument, &additional[A_POLYGON_DROP], 1},
{"cluster-distance", required_argument, 0, 'K'},

{"Dropping a fraction of features to keep under tile size limits", 0, 0, 0},
{"Dropping or merging a fraction of features to keep under tile size limits", 0, 0, 0},
{"drop-densest-as-needed", no_argument, &additional[A_DROP_DENSEST_AS_NEEDED], 1},
{"drop-fraction-as-needed", no_argument, &additional[A_DROP_FRACTION_AS_NEEDED], 1},
{"drop-smallest-as-needed", no_argument, &additional[A_DROP_SMALLEST_AS_NEEDED], 1},
{"coalesce-densest-as-needed", no_argument, &additional[A_COALESCE_DENSEST_AS_NEEDED], 1},
{"coalesce-fraction-as-needed", no_argument, &additional[A_COALESCE_FRACTION_AS_NEEDED], 1},
{"coalesce-smallest-as-needed", no_argument, &additional[A_COALESCE_SMALLEST_AS_NEEDED], 1},
{"force-feature-limit", no_argument, &prevent[P_DYNAMIC_DROP], 1},
{"cluster-densest-as-needed", no_argument, &additional[A_CLUSTER_DENSEST_AS_NEEDED], 1},
Expand Down Expand Up @@ -2428,6 +2479,19 @@ int main(int argc, char **argv) {
minzoom = atoi(optarg);
break;

case 'R': {
unsigned z, x, y;
if (sscanf(optarg, "%u/%u/%u", &z, &x, &y) == 3) {
minzoom = z;
maxzoom = z;
justx = x;
justy = y;
} else {
fprintf(stderr, "--one-tile argument must be z/x/y\n");
exit(EXIT_FAILURE);
}
}

case 'B':
if (strcmp(optarg, "g") == 0) {
basezoom = -2;
Expand Down
7 changes: 7 additions & 0 deletions man/tippecanoe.1
Original file line number Diff line number Diff line change
Expand Up @@ -176,6 +176,9 @@ Parallel processing will also be automatic if the input file is in Geobuf format
\fB\fC\-ae\fR or \fB\fC\-\-extend\-zooms\-if\-still\-dropping\fR: Increase the maxzoom if features are still being dropped at that zoom level.
The detail and simplification options that ordinarily apply only to the maximum zoom level will apply both to the originally
specified maximum zoom and to any levels added beyond that.
.IP \(bu 2
\fB\fC\-R\fR \fIzoom\fP\fB\fC/\fR\fIx\fP\fB\fC/\fR\fIy\fP or \fB\fC\-\-one\-tile=\fR\fIzoom\fP\fB\fC/\fR\fIx\fP\fB\fC/\fR\fIy\fP: Set the minzoom and maxzoom to \fIzoom\fP and produce only
the single specified tile at that zoom level.
.RE
.SS Tile resolution
.RS
Expand Down Expand Up @@ -247,6 +250,10 @@ compensate for the larger marker, or \fB\fC\-Bf\fR\fInumber\fP to allow at most
.IP \(bu 2
\fB\fC\-aN\fR or \fB\fC\-\-coalesce\-smallest\-as\-needed\fR: Dynamically combine the smallest features (physically smallest: the shortest lines or the smallest polygons) from each zoom level into other nearby features to keep large tiles under the 500K size limit. This option will not work for point features, and will probably not help very much with LineStrings. It is mostly intended for polygons, to maintain the full original area covered by polygons while still reducing the feature count somehow. The attributes of the small polygons are \fInot\fP preserved into the combined features, only their geometry.
.IP \(bu 2
\fB\fC\-aD\fR or \fB\fC\-\-coalesce\-densest\-as\-needed\fR: Dynamically combine the densest features from each zoom level into other nearby features to keep large tiles under the 500K size limit. (Again, mostly useful for polygons.)
.IP \(bu 2
\fB\fC\-aS\fR or \fB\fC\-\-coalesce\-fraction\-as\-needed\fR: Dynamically combine a fraction of features from each zoom level into other nearby features to keep large tiles under the 500K size limit. (Again, mostly useful for polygons.)
.IP \(bu 2
\fB\fC\-pd\fR or \fB\fC\-\-force\-feature\-limit\fR: Dynamically drop some fraction of features from large tiles to keep them under the 500K size limit. It will probably look ugly at the tile boundaries. (This is like \fB\fC\-ad\fR but applies to each tile individually, not to the entire zoom level.) You probably don't want to use this.
.IP \(bu 2
\fB\fC\-aC\fR or \fB\fC\-\-cluster\-densest\-as\-needed\fR: If a tile is too large, try to reduce its size by increasing the minimum spacing between features, and leaving one placeholder feature from each group. The remaining feature will be given a \fB\fC"cluster": true\fR attribute to indicate that it represents a cluster and a \fB\fC"point_count"\fR attribute to indicate the number of features that were clustered into it.
Expand Down
4 changes: 3 additions & 1 deletion options.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
#define A_REVERSE ((int) 'r')
#define A_REORDER ((int) 'o')
#define A_LINE_DROP ((int) 'l')
#define A_DEBUG_POLYGON ((int) 'D')
#define A_DEBUG_POLYGON ((int) '@')
#define A_POLYGON_DROP ((int) 'p')
#define A_DETECT_SHARED_BORDERS ((int) 'b')
#define A_PREFER_RADIX_SORT ((int) 'R')
Expand All @@ -15,7 +15,9 @@
#define A_DROP_DENSEST_AS_NEEDED ((int) 's')
#define A_DROP_FRACTION_AS_NEEDED ((int) 'd')
#define A_DROP_SMALLEST_AS_NEEDED ((int) 'n')
#define A_COALESCE_DENSEST_AS_NEEDED ((int) 'S')
#define A_COALESCE_SMALLEST_AS_NEEDED ((int) 'N')
#define A_COALESCE_FRACTION_AS_NEEDED ((int) 'D')
#define A_GRID_LOW_ZOOMS ((int) 'L')
#define A_DETECT_WRAPAROUND ((int) 'w')
#define A_EXTEND_ZOOMS ((int) 'e')
Expand Down
2 changes: 1 addition & 1 deletion serial.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -491,7 +491,7 @@ int serialize_feature(struct serialization_state *sst, serial_feature &sf) {
long long midy = (sf.bbox[1] / 2 + sf.bbox[3] / 2) & ((1LL << 32) - 1);
bbox_index = encode(midx, midy);

if (additional[A_DROP_DENSEST_AS_NEEDED] || additional[A_CLUSTER_DENSEST_AS_NEEDED] || additional[A_CALCULATE_FEATURE_DENSITY] || additional[A_INCREASE_GAMMA_AS_NEEDED] || sst->uses_gamma || cluster_distance != 0) {
if (additional[A_DROP_DENSEST_AS_NEEDED] || additional[A_COALESCE_DENSEST_AS_NEEDED] || additional[A_CLUSTER_DENSEST_AS_NEEDED] || additional[A_CALCULATE_FEATURE_DENSITY] || additional[A_INCREASE_GAMMA_AS_NEEDED] || sst->uses_gamma || cluster_distance != 0) {
sf.index = bbox_index;
} else {
sf.index = 0;
Expand Down
54 changes: 54 additions & 0 deletions tests/ne_110m_admin_0_countries/out/-R5%17%11.json

Large diffs are not rendered by default.

Large diffs are not rendered by default.

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions tests/nonascii/@@@.json
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
{ "type": "Feature", "properties": { }, "geometry": { "type": "LineString", "coordinates": [ [ -77.0366119, 38.9263374 ], [ 0, 0 ] ] } }
22 changes: 22 additions & 0 deletions tests/nonascii/out/-z0.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
{ "type": "FeatureCollection", "properties": {
"bounds": "-122.238615,0.000000,0.000000,38.926337",
"center": "0.000000,0.000000,0",
"description": "tests/nonascii/out/-z0.json.check.mbtiles",
"format": "pbf",
"json": "{\"vector_layers\": [ { \"id\": \"unknown0\", \"description\": \"\", \"minzoom\": 0, \"maxzoom\": 0, \"fields\": {} }, { \"id\": \"堤防\", \"description\": \"\", \"minzoom\": 0, \"maxzoom\": 0, \"fields\": {} } ],\"tilestats\": {\"layerCount\": 2,\"layers\": [{\"layer\": \"unknown0\",\"count\": 1,\"geometry\": \"LineString\",\"attributeCount\": 0,\"attributes\": []},{\"layer\": \"堤防\",\"count\": 1,\"geometry\": \"LineString\",\"attributeCount\": 0,\"attributes\": []}]}}",
"maxzoom": "0",
"minzoom": "0",
"name": "tests/nonascii/out/-z0.json.check.mbtiles",
"type": "overlay",
"version": "2"
}, "features": [
{ "type": "FeatureCollection", "properties": { "zoom": 0, "x": 0, "y": 0 }, "features": [
{ "type": "FeatureCollection", "properties": { "layer": "unknown0", "version": 2, "extent": 4096 }, "features": [
{ "type": "Feature", "properties": { }, "geometry": { "type": "LineString", "coordinates": [ [ -77.080078, 38.959409 ], [ 0.000000, 0.000000 ] ] } }
] }
,
{ "type": "FeatureCollection", "properties": { "layer": "堤防", "version": 2, "extent": 4096 }, "features": [
{ "type": "Feature", "properties": { }, "geometry": { "type": "LineString", "coordinates": [ [ -122.255859, 37.788081 ], [ 0.000000, 0.000000 ] ] } }
] }
] }
] }
1 change: 1 addition & 0 deletions tests/nonascii/堤防.json
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
{ "type": "Feature", "properties": { }, "geometry": { "type": "LineString", "coordinates": [ [ -122.238615, 37.783767 ], [ 0, 0 ] ] } }
Loading

0 comments on commit 03577cf

Please sign in to comment.