Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roads vs tunnels vs bridges SQL queries #5033

Open
4 tasks done
StyXman opened this issue Oct 29, 2024 · 13 comments · May be fixed by #5035
Open
4 tasks done

roads vs tunnels vs bridges SQL queries #5033

StyXman opened this issue Oct 29, 2024 · 13 comments · May be fixed by #5035
Labels

Comments

@StyXman
Copy link
Contributor

StyXman commented Oct 29, 2024

I noticed that the queries for those layers are slightly different, but in my opinion not enough to really have separate definitions. Here is what I found. I'm taking as a base the one that is already used elsewhere, roads_sql.

Todos (copied from below):

  • Make sure all three have the same code except the WHERE clauses. Add comments about keeping them in sync.
  • NULL AS int_surface on the railway branch
  • NULL AS access on the railway branch
  • NULL AS service on the railway branch

Short version

Both queries differ only slightly from the roads query. In particular, they differ the same way, except on how they focus on the topic they handle. Even more, I modified my fork of this style to have a single definition for all 4 layers (namely: tunnels, roads-casing, bridges and roads-fill) and it seems to work just fine. In particular, given how both tunnels and bridges use less sophisticates expression for filtering tunnels and bridges, I have the impression they have suffered from rotting, where roads_sql has been improved but the other two hasn't.

tunnels

  • service does not include slipways
  • does not include osm_id
  • less sophisticated handling of tunnel and covered
  • does not include bridges [@]
  • removes surface handling on railways
  • no location handling
  • inverted order for paved vs unpaved.
--- roads.sql   2024-10-29 21:21:43.638768799 +0100
+++ tunnels.sql 2024-10-29 21:21:07.382945122 +0100
@@ -9,7 +9,7 @@
             link,
             preserved,
             layernotnull
-          FROM ( -- subselect that contains both roads and rail/aero
+          FROM ( -- subselect that contains both roads and rail
             SELECT
                 way,
                 'highway_' || (CASE WHEN highway = 'path' THEN carto_path_type(bicycle, horse) ELSE highway END) AS feature,
@@ -22,21 +22,18 @@
                 carto_highway_int_access(highway, access, foot, bicycle, horse, tags->'motorcar', tags->'motor_vehicle', tags->'vehicle') AS int_access,
                 construction,
                 CASE
-                  WHEN service IN ('parking_aisle', 'drive-through', 'driveway') OR leisure IN ('slipway') THEN 'INT-minor'::text
+                  WHEN service IN ('parking_aisle', 'drive-through', 'driveway') THEN 'INT-minor'::text
                   ELSE 'INT-normal'::text
                 END AS service,
                 CASE
                   WHEN highway IN ('motorway_link', 'trunk_link', 'primary_link', 'secondary_link', 'tertiary_link') THEN 'yes'
                   ELSE 'no'
                 END AS link,
-                'no' AS preserved,
+                'no' as preserved,
                 COALESCE(layer,0) AS layernotnull,
-                osm_id,
                 z_order
               FROM planet_osm_line
-              WHERE (tunnel IS NULL OR NOT tunnel IN ('yes', 'building_passage'))
-                AND (covered IS NULL OR NOT covered = 'yes')
-                AND (bridge IS NULL OR NOT bridge IN ('yes', 'boardwalk', 'cantilever', 'covered', 'low_water_crossing', 'movable', 'trestle', 'viaduct'))
+              WHERE (tunnel = 'yes' OR tunnel = 'building_passage' OR covered = 'yes')
                 AND highway IS NOT NULL -- end of road select
             UNION ALL
             SELECT
@@ -46,29 +43,20 @@
                                  WHEN (railway = 'tram' AND service IN ('spur', 'siding', 'yard')) THEN 'tram-service'
                                  ELSE railway END) AS feature,
                 tracktype,
-                CASE WHEN surface IN ('unpaved', 'compacted', 'dirt', 'earth', 'fine_gravel', 'grass', 'grass_paver', 'gravel', 'ground',
-                                      'mud', 'pebblestone', 'salt', 'sand', 'woodchips', 'clay', 'ice', 'snow') THEN 'unpaved'
-                  WHEN surface IN ('paved', 'asphalt', 'cobblestone', 'cobblestone:flattened', 'sett', 'concrete', 'concrete:lanes',
-                                      'concrete:plates', 'paving_stones', 'metal', 'wood', 'unhewn_cobblestone') THEN 'paved'
-                  ELSE NULL
-                END AS int_surface,
+                'null',
                 NULL,
                 construction,
-                CASE WHEN service IN ('parking_aisle', 'drive-through', 'driveway') OR leisure IN ('slipway') THEN 'INT-minor'::text ELSE 'INT-normal'::text END AS service,
+                CASE WHEN service IN ('parking_aisle', 'drive-through', 'driveway') THEN 'INT-minor'::text ELSE 'INT-normal'::text END AS service,
                 'no' AS link,
                 (CASE
                   WHEN tags->'railway:preserved' = 'yes' THEN 'yes'
                   ELSE 'no'
                 END) AS preserved,
                 COALESCE(layer,0) AS layernotnull,
-                osm_id,
                 z_order
               FROM planet_osm_line
-              WHERE (tunnel IS NULL OR NOT tunnel IN ('yes', 'building_passage'))
-                AND (covered IS NULL OR NOT covered = 'yes' OR railway IN ('platform'))
-                AND (bridge IS NULL OR NOT bridge IN ('yes', 'boardwalk', 'cantilever', 'covered', 'low_water_crossing', 'movable', 'trestle', 'viaduct'))
-                AND (railway NOT IN ('platform') OR (tags->'location' NOT IN ('underground')) OR (tags->'location') IS NULL)
-                AND railway IS NOT NULL -- end of rail select
+              WHERE (tunnel = 'yes' OR tunnel = 'building_passage' OR covered = 'yes')
+                AND (railway NOT IN ('platform') AND railway IS NOT NULL) -- end of rail select
             ) AS features
           ORDER BY
             layernotnull,
@@ -76,6 +64,5 @@
             CASE WHEN substring(feature for 8) = 'railway_' THEN 2 ELSE 1 END,
             CASE WHEN feature IN ('railway_INT-spur-siding-yard', 'railway_tram-service') THEN 0 ELSE 1 END,
             CASE int_access WHEN 'no' THEN 0 WHEN 'restricted' THEN 1 ELSE 2 END,
-            CASE WHEN int_surface IN ('unpaved') THEN 0 ELSE 1 END,
-            osm_id
-        ) AS roads_sql
+            CASE WHEN int_surface IN ('unpaved') THEN 0 ELSE 1 END
+        ) AS tunnels

bridges

  • service does not include slipways
  • does not include osm_id
  • less sophisticated handling of bridge
  • does not include tunnels or covered [@]
  • removes surface handling on railways
  • no location handling
  • inverted order for paved vs unpaved.
--- roads.sql   2024-10-29 21:21:43.638768799 +0100
+++ bridges.sql 2024-10-29 21:19:37.047410928 +0100
@@ -22,7 +22,7 @@
                 carto_highway_int_access(highway, access, foot, bicycle, horse, tags->'motorcar', tags->'motor_vehicle', tags->'vehicle') AS int_access,
                 construction,
                 CASE
-                  WHEN service IN ('parking_aisle', 'drive-through', 'driveway') OR leisure IN ('slipway') THEN 'INT-minor'::text
+                  WHEN service IN ('parking_aisle', 'drive-through', 'driveway') THEN 'INT-minor'::text
                   ELSE 'INT-normal'::text
                 END AS service,
                 CASE
@@ -31,12 +31,9 @@
                 END AS link,
                 'no' AS preserved,
                 COALESCE(layer,0) AS layernotnull,
-                osm_id,
                 z_order
               FROM planet_osm_line
-              WHERE (tunnel IS NULL OR NOT tunnel IN ('yes', 'building_passage'))
-                AND (covered IS NULL OR NOT covered = 'yes')
-                AND (bridge IS NULL OR NOT bridge IN ('yes', 'boardwalk', 'cantilever', 'covered', 'low_water_crossing', 'movable', 'trestle', 'viaduct'))
+              WHERE bridge IN ('yes', 'boardwalk', 'cantilever', 'covered', 'low_water_crossing', 'movable', 'trestle', 'viaduct')
                 AND highway IS NOT NULL -- end of road select
             UNION ALL
             SELECT
@@ -46,28 +43,19 @@
                                  WHEN (railway = 'tram' AND service IN ('spur', 'siding', 'yard')) THEN 'tram-service'
                                  ELSE railway END) AS feature,
                 tracktype,
-                CASE WHEN surface IN ('unpaved', 'compacted', 'dirt', 'earth', 'fine_gravel', 'grass', 'grass_paver', 'gravel', 'ground',
-                                      'mud', 'pebblestone', 'salt', 'sand', 'woodchips', 'clay', 'ice', 'snow') THEN 'unpaved'
-                  WHEN surface IN ('paved', 'asphalt', 'cobblestone', 'cobblestone:flattened', 'sett', 'concrete', 'concrete:lanes',
-                                      'concrete:plates', 'paving_stones', 'metal', 'wood', 'unhewn_cobblestone') THEN 'paved'
-                  ELSE NULL
-                END AS int_surface,
+                'null',
                 NULL,
                 construction,
-                CASE WHEN service IN ('parking_aisle', 'drive-through', 'driveway') OR leisure IN ('slipway') THEN 'INT-minor'::text ELSE 'INT-normal'::text END AS service,
+                CASE WHEN service IN ('parking_aisle', 'drive-through', 'driveway') THEN 'INT-minor'::text ELSE 'INT-normal'::text END AS service,
                 'no' AS link,
                 (CASE
                   WHEN tags->'railway:preserved' = 'yes' THEN 'yes'
                   ELSE 'no'
                 END) AS preserved,
                 COALESCE(layer,0) AS layernotnull,
-                osm_id,
                 z_order
               FROM planet_osm_line
-              WHERE (tunnel IS NULL OR NOT tunnel IN ('yes', 'building_passage'))
-                AND (covered IS NULL OR NOT covered = 'yes' OR railway IN ('platform'))
-                AND (bridge IS NULL OR NOT bridge IN ('yes', 'boardwalk', 'cantilever', 'covered', 'low_water_crossing', 'movable', 'trestle', 'viaduct'))
-                AND (railway NOT IN ('platform') OR (tags->'location' NOT IN ('underground')) OR (tags->'location') IS NULL)
+              WHERE bridge IN ('yes', 'boardwalk', 'cantilever', 'covered', 'low_water_crossing', 'movable', 'trestle', 'viaduct')
                 AND railway IS NOT NULL -- end of rail select
             ) AS features
           ORDER BY
@@ -76,6 +64,5 @@
             CASE WHEN substring(feature for 8) = 'railway_' THEN 2 ELSE 1 END,
             CASE WHEN feature IN ('railway_INT-spur-siding-yard', 'railway_tram-service') THEN 0 ELSE 1 END,
             CASE int_access WHEN 'no' THEN 0 WHEN 'restricted' THEN 1 ELSE 2 END,
-            CASE WHEN int_surface IN ('unpaved') THEN 0 ELSE 1 END,
-            osm_id
-        ) AS roads_sql
+            CASE WHEN int_surface IN ('unpaved') THEN 0 ELSE 1 END
+        ) AS bridges

Proposal

Given that:

  • the changes these layers have are not that big
  • features are cached between layers
  • having slightly different queries seem to have introduced rotting

I propose to merge all 3 queries into a single query, which should include the simplification in surface handling for the railway branch, and probably similar simplification for other fields that make no sense for those ways.

I think I can provide such a patch without much effort; for the moment I'll keep developing it on my own style.

@StyXman StyXman changed the title roads vs tunnels vs bridges SQl queries roads vs tunnels vs bridges SQL queries Oct 29, 2024
@StyXman
Copy link
Contributor Author

StyXman commented Oct 29, 2024

Ok, no, in fact the WHEREs a quite importan (duh; I shouldn't come up with these ideas so late at night). Still, I have the impression there is some rotting on the tunnel and bridges queries, so they could be updated, and that we can do some extra optimization in the railway branch.

@imagico
Copy link
Collaborator

imagico commented Oct 29, 2024

Regarding if the queries for the roads, tunnels and bridges layers can be unified - you answered that yourself i think. You don't really want the bridges and tunnels queries to unnecessarily include all the normal roads. Still: The duplication of a lot of code between those layer is annoying. Hence ideas to avoid this are welcome.

Beyond that: There is definitely room for improvement - and if you want to work on that this would be welcome as well. For example the service column could be set to NULL for railways - and the int_surface and int_access columns should IMO be named in the railways queries for better readability.

You mention that

features are cached between layers

that is not true - at least not within Mapnik - PostgreSQL might cache when the same query is run multiple times. The cache-features parameter applies only when having several styles on the same layer (via attachments in CartoCSS).

@imagico imagico added the roads label Oct 29, 2024
@StyXman
Copy link
Contributor Author

StyXman commented Oct 30, 2024

You don't really want the bridges and tunnels queries to unnecessarily include all the normal roads

Yeah, I tried converting the bridges and tunnel info into fields and replace the id filters to things like [tunnel = 'yes'] and carto went from running in ~30s to not finishing after 5m :)

if you want to work on that this would be welcome as well

OK, here's my plan: I'll start introducing one change at a time, get a review, then move to the next one. The question whether you prefer a single final PR or multiple small ones is up to you, but I think several small ones is better.

the service column could be set to NULL for railways - and the int_surface and int_access columns should IMO be named in the railways queries for better readability

OK, let's make a checklist (they don't quite work, be we'll make them work :):

[ ] Make sure all three have the same code except the WHERE clauses. Add comments about keeping them in sync.
[ ] NULL AS int_surface on the railway branch
[ ] NULL AS access on the railway branch
[ ] NULL AS service on the railway branch

What about horse, bicycle, foot and tracktype? Feels like they all could become NULL AS foo too.

@imagico
Copy link
Collaborator

imagico commented Oct 30, 2024

OK, here's my plan: I'll start introducing one change at a time, get a review, then move to the next one. The question whether you prefer a single final PR or multiple small ones is up to you, but I think several small ones is better.

If you just want to do internal changes without a change in rendering, grouping several smaller changes in usually fine. If you want to do larger re-structuring or visible changes it is better not to do too many things at once or to mix structural with simple form changes.

OK, let's make a checklist (they don't quite work, be we'll make them work :):

I copied that to the initial comment so github shows the progress.

What about horse, bicycle, foot and tracktype? Feels like they all could become NULL AS foo too.

horse, bicycle and foot are gone since #4952. tracktype should go completely as well - #4322. 😏

@dch0ph
Copy link
Contributor

dch0ph commented Oct 30, 2024

Thanks for taking this on - I had also noticed a lot of cruft in the roads queries (e.g. creating an int_surface for railways!). It would be really good to clean up.

I'm puzzled why the roads queries contain things like 'INT-minor'::text. The ::text seems redundant?

Another oddity is why some "null" values are returned as 'null' rather than simply NULL?

A final oddity is why CASE WHEN feature IN ('highway_motorway_link', 'highway_trunk_link', 'highway_primary_link', 'highway_secondary_link', 'highway_tertiary_link') is evaluated, when link has already been evaluated to yes or no?

Plenty to go at!

@StyXman
Copy link
Contributor Author

StyXman commented Oct 31, 2024

What about 'null' AS surface in the railway branch? Would NULL be better?

@imagico can you show me how to do one of those demo data sources?

@imagico
Copy link
Collaborator

imagico commented Oct 31, 2024

What about 'null' AS surface in the railway branch? Would NULL be better?

I think yes. Mapnik/Carto handling of NULL values has in the past been somewhat obscure which has probably let to the habit of using 'null' strings. I don't think this is necessary. We don't even have a filter for surface on railways in MSS i think so the value of surface there is simply irrelevant.

@imagico can you show me how to do one of those demo data sources?

You mean the tag combination test sets like here - those are drawn in JOSM and run through osmium renumber and then imported into a test database.

@StyXman
Copy link
Contributor Author

StyXman commented Oct 31, 2024

Mapnik/Carto handling of NULL values has in the past been somewhat obscure

I agree. I think once I tried to come up with what's actually going on there and came up gaslighted by mapnik. I decided never to look again and do NULL vs 'null' incantations until it behaves like I need to. Good to hear if it's more stable now :)

those are drawn in JOSM

Huh... :)

@StyXman StyXman linked a pull request Nov 1, 2024 that will close this issue
@StyXman
Copy link
Contributor Author

StyXman commented Nov 1, 2024

Ok, I started with this. The specific plan is like this: Each change goes in it's own commit, so if a particular thing is not liked it can easily be rolled back. I'll fix the query for roads-casing, and once we're satisfied, I'll copy it over the other 2, leaving the WHERE clauses intact.

@StyXman
Copy link
Contributor Author

StyXman commented Nov 1, 2024

OK, I think this is enough for now. I will also like to add spaces after commas where permitted (not in regexps), to make it more consistent, like in COALESCE(layer,0), but I either leave that for the last or I'll do in another PR.

@imagico
Copy link
Collaborator

imagico commented Nov 1, 2024

Space after comma is currently not part of our SQL style guide. We mostly have spaces between items in IN(...) but we mostly do not have spaces between function parameters. Changing that to be fully consistent would mean a lot of changes in project.mml. I am not sure if that is worth the trouble.

In any case - changes to formatting of globally re-used blocks like COALESCE(layer,0) should not be mixed with changes to individual queries.

@StyXman
Copy link
Contributor Author

StyXman commented Nov 2, 2024

OK, so no objections so far. I'll copy the query over the other two and let's see what it gives.

@StyXman
Copy link
Contributor Author

StyXman commented Nov 2, 2024

Done, and set to ready for review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants