Ending shop/office catch-all #5014

Open
imagico opened this issue Sep 8, 2024 · 4 comments

@imagico
Collaborator

imagico commented Sep 8, 2024

In light of the work of @matkoniecz on cleaning up non-standard shop values, I think we should reconsider the catch-all we have for rendering generic dot symbols for shop=* and office=*.

Background

For background: We introduced the catch-all rendering in #2415, and it was already a controversial decision back then. Since then, #3730 attempted to remove the catch-all in favor of a script-generated list of common values, but that was not merged. In #3718 we removed support for shop=yes, leading to the inconsistency documented in #4906. The office catch-all was introduced in #3163. In both cases, every shop=* and office=* value other than those in a short exclusion list (like 'no') is shown with a generic dot and a label, just like any well-established value that is not explicitly rendered with a dedicated symbol.

Reasoning

My reasoning for suggesting we stop this is that, while the catch-all supports the low-barrier introduction of new values through positive feedback and thereby supports mapping a diverse geography, it also has the substantial negative effect of providing positive feedback on mis-taggings like typos and newly introduced synonyms for existing tagging - including misuse of shop=* to map non-shop POIs and make them appear with a dot and label.

As OSM grows, the set of established shop=* and office=* values grows too (both in numbers and in diversity), while the number of real-world shops for which no fitting classification exists yet, and which therefore require inventing a new shop type, decreases. This makes the catch-all less useful with every passing year.

The difficulty - and that was already visible in #3730 and #3163 - is, of course, the generation and maintenance of a list of accepted values for the generic dot rendering. We essentially introduced the catch-all as an easy solution to that problem.

Concrete suggestion

My recommendation at this stage is:

Reasoning for the idea to use a fairly extensive list:

  • this is the most conservative approach, removing only rendering of such values that are really rare (a cut-off of 10 uses would mean supporting 817 values).
  • it would avoid any manual maintenance because all values that are important as a matter of principle, independent of their use numbers, will be included in that list.
  • it would allow early positive feedback on culture-specific shop and office types that might only exist in a relatively small region and in low numbers. This would underline our support for a diverse audience.
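
The cut-off idea above can be sketched roughly as follows. This is only an illustration under assumptions: the input is assumed to look like the `data` array of a taginfo key/values response (objects with a `value` and a `count`), "a cut-off of 10 uses" is read as "at least 10 uses", and the function name `values_above_cutoff` and all sample counts are invented for the example.

```python
from __future__ import annotations

from typing import Iterable, Mapping


def values_above_cutoff(rows: Iterable[Mapping], cutoff: int = 10) -> list[str]:
    """Return tag values used at least `cutoff` times, most used first.

    `rows` is assumed to be shaped like the `data` array of a taginfo
    key/values response: mappings with a `value` and a `count` entry.
    """
    # Drop rare values and empty/whitespace-only values.
    kept = [r for r in rows if r["count"] >= cutoff and r["value"].strip()]
    # Sort by descending count, then alphabetically for stable output.
    kept.sort(key=lambda r: (-r["count"], r["value"]))
    return [r["value"] for r in kept]


# Invented sample counts, only to show how the cut-off behaves.
sample = [
    {"value": "supermarket", "count": 400000},
    {"value": "tortilla", "count": 250},
    {"value": "supermarkt", "count": 3},  # typo, falls below the cut-off
    {"value": "kimchi", "count": 12},
]

print(values_above_cutoff(sample, cutoff=10))
# ['supermarket', 'tortilla', 'kimchi']
```

With a low cut-off like this, rarely used but legitimate regional values stay in the list, while one-off typos drop out without anyone curating the list by hand.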

Reasoning for the idea of using a database table instead of a shop IN ('foo', ...) or an inline table:

  • for a long list this will likely be more efficient since an index can be used for fast lookup (not sure if PostgreSQL does any kind of pre-computing for long in-lined lists of values).
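
To make the two query shapes concrete, here is a small sketch. It uses SQLite from the Python standard library purely so that it runs anywhere; it says nothing about how PostgreSQL plans either query. The table names `valid_shop_values` and `pois` and all rows are hypothetical and not taken from the style's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical lookup table: one row per accepted shop value.
    CREATE TABLE valid_shop_values (value TEXT PRIMARY KEY);
    -- Stand-in for the rendering table; the real schema differs.
    CREATE TABLE pois (id INTEGER PRIMARY KEY, name TEXT, shop TEXT);
    """
)
conn.executemany(
    "INSERT INTO valid_shop_values (value) VALUES (?)",
    [("tortilla",), ("kimchi",), ("olive_oil",)],
)
conn.executemany(
    "INSERT INTO pois (name, shop) VALUES (?, ?)",
    [("A", "tortilla"), ("B", "tortila"), ("C", "kimchi"), ("D", None)],
)

# Variant 1: join against the lookup table (its primary key is indexed).
via_table = conn.execute(
    "SELECT p.name FROM pois p "
    "JOIN valid_shop_values v ON v.value = p.shop ORDER BY p.name"
).fetchall()

# Variant 2: the same filter written as an inline list of values.
via_in_list = conn.execute(
    "SELECT name FROM pois "
    "WHERE shop IN ('tortilla', 'kimchi', 'olive_oil') ORDER BY name"
).fetchall()

print(via_table)    # [('A',), ('C',)]
print(via_in_list)  # [('A',), ('C',)]
```

Both variants select the same rows; the difference is only where the list lives. With a table, the list can be regenerated without touching the style's SQL.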

An open question is still whether we should pre-generate the list from taginfo data for every release or require style users to run the script themselves, similar to get-external-data.py. Considering our very irregular release pattern recently, the latter might make more sense.

@matkoniecz
Contributor

Based on my research that you linked (I actually gave a talk about it at SotM today), I discovered two problems:

  • many important shop types are not documented, especially outside Europe (shop=tortilla got documented only very recently)
  • some shops are hard to tag using current scheme with no obvious alternatives

I planned to (mostly) solve this before asking to merge #3730, but I keep finding more cases where such a change would encourage people to damage data (by retagging to mismatching values). And I keep finding more and more cases like this as I continue my research.

So I am actually still working on this, but it seems nowhere near finished :/

If anyone is interested, I can dig out a list of cases that are not documented and either have no known tagging solution or look like good tags.

@imagico
Collaborator Author

imagico commented Sep 8, 2024

We need to separate the mapping/tagging issues from the rendering questions. I don't think acute developments in tagging practice should have a significant impact on our rendering decisions here. The problem discussed here has existed for many years and we need to look at it with a long-term perspective. The need to introduce new shop values is not going to go away; it will remain a necessity - which is why the proposed solution aims to handle this dynamically and automatically. The realization that it is probably not possible to produce a hand-curated list of supported values without introducing substantial cultural bias is one of the main reasons I am suggesting the approach outlined. I agree that the existence of wiki documentation for a tag is not a good indicator of its suitability to be rendered.

The underlying problem with shops is largely that mappers decided very early on to use a very fine-grained classification in primary tagging. Differences that would otherwise be expressed with secondary tags are, with shops, usually part of the primary classification. For example, every type of shop dedicated to selling a specific kind of region-specific food needs a unique primary tag (like shop=tortilla, but also things like shop=olive_oil, shop=kimchi).

Everyone should keep in mind that this is not about which shops are rendered with a dedicated pictorial symbol, this is about the generic shop dots that are displayed for those shops for which we do not have a dedicated rendering.

@dch0ph
Contributor

dch0ph commented Sep 8, 2024

This sounds an excellent way forward. Realistically, scalable solutions to POIs will need some kind of POI database table. I tried something along these lines in my aborted attempt to tackle #3880.

@imagico
Collaborator Author

imagico commented Sep 8, 2024

To be clear: the table needed for the proposed approach would be a simple single-column table with the shop values that are considered valid. It would be created and filled by a Python script that gets the list of shop values from taginfo and cuts it off at a certain threshold. If we used the same table for shops and offices, that would mean an additional column - or we could just have two tables.
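
As an illustration of the "additional column" variant, the sketch below fills one shared table keyed by (key, value). It again uses SQLite from the standard library as a stand-in for the rendering database; the table name `valid_values`, the helper `refill_valid_values` and the sample values are all hypothetical.

```python
import sqlite3


def refill_valid_values(conn, values_by_key):
    """Replace the table contents with the given {key: [values]} mapping."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "CREATE TABLE IF NOT EXISTS valid_values ("
            " key TEXT NOT NULL, value TEXT NOT NULL,"
            " PRIMARY KEY (key, value))"
        )
        # Start from an empty table so values that dropped below the
        # threshold since the last run disappear again.
        conn.execute("DELETE FROM valid_values")
        conn.executemany(
            "INSERT INTO valid_values (key, value) VALUES (?, ?)",
            [(k, v) for k, vals in values_by_key.items() for v in vals],
        )


conn = sqlite3.connect(":memory:")
refill_valid_values(conn, {"shop": ["tortilla", "kimchi"], "office": ["lawyer"]})
rows = conn.execute(
    "SELECT key, value FROM valid_values ORDER BY key, value"
).fetchall()
print(rows)
# [('office', 'lawyer'), ('shop', 'kimchi'), ('shop', 'tortilla')]
```

The two-table alternative would simply drop the `key` column and run the same routine once per table.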
