CARTOColor category schemes and interpolation #129

Open
makella opened this issue Mar 5, 2018 · 10 comments
Assignees
Labels
area: cartography status: discussion Needs more discussion, to start or to be closed / merged type: research

Comments

@makella
Contributor

makella commented Mar 5, 2018

@davidmanzanares

I do want to say that I think the interpolation piece is good when there are a large number of categories, but I need to better understand how the interpolation is happening with category colors.

Before we go there, I think we need to step back and better understand what is happening with a smaller number of categories. I am not seeing the same behavior with all category schemes, so I am wondering if you can help me understand better.

Below is a series of maps symbolized from a text field with 13 categories (values A-M) using different qualitative CARTOColor schemes.

I can't pin down exactly what is happening, and maybe prism isn't the best scheme to use because my initial sense is that we find a color close to another color (or something).

Anyway, I'm confused!

Prism

At first, this looks right... but Argentina and Brazil are not part of the color scheme. Maybe others aren't either (based off the results illustrated for other color schemes below)

screen shot 2018-03-05 at 11 05 26 am

color: ramp($mapcolor13_text,prism)

MAP
screen shot 2018-03-05 at 9 48 53 am

Unique values 12 categories

When I assign unique values to 12 categories, the 13th, Argentina, gets colored as other:

color: ramp(buckets($mapcolor13_text,"A","B","C","D","E","F","G","H","I","J","K","L"),prism)
MAP

BUT I have no idea where Suriname, Bolivia, or Chile are getting their fill colors from:

screen shot 2018-03-05 at 11 16 28 am

Unique values 13 categories

Totally different result where I think Bolivia and Uruguay are interpolated and potentially others.

color: ramp(buckets($mapcolor13_text,"A","B","C","D","E","F","G","H","I","J","K","L","M"),prism)

MAP

screen shot 2018-03-05 at 11 24 25 am

Bold

I'm not exactly sure how to describe what is happening here. From what I can see, we are going straight to interpolation (?). Maybe that is happening in the prism map but less noticeable (?) Below is the color scheme, followed by the output map

screen shot 2018-03-05 at 11 02 13 am

color: ramp($mapcolor13_text,bold)

MAP

screen shot 2018-03-05 at 11 04 54 am

@davidmanzanares

Ok, so, I just added a few tests to check the current behavior, and it works as expected. Although I think that we may want to release a new cartocolor version with some new features and tweaks.

Ramp cases

I'll try to list every ramp case:

  1. ramp(linear(...), prism). It will create a continuous palette from the provided one by taking the subpalette with the most colors and removing the last color if it is an "others" color (this is known by looking at the cartocolor tags). See https://github.com/CartoDB/renderer-prototype/blob/b815f99f6d0f36e28f126a34794d0dd120687eac/test/integration/render/scenarios/ramp/cartocolor-linear/reference.png and https://github.com/CartoDB/renderer-prototype/blob/b815f99f6d0f36e28f126a34794d0dd120687eac/test/integration/render/scenarios/ramp/cartocolor-linear/scenario.js

  2. ramp($cat, prism) with a low number of categories (no more than the number of colors in the cartocolor scheme). It will map each category to a color without interpolation. The others color won't be used. See https://github.com/CartoDB/renderer-prototype/blob/b815f99f6d0f36e28f126a34794d0dd120687eac/test/integration/render/scenarios/ramp/cartocolor-category/reference.png and https://github.com/CartoDB/renderer-prototype/blob/b815f99f6d0f36e28f126a34794d0dd120687eac/test/integration/render/scenarios/ramp/cartocolor-category/scenario.js

  3. ramp($cat, prism) with a high number of categories (more than the number of colors in the cartocolor scheme). Similar to 1: it will create a continuous palette from the provided one by taking the subpalette with the most colors and removing the last color if it is an "others" color. See https://github.com/CartoDB/renderer-prototype/blob/b815f99f6d0f36e28f126a34794d0dd120687eac/test/integration/render/scenarios/ramp/category-interpolation/reference.png and https://github.com/CartoDB/renderer-prototype/blob/b815f99f6d0f36e28f126a34794d0dd120687eac/test/integration/render/scenarios/ramp/category-interpolation/scenario.js Note that I used a custom palette of just two colors and that it generated a middle color automatically since there are 3 categories. Similar behavior happens with cartocolors, but it's more difficult to see since they have many more colors.

  4. ramp(buckets($cat, 'catOne', 'catTwo'), prism). Like 2, but using the others color for the last bucket. It will map each category to a color without interpolation. The others color will be used (if the cartocolor has one) for the last "default" bucket. See https://github.com/CartoDB/renderer-prototype/blob/b815f99f6d0f36e28f126a34794d0dd120687eac/test/integration/render/scenarios/ramp/cartocolor-category-others/reference.png and https://github.com/CartoDB/renderer-prototype/blob/b815f99f6d0f36e28f126a34794d0dd120687eac/test/integration/render/scenarios/ramp/cartocolor-category-others/scenario.js
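
The subpalette and "others" handling described in the cases above could be sketched roughly as follows. This is not the renderer's code: the SCHEME dictionary, its colors, the "others" tag name and both function names are assumptions made up for illustration. Only the rules themselves (take the subpalette with the most colors, drop a trailing others color, use it for the default bucket in case 4) come from the list above.

```python
# Hypothetical scheme laid out like a cartocolor entry: one "subpalette" per
# color count, plus tags. Colors and the tag name are illustrative only.
SCHEME = {
    3: ["#aa0000", "#00aa00", "#888888"],
    4: ["#aa0000", "#00aa00", "#0000aa", "#888888"],
    "tags": ["qualitative", "others"],  # assumed marker: last color is "others"
}


def largest_subpalette(scheme):
    """Return (colors, others) for the subpalette with the most colors."""
    sizes = [key for key in scheme if isinstance(key, int)]
    colors = list(scheme[max(sizes)])
    others = None
    if "others" in scheme.get("tags", []):
        others = colors.pop()  # remove the trailing "others" color
    return colors, others


def bucket_colors(scheme, buckets):
    """Case 4: one color per listed bucket, others color for the default bucket.

    Assumes there are at least as many colors as listed buckets.
    """
    colors, others = largest_subpalette(scheme)
    mapping = dict(zip(buckets, colors))
    mapping["default"] = others
    return mapping
```

With this toy scheme, `bucket_colors(SCHEME, ["A", "B"])` assigns the first two colors to A and B and the gray others color to every remaining category.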

Summary

Color interpolation will be activated if
A) the input is a continuous value, like the output of linear()
B) there are not enough colors in the provided palette
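
Condition B can be illustrated with a minimal sketch. It assumes plain linear blending in RGB between neighbouring palette colors and evenly spaced sample positions; the function names are hypothetical and the color space the renderer actually uses may differ (CIE Lab is mentioned later in this thread).

```python
def lerp_color(a, b, t):
    """Linearly blend two RGB tuples; t=0 gives a, t=1 gives b."""
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))


def colors_for(palette, n):
    """Return n colors for n categories.

    If the palette has enough colors, categories map to colors directly.
    Otherwise the palette is stretched and sampled at n evenly spaced points.
    """
    if n <= len(palette):
        return list(palette[:n])  # enough colors: no interpolation
    out = []
    for i in range(n):
        pos = i * (len(palette) - 1) / (n - 1)  # position along the palette
        lo = min(int(pos), len(palette) - 2)    # index of the left neighbour
        out.append(lerp_color(palette[lo], palette[lo + 1], pos - lo))
    return out
```

For example, `colors_for([(0, 0, 0), (200, 100, 50)], 3)` keeps both end colors and generates the middle color `(100, 50, 25)`, which mirrors the two-color, three-category test scenario described in case 3.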

Answer to the original question

Regarding your questions:

From what I can see, we are going straight to interpolation (?). Maybe that is happening in the prism map but less noticeable (?)

Yes, interpolation is being used in both examples, but in the first one it is less noticeable because the colors are ordered.

Follow-up questions

We may want to release at some point a new cartocolor version with some adjustments and new features.

I think it would be better if we had ordered colors (like prism already has). Interpolating unordered palettes like bold can cause many problems, from interpolations that generate very similar colors in some cases to interpolations where some of the output colors are not very good at all. In general, when they are ordered the result is more controlled.

Additionally, we could export colorbrewer in a prettier form, see #31

Also, we could add palettes with different alpha values, since it is possible now, see: https://cartodb.github.io/renderer-prototype/example/mapbox.html#eyJhIjoibW5tYXBwbHV0byIsImIiOiIiLCJjIjoiZG1hbnphbmFyZXMiLCJkIjoiaHR0cHM6Ly97dXNlcn0uY2FydG8uY29tIiwiZSI6ImNvbG9yOiByYW1wKGxpbmVhcihsb2coJG51bWZsb29ycyksIDEsIDQpLCAgICBbIGhzdmEoMCwxLDEsMCksIGhzdmEoMCwwLDEsMSkgXSkiLCJmIjp7ImxuZyI6LTczLjk2NjYxNjI0ODY0MjQyLCJsYXQiOjQwLjc2OTA2NzAxOTAxNjk0NH0sImciOjExLjM2MTQ3MjE1NjIwNjIwNH0=

@davidmanzanares

Note: I closed this since I think that the original issue is resolved, but the follow-ups are certainly something to be discussed.

@makella
Contributor Author

makella commented May 1, 2018

@davidmanzanares I'm gonna reopen this specifically to talk about this part:

Interpolating unordered palettes like bold can cause many problems, from interpolations that generate very similar colors in some cases to interpolations where some of the output colors are not very good at all. In general, when they are ordered the result is more controlled.

I've been doing a lot of testing today and I'm curious to get more information about:

  • the default interpolation when maximum colors have been reached (people want to "see" all 50 categories but that is a human perception problem) are we hurting more than helping?
    • alternatively, if people do this, is there another set of color schemes that we should develop that have fewer colors but are meant specifically for category interpolation so we avoid the muddy colors?
  • the interpolation when we've reached the maximum colors: does it take the entire 11 color scheme and interpolate between colors that are next to each other?
  • why we are interpolating category colors for something like ramp(linear(...), ...)? That should be reserved for sequential schemes and numeric data, or be something that people explicitly request, for example:
    • ramp(linear($amount,10,500),stretch(prism))
      but by default, you wouldn't be able to do:
    • ramp(linear($amount,10,500),prism)

Interpolation Current order

I totally see where the issues are coming from with the current ordering. For example, in the images below, where the original vivid is stretched to 17 colors, I can clearly pick out where in the order we are producing the unattractive muddy colors.

This is related to a few different things that I can see:

  • the color space for interpolation (Lch looks best) but even with Lch, there are certain colors that should not be next to each other to get better mixing and more variation
  • the colors that are producing the muddy colors are ones that have less chroma
  • order matters for perceived differences between colors (for example, avoid "similar" looking colors when interpolated)
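
A rough way to see why neighbour order matters is to blend two colors and measure how saturated the result is. The sketch below is a simplification: it uses a plain RGB midpoint rather than Lch or Lab, and HSV saturation as a loose stand-in for chroma. The three colors and the helper names are illustrative assumptions, not values from any CARTOColor scheme.

```python
import colorsys


def midpoint(a, b):
    """Halfway blend of two RGB tuples (0-255 per channel)."""
    return tuple((x + y) / 2 for x, y in zip(a, b))


def saturation(rgb):
    """HSV saturation in [0, 1], used here as a rough proxy for chroma."""
    r, g, b = (channel / 255 for channel in rgb)
    return colorsys.rgb_to_hsv(r, g, b)[1]


red, orange, cyan = (255, 0, 0), (255, 128, 0), (0, 255, 255)

near = saturation(midpoint(red, orange))  # neighbouring hues stay vivid
far = saturation(midpoint(red, cyan))     # opposite hues blend to gray
```

Here `near` is 1.0 and `far` is 0.0: placing opposite hues next to each other produces a gray (muddy) in-between color, while neighbouring hues keep their saturation.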

vivid original order:
screen shot 2018-05-01 at 3 57 48 pm

Stretched to 17 colors:
screen shot 2018-05-01 at 3 58 14 pm

Interpolation Reordered example

This is vivid reordered but as you can see when stretched, we begin to get more uniform colors

screen shot 2018-05-01 at 4 23 20 pm

screen shot 2018-05-01 at 4 23 35 pm

I have been running a side research project with Steph to test some adjustments to the schemes for better differentiation between categories, but specifically for Builder. She has come up with some adjustments to the ordering and slight saturation/brightness modifications.

Once you and I have had a chance to talk more in depth about the interpolation piece, I will start testing some of her proposed adjustments in the VL context.

We will find the happy medium for both Builder and VL!!

@makella
Contributor Author

makella commented May 2, 2018

I also did some testing to try and understand the idea of "sub-palettes" better. The behaviour is different with pre-defined CARTOColor schemes (as you described above), but this is the pattern I am seeing when custom color values are defined which I think demonstrates what you mean by sub-palettes. Would be good to talk through this too.

screen shot 2018-05-02 at 8 26 50 am

@makella makella mentioned this issue May 2, 2018
7 tasks
@davidmanzanares

Wow, so much detail, I'll try to answer the best I can.

the default interpolation when maximum colors have been reached (people want to "see" all 50 categories but that is a human perception problem) are we hurting more than helping?
alternatively, if people do this, is there another set of color schemes that we should develop that have fewer colors but are meant specifically for category interpolation so we avoid the muddy colors?

Exactly, it is a human perception problem. I think interpolation is better than nothing though. For example, in a dataset with 50 categories (and 50 colors), it will be almost impossible to differentiate them. However, if the user zooms into a region of just 4 categories, the colors probably can be differentiated.

Another option could be to always use top, that is, to group less common categories into the others bucket automatically. However, I think this would be quite an invasive default. I would prefer to document this issue and its solution: use top or use buckets to reduce the number of categories.
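
The idea of grouping less common categories can be sketched like this. It is only an illustration of the concept, not how top is implemented in the renderer; the function signature and the "others" label are assumptions.

```python
from collections import Counter


def top(values, n, others="others"):
    """Keep the n most frequent categories and relabel everything else."""
    keep = {category for category, _ in Counter(values).most_common(n)}
    return [value if value in keep else others for value in values]
```

For example, `top(["a", "a", "a", "b", "b", "c", "d"], 2)` keeps "a" and "b" and relabels "c" and "d" as "others", so only three colors are needed instead of four.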

Regarding the color schemes, I guess that some schemes (the ones that use the entire hue space) will be better but, other than that, I don't have a better idea.

the interpolation when we've reached the maximum colors: does it take the entire 11 color scheme and interpolate between colors that are next to each other?

Yes, it takes the "subpalette" with most colors.

Regarding forbidding the use of things like ramp(linear($amount,10,500),prism). I agree that it shouldn't be used, but, I feel like we are putting another obstacle to users. If we throw an error, in this case, we would need to carefully explain what is happening and the user will need to understand much more. I'm worried that the user will end up using some custom palette with zero or low attention to its design just to work around this. Nevertheless, I think that explaining the different types of cartocolors available and applying these good practices in our own examples is very important.

Regarding what I mean by "subpalette": for me, a "subpalette" is just a cartocolor with a specific number of colors because, in cartocolors, each color scheme has multiple "subpalettes": https://github.com/CartoDB/CartoColor/blob/master/cartocolor.js#L10

@makella
Contributor Author

makella commented May 3, 2018

Agree... that's a good way to do it. If people expect what they get with Builder, they'll wonder why they see so many different colors but there are so many Builder users that want a color for all of their categories! So this will accomplish both.

I would prefer to document this issue and its solution: use top or use buckets to reduce the number of categories.

So in Builder what we do is not show the color schemes tagged qualitative for numeric data. I agree it is another roadblock to throw an error.

Regarding forbidding the use of things like ramp(linear($amount,10,500),prism). I agree that it shouldn't be used, but, I feel like we are putting another obstacle to users. If we throw an error, in this case, we would need to carefully explain what is happening and the user will need to understand much more.

Now that I know how we interpolate when trying to go to more colors, I think the solution is to reorder the color schemes so they are distinguishable from one another in the 11 colors (need for Builder + VL) and do color adjustments so we don't get the muddy colors when interpolated with the ones next to them.

the interpolation when we've reached the maximum colors: does it take the entire 11 color scheme and interpolate between colors that are next to each other?

@rochoa
Contributor

rochoa commented Jun 8, 2018

Hey, @makella. This is in the done column for the project. Can we close this issue?

@rochoa rochoa added this to the Public milestone Jun 8, 2018
@makella
Contributor Author

makella commented Jun 11, 2018

@rochoa I want to leave it open because it is a larger CARTOColor, category scheme piece of research that needs to happen.

Basically, with VL's built-in interpolation, we have some issues of colors that are adjacent to one another not mixing well. So for example, if VL interpolates the category scheme Vivid to 50 categories, we get muddy colors and that is because of the way the colors in the scheme are organized and the CIE Lab interpolation.

The main thing that I need to spend more time looking at is the ordering of the category color schemes: are there combinations that will be distinguishable from each other in both Builder and VL?

The other question is if category schemes that we provide should be interpolated...

I added a label research (hope that's ok) so we can tag instances like this that need more work.

Also, I believe all of this will become a really useful guide/blog post once we have it all figured out.

@makella
Contributor Author

makella commented Jun 20, 2018

@davidmanzanares OH! I just thought of something. Tell me what you think.

What if we had a more ordered set of CARTOColor qualitative schemes that we switched to when someone activates interpolation on a category attribute?

So for example, vivid by default would be:
screen shot 2018-06-20 at 11 46 20 am

If interpolation is activated, we could switch to something like vivid_interp (that would be defined in CARTOColors):
screen shot 2018-06-20 at 11 47 37 am

This way, we can still provide the distinction between colors when people use buckets but also provide a more aesthetic representation when someone uses interpolation on a category attribute.
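
The switching idea could look something like the sketch below. Everything here is hypothetical: the "_interp" suffix follows the vivid_interp name proposed above, no such scheme exists in CARTOColors today, and the function name, the schemes dictionary and its colors are made up for illustration.

```python
def pick_scheme(name, n_categories, schemes):
    """Return the scheme key to use for the given number of categories.

    Falls back to the default ordering when no interpolation is needed
    or when no interpolation-friendly variant has been defined.
    """
    needs_interpolation = n_categories > len(schemes[name])
    variant = name + "_interp"  # hypothetical naming convention
    if needs_interpolation and variant in schemes:
        return variant
    return name


# Toy data: same colors, different order (not the real vivid values).
schemes = {
    "vivid": ["#e41a1c", "#377eb8", "#ffcc00"],
    "vivid_interp": ["#e41a1c", "#ffcc00", "#377eb8"],
}
```

With this toy data, 3 categories resolve to "vivid" and 50 categories resolve to "vivid_interp".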

@makella
Contributor Author

makella commented Jun 20, 2018

so using the same example as the previous issue investigation, we would go from

this interpolation:
screen shot 2018-06-20 at 11 51 03 am

To this:
screen shot 2018-06-20 at 11 51 16 am

and I could do a little more adjusting in the ordering to see if we could reduce the number of "similarish" colors even though in reality, each one should be perceivable.

@rochoa rochoa removed this from the Public milestone Jul 13, 2018
@makella makella changed the title Category color assignment: interpolation and schemes CARTOColor category schemes and interpolation Jul 25, 2018
@makella makella added the status: discussion Needs more discussion, to start or to be closed / merged label Aug 7, 2018
@rochoa rochoa assigned rochoa and unassigned davidmanzanares Nov 15, 2018

3 participants