annotation layers for thickness and dots scales #183

Closed
mjskay opened this issue May 26, 2023 · 11 comments

Comments

@mjskay
Owner

mjskay commented May 26, 2023

Following up on #182, it occurred to me that one solution would be a layer that adds subscale axis labels for thickness and dots geoms. It would be like a legend, but drawn directly on the chart.

This requires knowing the geom settings and data of a slab or dots geom, so it would probably have to be tied to the geom. I initially thought a separate layer made sense, but an option on the slab is perhaps more sensible because of the inherent ties to the geom's normalization settings. In the case of dots, the guide would have to be computed after the grob determines the binwidth, so it can't be on a separate layer at all. Something like stat_slab(..., thickness_guide = ...) or stat_slab(..., subaxis = ...) or stat_slab(..., subguide = ...) ...

@higgi13425

I think this could work, and it would be helpful.
But it needs to be robust to missing/NA values in x, as in palmerpenguins'
penguins$bill_length_mm.
Apparently 2 penguins were not very cooperative with bill measurement that day.

This works as a very simple version (without being robust to missingness in x):
library(tidyverse)
library(ggdist)
set.seed(1234)
x = rnorm(100)

# find a bin width whose tallest stack of dots fits within maxheight
binwidth = find_dotplot_binwidth(na.omit(x), maxheight = 2/3*diff(range(x, na.rm = TRUE)), heightratio = 1)

# lay out one dot per observation (x is passed as-is, so this is not robust to NA)
bin_df = bin_dots(x = x, y = 0, binwidth = binwidth, heightratio = 1)

# draw each dot as an ellipse; y / binwidth puts the y axis in units of dots (counts)
bin_df %>%
  ggplot(aes(x0 = x, y0 = y / binwidth, a = binwidth/2, b = 1/2, angle = 0)) +
  ggforce::geom_ellipse(fill = "gray") +
  coord_fixed(ratio = binwidth) +
  ylab("Count") +
  xlab("Bill Length in mm") +
  labs(title = "Count Histogram of Penguin Bill Length in mm") +
  theme_classic()

@higgi13425

This version seems to be robust to missingness in x:

library(tidyverse)
library(ggdist)
library(palmerpenguins)  # provides the penguins data set

# drop the missing bill lengths once, up front
x = na.omit(penguins$bill_length_mm)

# find a bin width whose tallest stack of dots fits within maxheight
binwidth = find_dotplot_binwidth(x, maxheight = 2/3*diff(range(x)), heightratio = 1)

bin_df = bin_dots(x = x, y = 0, binwidth = binwidth, heightratio = 1)

# draw each dot as an ellipse; y / binwidth puts the y axis in units of dots (counts)
bin_df %>%
  ggplot(aes(x0 = x, y0 = y / binwidth, a = binwidth/2, b = 1/2, angle = 0)) +
  ggforce::geom_ellipse(fill = "gray") +
  coord_fixed(ratio = binwidth) +
  ylab("Count") +
  xlab("Bill Length in mm") +
  labs(title = "Count Histogram of Penguin Bill Length in mm") +
  theme_classic()

@mjskay
Owner Author

mjskay commented Aug 24, 2023

Thanks, this will be helpful for updating the docs of bin_dots / find_dotplot_binwidth to help other folks with this problem!

@mjskay
Owner Author

mjskay commented Jan 12, 2024

If anyone (@ASKurz ?) is interested in trying this out, there is now a prototype implementation of what I am provisionally calling "sub-guides" for annotating thickness and dot counts. You can test it on the "subguide" branch via:

remotes::install_github("mjskay/ggdist@subguide")

Some examples:

library(ggplot2)
library(ggdist)
library(distributional)

df = data.frame(
  x = c(dist_gamma(1:2,1:2), dist_normal(2:3,0.75)),
  group = c("a","a","b","b"),
  subgroup = c("d","e","d","e")
)

df |>
  ggplot(aes(xdist = x, y = group, fill = subgroup)) +
  stat_dots(subguide = "count", position = "dodge", color = NA, justification = 0.5, quantiles = 50)

image

df |>
  ggplot(aes(xdist = x, y = group, fill = subgroup)) +
  stat_dots(
    subguide = subguide_count(title = "count", label_side = "left"), 
    position = "dodgejust", 
    color = NA, 
    quantiles = 50, 
    height = 0.91
  ) +
  scale_x_continuous(expand = expansion(add = 0.6))

image

df |>
  ggplot(aes(xdist = x, y = group, fill = subgroup)) +
  stat_slabinterval(
    subguide = subguide_axis(label_side = "outside", title = "density"), 
    position = "dodgejust", 
    height = 0.9,
    scale = 0.9,
    side = "top"
  ) +
  scale_x_continuous(expand = expansion(add = 1))

image

df |>
  ggplot(aes(xdist = x, y = group, fill = subgroup)) +
  stat_slabinterval(
    subguide = subguide_outside(title = "density"), 
    position = "dodgejust", 
    height = 0.9,
    scale = 0.8,
    side = "top",
    normalize = "groups"
  ) +
  scale_y_discrete(breaks = NULL) +
  ylab(NULL) +
  theme(plot.margin = margin(5.5, 5.5, 5.5, 50))

image

Positioning can be a bit finicky, but I'm not sure there's any way to make that easier without fundamental changes to ggplot2 (see e.g. tidyverse/ggplot2#5609)

@ASKurz

ASKurz commented Jan 12, 2024

Thanks for the heads up @mjskay. I bet @kruschke would like this.

@ASKurz

ASKurz commented Jan 12, 2024

But anyways, so far I really like what I'm seeing.

@steveharoz

So I guess this issue is independent from using similar scales across facets?

Notice the different y-axis scales in the left and right facets:

library(tidyverse)

expand_grid(
  group = c("a","a","b"),
  subgroup = c("d","d","e"),
  reps = 1:50
) %>% 
  mutate(x = rnorm(n(), group=="a", 1+(subgroup == "d"))) %>% 
  ggplot(aes(x = x, fill = subgroup)) +
  ggdist::geom_dots(
    subguide = ggdist::subguide_count(title = "count", label_side = "left"), 
    position = "dodgejust", 
    color = NA, 
    height = 0.91
  ) +
  scale_x_continuous(expand = expansion(add = 0.6)) +
  facet_grid(cols = vars(group))

image

@mjskay
Owner Author

mjskay commented Jan 12, 2024

Yeah the faceting issue is separate unfortunately; trickier to address. See #191.

@ASKurz

ASKurz commented Jan 12, 2024

I think the faceting issue would also apply to my use cases.

@mjskay
Owner Author

mjskay commented Jan 12, 2024

For faceting, if the chart isn't dynamic, you can just choose a binwidth manually and then everything should line up. The inconsistency is caused by the automatic binwidth algorithm picking different binwidths in different charts.
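
A minimal sketch of the manual-binwidth workaround, for anyone who wants to try it. It assumes a ggdist version that includes the sub-guides described in this thread. The data, the variable name `fixed_binwidth`, and the value 0.15 are made up for illustration, not recommendations.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)

# Made-up data: two facets with different sample sizes, which is the kind of
# situation where automatic bin widths can differ between panels.
df <- data.frame(
  group = rep(c("a", "b"), times = c(100, 50)),
  x = c(rnorm(100, mean = 1), rnorm(50, mean = 0))
)

# Hypothetical bin width in data units, picked by eye for this data.
fixed_binwidth <- 0.15

# Passing the same binwidth to every panel skips the automatic search,
# so one dot has the same size in each facet.
p <- ggplot(df, aes(x = x)) +
  geom_dots(
    binwidth = fixed_binwidth,
    subguide = subguide_count(title = "count", label_side = "left")
  ) +
  facet_grid(cols = vars(group))

p
```

If the stacks end up too tall or too short for the panel, the bin width has to be adjusted by hand, since nothing here rescales it automatically.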

@mjskay
Owner Author

mjskay commented Feb 9, 2024

No complaints so far, so this is on master now and will be in the next release.
