Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Confidence Interval Band #366

Open
kir0ul opened this issue Mar 2, 2022 · 16 comments
Open

Confidence Interval Band #366

kir0ul opened this issue Mar 2, 2022 · 16 comments

Comments

@kir0ul
Copy link

kir0ul commented Mar 2, 2022

Problem description

I would like to be able to plot a line with a confidence interval band, something like the plot below (which I believe is not possible at the moment using only AlgebraOfGraphics.jl alone).

image

Proposed solution

I guess the implementation could use the band() function from Makie, and borrow some ideas for the API from VegaLite.jl or ggplot2.

@greimel
Copy link
Collaborator

greimel commented Mar 2, 2022

linefill() does that http://juliaplots.org/AlgebraOfGraphics.jl/stable/API/recipes/#AlgebraOfGraphics.linesfill

Unfortunately it is not covered in the docs yet.

@kir0ul
Copy link
Author

kir0ul commented Mar 15, 2022

Thanks! I've been able to use it on a single line plot but I'm struggling to use it on a plot with multiple lines and keep the same colors.
Would you have any hint or example on how to use linesfill() with multiple line using AlgebraOfGraphics syntax? Or maybe I should switch to plain Makie syntax?

I need to do something like the figure below.
image

And for now my code is looking like the following:

using DataFrames
using Random
using HypothesisTests
using Statistics
using AlgebraOfGraphics, CairoMakie
set_aog_theme!()

param_categories = [1, 2, 3, 4]

function get_data()
    N = 100
    n_runs = 10
    f(x, p) = exp.(x .+ p) ./ sum(exp.(x .+ p)) .+ rand(1) # some example function
    df_full = DataFrame(x = Int[], y = Float64[], param = Int[], run_nb = Int[])
    for (ip, param) in enumerate(param_categories)
        for idx = 1:n_runs
            x = collect(1:N)
            y = f(x, param)
            params = fill(param_categories[ip], N)
            run_nb = fill(idx, N)
            for jdx = 1:N
                push!(df_full, (x[jdx], y[jdx], params[jdx], run_nb[jdx]))
            end
        end
    end

    df_avg = DataFrame(x = Int[], y_avg = Float64[], param = Int[])
    lower, upper = zeros(length(param_categories), N), zeros(length(param_categories), N)
    for (ip, param) in enumerate(param_categories)
        y = zeros(N)
        for jdx = 1:N
            y[jdx] = mean(df_full.y[df_full.x.==jdx.&&df_full.param.==param, :])
            lower[ip, jdx], upper[ip, jdx] = confint(
                OneSampleTTest(vec(df_full.y[df_full.x.==jdx.&&df_full.param.==param, :])),
            )
        end
        x = collect(1:N)
        params = fill(param_categories[ip], N)
        for jdx = 1:N
            push!(df_avg, (x[jdx], y[jdx], params[jdx]))
        end
    end
    return df_full, df_avg, lower, upper
end


function plotting()
    df_full, df_avg, lower, upper = get_data()
    plt =
        data(df_avg) *
        mapping(:x, :y_avg => "y", color = :param => nonnumeric) *
        visual(Lines)
    fg = draw(plt)

    # Confidence interval band
    for (ip, param) in enumerate(param_categories)
        linesfill!(
            vec(df_avg.x[df_avg.param.==param, :]),
            vec(df_avg.y_avg[df_avg.param.==param, :]);
            lower = lower[ip, :],
            upper = upper[ip, :],
        )
    end
    return fg
end

fn = "My-awesome-figure.png"
save("$fn", plotting())
cmd = `xdg-open "$fn"`
run(cmd)

@greimel
Copy link
Collaborator

greimel commented Mar 16, 2022

Can you add some fake data to make your example runnable? Then it will be easier to help.

@kir0ul
Copy link
Author

kir0ul commented Mar 17, 2022

Can you add some fake data to make your example runnable? Then it will be easier to help.

Yes sorry, I updated the code above which you should now be able to run fully.

@Nosferican
Copy link

I am not sure of the difference between linesfill and band but would was able to get what I needed in Makie using band. Would be amazing to have that feature available in AoG. It's a killer feature in VegaLite and I use a lot so would love to have that available in Makie/AoG as well. This is an example,

using DataFrames, CairoMakie, AlgebraOfGraphics

df = DataFrame(x = 0:5, y = 0:5, lb = .√(0:5), ub = (0:5).^2)

f = Figure()
Axis(f[1, 1])

band!(
    df[!,:x],
    df[!,:lb],
    df[!,:ub],
    color = RGBAf(0.1, 0.1, 0.1, 0.05))
lines!(
    df[!,:x],
    df[!,:y],
    color = "black")
f

plot_51

@greimel
Copy link
Collaborator

greimel commented Mar 21, 2022

@Nosferican, @kir0ul here's how you would do it with AoG:

using AlgebraOfGraphics, CairoMakie
using DataFrames: DataFrame
using Chain: @chain
using DataFrameMacros: @transform

df = @chain begin
	DataFrame(x = [0:5; 0:5], sample = [fill(1, 6); fill(2, 6)])
	@transform(:y = :x + :sample)
	@transform(:lb = (:y), :ub = :y^2)
end

data(df) * visual(LinesFill) * mapping(
	:x, :y, 
	lower = :lb, upper = :ub,
	color = :sample => nonnumeric,
	layout = :sample => nonnumeric
) |> draw

image

(@Nosferican, thanks for providing a more minimal example.)

@Nosferican
Copy link

It may make sense to allow for a method w/o having to specify the y and just the lower and upper.

@greimel
Copy link
Collaborator

greimel commented Apr 12, 2022

That would be visual(Band).

@ParadaCarleton
Copy link

ParadaCarleton commented May 30, 2022

@Nosferican, @kir0ul here's how you would do it with AoG:

Is there a tutorial on how to do this with multiple bands? Looking to get a fan chart, e.g. like this:
image

@greimel
Copy link
Collaborator

greimel commented May 30, 2022

I wanted to implement this at some point but I didn't finish. I've just put the code into a draft PR greimel/AoGExtensions.jl#16 so that you have a look.

Here's an example that works already.

image

[Data]
raw_df = mapreduce(vcat, 1:500) do i
	T = 100
	drift = rand([-0.2, 0, 0.2])
	x = 1:T
	
	y = [randn()]
	for t  1:T-1
		push!(y, drift + y[end] + randn())
	end

	DataFrame(; x, y, i, drift)
end

What I still need to add is customization of breakpoints (these are currently hardcoded)

@ParadaCarleton
Copy link

ParadaCarleton commented May 30, 2022

This is great! I actually managed to get my own version almost working:

fan_chart = visual(LinesFill; color=:red) * (
    mapping(house_pop...; lower="5.0%", upper="95.0%") +
    mapping(house_pop...; lower="12.0%", upper="88.0%") +
    mapping(house_pop...; lower="27.0%", upper="73.0%")
) + visual(Lines; color=:black) * mapping(house_pop...)

The reason I say "Almost" is because the red lines (not the band) are visible from behind the black line. I'd also like to add a legend saying which interval is which. I can't get Band to work; it seems to use a different syntax from LinesFill.

@ParadaCarleton
Copy link

Ahh, I figured it out!

fan_chart = visual(Band; color=(:red, .3)) * (
    mapping(:year, "5.0%", "95.0%") +
    mapping(:year, "12.0%", "88.0%") +
    mapping(:year, "27.0%", "73.0%")
) + visual(Lines) * mapping(:year, "50.0%")

@ParadaCarleton
Copy link

ParadaCarleton commented Jun 3, 2022

I wanted to implement this at some point but I didn't finish. I've just put the code into a draft PR greimel/AoGExtensions.jl#16 so that you have a look.

Here's an example that works already.

[Data]
What I still need to add is customization of breakpoints (these are currently hardcoded)

Actually, do you know how I'd add a legend to this that tells users what each band means? (e.g. light yellow=68%, dark yellow = 95%, very dark = 99%)

@ParadaCarleton
Copy link

ParadaCarleton commented Jun 3, 2022

Also, would a continuous fan chart like this one be possible?
image

@ParadaCarleton
Copy link

Opened a new issue for fan charts here.

@greimel
Copy link
Collaborator

greimel commented Jun 14, 2022

Actually, do you know how I'd add a legend to this that tells users what each band means? (e.g. light yellow=68%, dark yellow = 95%, very dark = 99%)

The goal is to make this syntax work (note the alpha keyword).

visual(Band) * mapping(:year, :low, :high, alpha = :quantile)

At some point I was working a PR to achieve that, but I didn't finish it. I hope to be able to get back to this at some point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants