Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling NaNs for divergent trajectories in Sobol Method #185

Open
duodenum96 opened this issue Jan 7, 2025 · 0 comments
Open

Handling NaNs for divergent trajectories in Sobol Method #185

duodenum96 opened this issue Jan 7, 2025 · 0 comments

Comments

@duodenum96
Copy link

duodenum96 commented Jan 7, 2025

Is your feature request related to a problem? Please describe.

Following from https://discourse.julialang.org/t/handling-divergent-trajectories-in-global-sensitivity-analysis/123186
I am having the following problem in a large parameter space. Let’s say I have two parameters A and B. If I constrain the range of A between 0 and 10 and B between 0 and 5, the sensitivity to A turns out higher. If I do the inverse, sensitivity to B is higher. If I make both of them between 0 and 10, then the solutions to differential equation diverge. I tried putting NaNs to divergent solutions but this impairs the subsequent calculations of sensitivity analysis since the software is not suited for dealing with NaN values (it returns NaNs when there is a NaN in the input).

Describe the solution you’d like

The solution is probably dropping NaNs.

Describe alternatives you’ve considered

So far I was unable to come up with another alternative.

Additional context

I gave it a try and made some changes in the sobol_sensitivity.jl module. The strategy I used is inspired by the conservative option in statsmodels.tsa.stattools.acf function in python statsmodels package (https://www.statsmodels.org/dev/_modules/statsmodels/tsa/stattools/_stattools.html#acf). The strategy is the following. Take the code

fA = all_y[((i - 1) * n + 1):(i * n)]

and rewrite it as

nan_mask = isnan.(all_y)
nonnan_mask = .!nan_mask
nonnan_idx = findall(nonnan_mask)

indices_fA = ((i - 1) * n + 1):(i * n)
if dropnan
    indices_fA = intersect(indices_fA, nonnan_idx)
end
fA = all_y[indices_fA]

When I rewrite gsa_sobol_all_y_analysis function with this strategy, the test code in sobol_method.jl passes all tests with no problem. Additionally, I wrote some test code that generates NaNs and it seems the algorithm doesn't return NaNs, instead it returns actual sensitivity values. I copy-pasted the gsa_sobol_all_y_analysis and my little test code below. But I honestly don't know if this makes sense or the results I get are correct results.

Below is a rewrite of gsa_sobol_all_y_analysis function. Note that I added a dropnan = false option.

function gsa_sobol_all_y_analysis(method, all_y::AbstractArray{T}, d, n, Ei_estimator,
        y_size, ::Val{multioutput}; dropnan = false) where {T, multioutput}
    if dropnan
        nan_mask = isnan.(all_y)
        nonnan_mask = .!nan_mask
        nonnan_idx = findall(nonnan_mask)
        if sum(nonnan_mask) == 0
            throw(ArgumentError("All values are NaN"))
        end
    end

    nboot = method.nboot
    Eys = multioutput ? Matrix{T}[] : T[]
    Varys = multioutput ? Matrix{T}[] : T[]
    Vᵢs = multioutput ? Matrix{T}[] : Vector{T}[]
    Vᵢⱼs = multioutput ? Array{T, 3}[] : Matrix{T}[]
    Eᵢs = multioutput ? Matrix{T}[] : Vector{T}[]
    step = 2 in method.order ? 2 * d + 2 : d + 2
    if !multioutput
        for i in 1:step:(step * nboot)
            indices = ((i - 1) * n + 1):((i + 1) * n)
            if dropnan
                indices = intersect(indices, nonnan_idx)
            end
            push!(Eys, mean(all_y[indices]))
            push!(Varys, var(all_y[indices]))

            indices_fA = ((i - 1) * n + 1):(i * n)
            indices_fB = (i * n + 1):((i + 1) * n)
            if dropnan
                indices_fA = intersect(indices_fA, nonnan_idx)
                indices_fB = intersect(indices_fB, nonnan_idx)
            end

            fA = all_y[indices_fA]
            fB = all_y[indices_fB]
            fAⁱ = []
            for j in (i + 1):(i + d)
                indices_fAⁱ = (j * n + 1):((j + 1) * n)
                if dropnan
                    indices_fAⁱ = intersect(indices_fAⁱ, nonnan_idx)
                end
                push!(fAⁱ, all_y[indices_fAⁱ])
            end
            
            if 2 in method.order
                fBⁱ = []
                for j in (i + d + 1):(i + 2 * d)
                    indices_fBⁱ = (j * n + 1):((j + 1) * n)
                    if dropnan
                        indices_fBⁱ = intersect(indices_fBⁱ, nonnan_idx)
                    end
                    push!(fBⁱ, all_y[indices_fBⁱ])
                end
            end

            push!(Vᵢs, [sum(fB .* (fAⁱ[k] .- fA)) for k in 1:d] ./ n)
            if 2 in method.order
                M = zeros(T, d, d)
                for k in 1:d
                    for j in (k + 1):d
                        M[k, j] = sum((fBⁱ[k] .* fAⁱ[j]) .- (fA .* fB)) / n
                    end
                end
                push!(Vᵢⱼs, M)
            end
            if Ei_estimator === :Homma1996
                push!(Eᵢs,
                    [sum((fA .- (sum(fA) ./ n)) .^ 2) ./ (n - 1) .-
                     sum(fA .* fAⁱ[k]) ./ (n) + (sum(fA) ./ n) .^ 2 for k in 1:d])
            elseif Ei_estimator === :Sobol2007
                push!(Eᵢs, [sum(fA .* (fA .- fAⁱ[k])) for k in 1:d] ./ (n))
            elseif Ei_estimator === :Jansen1999
                push!(Eᵢs, [sum(abs2, fA - fAⁱ[k]) for k in 1:d] ./ (2n))
            elseif Ei_estimator === :Janon2014
                push!(Eᵢs,
                    [(sum(fA .^ 2 + fAⁱ[k] .^ 2) ./ (2n) .-
                      (sum(fA + fAⁱ[k]) ./ (2n)) .^ 2) * (1.0 .-
                      (1 / n .* sum(fA .* fAⁱ[k])
                       .-
                       (1 / n .* sum((fA .+ fAⁱ[k]) ./ 2)) .^ 2) ./
                      (1 / n .* sum((fA .^ 2 .+ fAⁱ[k] .^ 2) ./ 2) -
                       (1 / n .* sum((fA .+ fAⁱ[k]) ./ 2)) .^ 2))
                     for k in 1:d])
            end
        end
    else # multioutput case
        for i in 1:step:(step * nboot)
            indices = ((i - 1) * n + 1):((i + 1) * n)
            if dropnan
                indices = intersect(indices, nonnan_idx)
            end

            push!(Eys, mean(all_y[:, indices], dims = 2))
            push!(Varys, var(all_y[:, indices], dims = 2))

            indices_fA = ((i - 1) * n + 1):(i * n)
            indices_fB = (i * n + 1):((i + 1) * n)

            if dropnan
                indices_fA = intersect(indices_fA, nonnan_idx)
                indices_fB = intersect(indices_fB, nonnan_idx)
            end

            fA = all_y[:, indices_fA]
            fB = all_y[:, indices_fB]
            fAⁱ = []
            for j in (i + 1):(i + d)
                indices = (j * n + 1):((j + 1) * n)
                if dropnan
                    indices = intersect(indices, nonnan_idx)
                end
                push!(fAⁱ, all_y[:, indices])
            end

            if 2 in method.order
                fBⁱ = []
                for j in (i + d + 1):(i + 2 * d)
                    indices = (j * n + 1):((j + 1) * n)
                    if dropnan
                        indices = intersect(indices, nonnan_idx)
                    end
                    push!(fBⁱ, all_y[:, indices])
                end
            end

            push!(Vᵢs,
                reduce(hcat, [sum(fB .* (fAⁱ[k] .- fA), dims = 2) for k in 1:d] ./ n))

            if 2 in method.order
                M = zeros(T, d, d, length(Eys[1]))
                for k in 1:d
                    for j in (k + 1):d
                        Vₖⱼ = sum((fBⁱ[k] .* fAⁱ[j]) .- (fA .* fB), dims = 2) / n
                        for l in 1:length(Eys[1])
                            M[k, j, l] = Vₖⱼ[l]
                        end
                    end
                end
                push!(Vᵢⱼs, M)
            end
            if Ei_estimator === :Homma1996
                push!(Eᵢs,
                    reduce(hcat,
                        [sum((fA .- (sum(fA, dims = 2) ./ n)) .^ 2, dims = 2) ./ (n - 1) .-
                         sum(fA .* fAⁱ[k], dims = 2) ./ (n) + (sum(fA, dims = 2) ./ n) .^ 2
                         for k in 1:d]))
            elseif Ei_estimator === :Sobol2007
                push!(Eᵢs,
                    reduce(hcat,
                        [sum(fA .* (fA .- fAⁱ[k]), dims = 2) for k in 1:d] ./ (n)))
            elseif Ei_estimator === :Jansen1999
                push!(Eᵢs,
                    reduce(hcat, [sum(abs2, fA - fAⁱ[k], dims = 2) for k in 1:d] ./ (2n)))
            elseif Ei_estimator === :Janon2014
                push!(Eᵢs,
                    reduce(hcat,
                        [(sum(fA .^ 2 + fAⁱ[k] .^ 2, dims = 2) ./ (2n) .-
                          (sum(fA + fAⁱ[k], dims = 2) ./ (2n)) .^ 2) .* (1.0 .-
                          (1 / n .* sum(fA .* fAⁱ[k], dims = 2) .-
                           (1 / n * sum((fA .+ fAⁱ[k]) ./ 2, dims = 2)) .^ 2) ./
                          (1 / n .* sum((fA .^ 2 .+ fAⁱ[k] .^ 2) ./ 2, dims = 2) .-
                           (1 / n * sum((fA .+ fAⁱ[k]) ./ 2, dims = 2)) .^ 2)) for k in 1:d]))
            end
        end
    end
    if 2 in method.order
        Sᵢⱼs = similar(Vᵢⱼs)
        for i in 1:nboot
            if !multioutput
                M = zeros(T, d, d)
                for k in 1:d
                    for j in (k + 1):d
                        M[k, j] = Vᵢs[i][k] + Vᵢs[i][j]
                    end
                end
                Sᵢⱼs[i] = (Vᵢⱼs[i] - M) ./ Varys[i]
            else
                M = zeros(T, d, d, length(Eys[1]))
                for l in 1:length(Eys[1])
                    for k in 1:d
                        for j in (k + 1):d
                            M[k, j, l] = Vᵢs[i][l, k] + Vᵢs[i][l, j]
                        end
                    end
                end
                Sᵢⱼs[i] = cat(
                    [(Vᵢⱼs[i][:, :, l] - M[:, :, l]) ./ Varys[i][l]
                     for l in 1:length(Eys[1])]...;
                    dims = 3)
            end
        end
    end

    Sᵢs = [Vᵢs[i] ./ Varys[i] for i in 1:nboot]
    Tᵢs = [Eᵢs[i] ./ Varys[i] for i in 1:nboot]
    if nboot > 1
        size_ = size(Sᵢs[1])
        S1 = [[Sᵢ[i] for Sᵢ in Sᵢs] for i in 1:length(Sᵢs[1])]
        ST = [[Tᵢ[i] for Tᵢ in Tᵢs] for i in 1:length(Tᵢs[1])]

        function calc_ci(x, mean = nothing)
            alpha = (1 - method.conf_level)
            std(x, mean = mean) / sqrt(length(x))
        end
        S1_CI = map(calc_ci, S1)
        ST_CI = map(calc_ci, ST)

        if 2 in method.order
            size__ = size(Sᵢⱼs[1])
            S2_CI = Array{T}(undef, size__)
            Sᵢⱼ = Array{T}(undef, size__)
            b = getindex.(Sᵢⱼs, 1)
            Sᵢⱼ[1] = b̄ = mean(b)
            S2_CI[1] = calc_ci(b, b̄)
            for i in 2:length(Sᵢⱼs[1])
                b .= getindex.(Sᵢⱼs, i)
                Sᵢⱼ[i] = b̄ = mean(b)
                S2_CI[i] = calc_ci(b, b̄)
            end
        end
        Sᵢ = reshape(mean.(S1), size_...)
        Tᵢ = reshape(mean.(ST), size_...)
    else
        Sᵢ = Sᵢs[1]
        Tᵢ = Tᵢs[1]
        if 2 in method.order
            Sᵢⱼ = Sᵢⱼs[1]
        end
    end
    if isnothing(y_size)
        _Sᵢ = Sᵢ
        _Tᵢ = Tᵢ
    else
        f_shape = let y_size = y_size
            x -> [reshape(x[:, i], y_size) for i in 1:size(x, 2)]
        end
        _Sᵢ = f_shape(Sᵢ)
        _Tᵢ = f_shape(Tᵢ)
    end
    return SobolResult(_Sᵢ,
        nboot > 1 ? reshape(S1_CI, size_...) : nothing,
        2 in method.order ? Sᵢⱼ : nothing,
        nboot > 1 && 2 in method.order ? S2_CI : nothing,
        _Tᵢ,
        nboot > 1 ? reshape(ST_CI, size_...) : nothing)
end

And the following is a small test code that generates NaNs and runs gsa.

# Tests for nan handling
# Make a differential equation problem that can be divergent. 

using Revise, Test, OrdinaryDiffEq, GlobalSensitivity, Plots, QuasiMonteCarlo, Statistics

function f(du, u, p, t)
    du .= p[1] * u
    return du
end

# Single output case

u0 = [1.0]
p = [50] # diverges
tspan = (0.0, 20.0)
t = collect(range(0, stop = 10, length = 200))
prob = ODEProblem(f, u0, tspan, p)

f1 = let prob = prob, t = t
    function (p)
        prob1 = remake(prob; p = p)
        sol = solve(prob1, Tsit5(); saveat = t)
        if sol.retcode == ReturnCode.Success
            return mean(Array(sol))
        else
            return NaN
        end
    end
end

samples = 100
lb = [20]
ub = [70]
sampler = SobolSample()
A, B = QuasiMonteCarlo.generate_design_matrices(samples, lb, ub, sampler)

m = gsa(f1, Sobol(), A, B; dropnan = true)
@test !any(isnan.(m.S1))
@test !any(isnan.(m.ST))

# Multi output case

f1 = let prob = prob, t = t
    function (p)
        prob1 = remake(prob; p = p)
        sol = solve(prob1, Tsit5(); saveat = t)
        if sol.retcode == ReturnCode.Success
            return [mean(Array(sol)), std(Array(sol))]
        else
            return [NaN, NaN]
        end
    end
end

m = gsa(f1, Sobol(), A, B; dropnan = true)

@test !any(isnan.(m.S1))
@test !any(isnan.(m.ST))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant