Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HartleyRao for SurveyDesign #227

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions src/Survey.jl
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ include("boxplot.jl")
include("show.jl")
include("ratio.jl")
include("by.jl")
include("hr.jl")

export load_data
export AbstractSurveyDesign, SurveyDesign, ReplicateDesign
Expand All @@ -35,5 +36,6 @@ export hist, sturges, freedman_diaconis
export boxplot
export bootweights
export ratio
export HartleyRao

end
10 changes: 10 additions & 0 deletions src/hr.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
"""
Hartley Rao Approximation of the variance of the Horvitz-Thompson Estimator.
Avoids exhaustively calculating joint (inclusion) probabilities πᵢⱼ of the sampling scheme.
"""
function HartleyRao(y::AbstractVector, design::SurveyDesign, HT_total)
πᵢ = design.data[!,design.allprobs]
hartley_rao_var = sum((1 .- ((design.data[!,design.sampsize] .- 1) ./ design.data[!,design.sampsize]) .* πᵢ) .*
((y .* design.data[!,design.weights]) .- (HT_total ./ design.data[!,design.sampsize])) .^ 2)
return hartley_rao_var
end
3 changes: 2 additions & 1 deletion src/total.jl
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,8 @@ julia> total(:api00, dclus1)
"""
function total(x::Symbol, design::SurveyDesign)
X = wsum(design.data[!, x], weights(design.data[!, design.weights]))
DataFrame(total = X)
variance = HartleyRao(design.data[!, x], design, X)
DataFrame(total = X, SE = sqrt(variance))
end

"""
Expand Down
12 changes: 12 additions & 0 deletions test/hr.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
@testset "hartley_rao" begin
### I havent been able to reproduce below in R survey, which give somewhat different SE estimates
# base functionality SRS
tot = total(:api00, srs)
@test tot.SE[1] ≈ 57154.1 rtol = STAT_TOL
# Stratified
tot = total(:api00, dstrat)
@test tot.SE[1] ≈ 699405 rtol = STAT_TOL
# one stage cluster
tot = total(:api00, dclus1)
@test tot.SE[1] ≈ 4880240 rtol = STAT_TOL
end
1 change: 1 addition & 0 deletions test/runtests.jl
Original file line number Diff line number Diff line change
Expand Up @@ -35,3 +35,4 @@ include("hist.jl")
include("boxplot.jl")
include("ratio.jl")
include("show.jl")
include("hr.jl")