Memory leak when repeatedly benchmarking #339

Open
LilithHafner opened this issue Oct 21, 2023 · 5 comments
@LilithHafner
Contributor

I'm trying to benchmark sorting runtime as a function of input size. I have this function

using BenchmarkTools
function f(n)
    x = rand(Int, n)
    target = sort(x)
    y = copy(target)
    @belapsed sort!($y) setup=($y == $target || error("Bad sort"); copyto!($y, $x)) evals=1 gctrial=false samples=3
end

But that function leaks a significant amount of memory (because of JuliaLang/julia#14495).

If I run times = f.(1594323:1594323+100) (101 data points), it leaks about 3.56 GB, according to my OS's report of the Julia process's memory usage. Repeated runs continue to leak until my system crashes.

Is there a way to run @belapsed without leaking memory?
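
For comparison, here is a minimal sketch of a fallback that avoids the macro entirely by timing with Base's `@elapsed` in a plain loop. It is not BenchmarkTools' method and skips its tuning and statistics; the name `time_sort` and the `samples` keyword are hypothetical. Because nothing is `eval`ed per call, no new code is generated for each input size.

```julia
# Hypothetical helper: times sort! on a fresh random input of length n.
# Returns the minimum elapsed time in seconds over `samples` runs,
# mirroring what @belapsed reports.
function time_sort(n; samples=3)
    x = rand(Int, n)
    target = sort(x)
    y = similar(x)
    best = Inf
    for _ in 1:samples
        copyto!(y, x)                      # reset the input before each run
        t = @elapsed sort!(y)              # time a single evaluation
        y == target || error("Bad sort")   # same sanity check as in f
        best = min(best, t)
    end
    return best
end

times = time_sort.(1000:1010)
```

The first call may include compilation time for `sort!`; taking the minimum over several samples reduces that effect but does not remove it when `samples=1`.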

@gdalle
Collaborator

gdalle commented Oct 22, 2023

I am completely incapable of answering the question, so I'm hoping someone else will

@willow-ahrens
Collaborator

Could you copy the arguments into a WeakRef here to avoid the issue? I know the compiler wouldn't be able to constant-propagate the WeakRef'ed arguments, but I also think that avoiding the leak and constant-propagating might be mutually exclusive.

@willow-ahrens
Collaborator

We could also use a Ref or a length-1 vector to make it inferrable. I'm not sure if there's a method to empty a Ref so its contents can be garbage-collected later.
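
The mechanics of that idea can be sketched in plain Julia, without BenchmarkTools. This only shows how a container can later release a large array; whether interpolating such a container into `@belapsed` actually avoids the leak is not verified here. The names `big`, `slot`, and `box` are hypothetical.

```julia
# Hypothetical data standing in for a large benchmark argument.
big = rand(Int, 10^6)

# Option 1: a typed Ref (Base.RefValue{Vector{Int}}) keeps access inferrable.
slot = Ref(big)
sort!(slot[])        # work on the data through the Ref
slot[] = Int[]       # a typed Ref cannot be left empty, but it can be
                     # pointed at an empty vector to drop the large one

# Option 2: a length-1 vector used as a box.
box = [big]
sort!(box[1])
empty!(box)          # the box no longer references the data

# Drop the last remaining reference so the array becomes collectable.
big = nothing
GC.gc()
```

Anything that still holds `slot` or `box` afterwards retains only the small container, not the large array.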

@LilithHafner
Contributor Author

This leak happens even if we only interpolate an integer instead of a vector:

julia> using BenchmarkTools

julia> function g(n)
           @belapsed begin 
               x = rand()
               for i in 1:$n
                   x = hash(x)
               end
               x
           end evals=1 gctrial=false samples=1
       end
g (generic function with 1 method)

julia> # VIRT = 1.2g

julia> @time g.(1:1000);
 12.504336 seconds (15.92 M allocations: 997.643 MiB, 0.51% gc time, 88.27% compilation time)

julia> # VIRT = 1.5g

julia> @time g.(1:1000);
 12.604113 seconds (15.61 M allocations: 976.790 MiB, 0.54% gc time, 88.13% compilation time)

julia> # VIRT = 1.6g

julia> @time g.(1:1000);
 12.597098 seconds (15.61 M allocations: 976.055 MiB, 0.57% gc time, 87.90% compilation time)

julia> # VIRT = 1.8g

julia> @time g.(1:1000);
 12.814114 seconds (15.61 M allocations: 976.578 MiB, 0.60% gc time, 87.95% compilation time)

julia> # VIRT = 1.9g

julia> GC.gc()

julia> # VIRT = 1.9g

julia> @time g.(1:1000);
 12.808911 seconds (15.61 M allocations: 977.279 MiB, 0.53% gc time, 87.71% compilation time)

julia> # VIRT = 2.0g

The VIRT comments refer to the virtual memory usage of the Julia process as reported by top, which was running in a separate process.

This is because the generated code itself is leaked, not just the interpolated values that are constant-propagated into that code.
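
A simplified analogy, not BenchmarkTools' actual implementation: every evaluation of a function definition creates a new function type, even when the source expression is identical, and Julia does not free that generated code afterwards (see JuliaLang/julia#14495). The helper `make_adder` is hypothetical.

```julia
# Hypothetical helper: evaluates a fresh anonymous function at top level
# each time it is called, much like a macro that generates code per call.
make_adder(n) = Core.eval(Main, :(() -> $n + 1))

a = make_adder(1)
b = make_adder(1)

# Identical source, yet two distinct function types now exist.
same_type = typeof(a) === typeof(b)

# invokelatest sidesteps world-age issues when calling freshly eval'ed code.
result = Base.invokelatest(a)
```

Here `same_type` is `false`: each call to `make_adder` adds another definition that stays alive for the rest of the session, which matches the steady growth seen above even after `GC.gc()`.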

@willow-ahrens
Collaborator

Yes, the generated code itself will never be deallocated, but if you had benchmarks that reference very large matrices, for example, this would at least avoid keeping the matrices around. I thought I'd leave this approach here because it is helpful for those cases.
