Performance issue #72

Closed · Goysa2 opened this issue Jun 14, 2018 · 3 comments

@Goysa2 commented Jun 14, 2018

Hi!
I'm starting to use AutoGrad and I have a question about its performance compared to ReverseDiff.jl.

I have this basic setup:

using ReverseDiff, AutoGrad, BenchmarkTools

# Rosenbrock-style test function
function f(x)
    m = length(x)
    return 100.0 * sum((x[i] - x[i - 1]^2)^2 for i=2:m) + (1.0 - x[1])^2
end

n = 2
x = [0.150369, 0.8463333]
u = [0.284309, 0.927797]

With ReverseDiff I do:

g = Array{Any}(n)    # preallocated gradient buffer (Julia 0.6 syntax)
tape = ReverseDiff.compile(ReverseDiff.GradientTape(f, rand(n)))    # record and compile the tape once
F = x -> ReverseDiff.gradient!(g, tape, x)    # reuse the compiled tape on every call
@benchmark F(x) setup=(x=rand(2))

Which leads to:

BenchmarkTools.Trial: 
  memory estimate:  32 bytes
  allocs estimate:  2
  --------------
  minimum time:     527.174 ns (0.00% GC)
  median time:      539.411 ns (0.00% GC)
  mean time:        547.806 ns (0.19% GC)
  maximum time:     6.466 μs (88.03% GC)
  --------------
  samples:          10000
  evals/sample:     190

And with AutoGrad I do:

gradg = AutoGrad.grad(f, 1)    # gradient of f with respect to its first argument
@benchmark gradg(x) setup=(x=rand(2))

Which leads to:

BenchmarkTools.Trial: 
  memory estimate:  11.31 KiB
  allocs estimate:  299
  --------------
  minimum time:     37.077 μs (0.00% GC)
  median time:      38.893 μs (0.00% GC)
  mean time:        41.757 μs (3.26% GC)
  maximum time:     2.893 ms (95.67% GC)
  --------------
  samples:          10000
  evals/sample:     1
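
Both approaches should compute the same gradient, by the way; here is a quick check using the variables defined above (xtest is just an arbitrary test point):

xtest = rand(2)
g1 = ReverseDiff.gradient!(g, tape, xtest)    # compiled-tape gradient
g2 = gradg(xtest)                             # AutoGrad gradient
isapprox(Float64.(g1), Float64.(g2))          # should be true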

So AutoGrad is much slower than ReverseDiff. I assume this is because ReverseDiff lets me precompile a tape, which makes repeated gradient calls faster.

Is it possible to get a similar level of performance using AutoGrad?
Thanks!

@CarloLucibello (Collaborator)

This has been brought up before (also by myself, see #10). I guess the answer is no: tape compilation for AutoGrad is not going to happen in the near future. The performance impact is not really relevant for deep learning scenarios (see also PyTorch).

We can close this as a duplicate of #10.

@denizyuret (Owner)

Please also check out Zygote.jl and Capstan.jl as possible alternatives. AutoGrad supports Knet, and its overhead is negligible for deep learning models, so there are no immediate plans for tape compilation. Memory is a big concern, though (GPU memory is very limited), and any suggestions for improving memory efficiency are welcome. I am planning to try some memoization to avoid creating records of repeated operations (e.g. indexing) when I get a chance.
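
Roughly, the idea would be to key each recorded operation on (op, args) and reuse the existing record on a repeat. A minimal sketch of that memoization pattern (the Record type, record! function, and global records vector are purely illustrative, not AutoGrad's actual internals):

struct Record
    op::Function
    args::Tuple
    result::Any
end

const records = Record[]              # operations recorded on the tape, in order
const seen = Dict{Any,Record}()       # memo table keyed by (op, args)

function record!(op, args...)
    key = (op, args)
    haskey(seen, key) && return seen[key].result   # repeated op: reuse the record
    result = op(args...)                           # run the primitive
    r = Record(op, args, result)
    push!(records, r)                              # record it only once
    seen[key] = r
    return result
end

v = [1.0, 2.0, 3.0]
record!(getindex, v, 1)
record!(getindex, v, 1)               # cache hit, no new record
length(records)                       # == 1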

@Goysa2 (Author) commented Aug 16, 2018

Thanks for the feedback! I am not really into ML or deep learning; I am more interested in AD and its applications in optimization. I have looked into as many AD packages as possible, and it seems that the machine learning community has a lot of interest in AD.

I was mainly looking for speed and nested AD, and it seems that nested AD isn't the most popular feature.
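
For reference, this is the kind of nested AD I mean: differentiating a function that is itself produced by grad. A minimal sketch with AutoGrad (this assumes grad can be composed for higher-order derivatives, which is worth double-checking against the installed AutoGrad version):

using AutoGrad

h(x) = sin(x) * x^2

dh  = grad(h)      # first derivative of h
d2h = grad(dh)     # nested AD: differentiate the derivative itself

dh(1.0)            # ≈ cos(1) + 2*sin(1)
d2h(1.0)           # second derivative of h at 1.0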
