Performance issue #72

Closed · Goysa2 opened this issue Jun 14, 2018 · 3 comments

@Goysa2 commented Jun 14, 2018

Hi!
I'm starting to use AutoGrad and I have a question about its performance compared to ReverseDiff.jl.

I have this basic setup:

using ReverseDiff, AutoGrad, BenchmarkTools

# Rosenbrock-style test function
function f(x)
    m = length(x)
    return 100.0 * sum((x[i] - x[i - 1]^2)^2 for i=2:m) + (1.0 - x[1])^2
end

n = 2
x = [0.150369, 0.8463333]
u = [0.284309, 0.927797]

With ReverseDiff I do:

g = Array{Any}(n)    # preallocated gradient buffer (Julia 0.6 syntax)
tape = ReverseDiff.compile(ReverseDiff.GradientTape(f, rand(n)))    # record and compile the tape once
F = x -> ReverseDiff.gradient!(g, tape, x)    # reuse the compiled tape on every call
@benchmark F(x) setup=(x=rand(2))

Which leads to:

BenchmarkTools.Trial: 
  memory estimate:  32 bytes
  allocs estimate:  2
  --------------
  minimum time:     527.174 ns (0.00% GC)
  median time:      539.411 ns (0.00% GC)
  mean time:        547.806 ns (0.19% GC)
  maximum time:     6.466 μs (88.03% GC)
  --------------
  samples:          10000
  evals/sample:     190

And with AutoGrad I do:

gradg = AutoGrad.grad(f, 1)    # gradient of f with respect to its first argument
@benchmark gradg(x) setup=(x=rand(2))

Which leads to:

BenchmarkTools.Trial: 
  memory estimate:  11.31 KiB
  allocs estimate:  299
  --------------
  minimum time:     37.077 μs (0.00% GC)
  median time:      38.893 μs (0.00% GC)
  mean time:        41.757 μs (3.26% GC)
  maximum time:     2.893 ms (95.67% GC)
  --------------
  samples:          10000
  evals/sample:     1
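
Both approaches should compute the same gradient, by the way; here is a quick check using the variables defined above (xtest is just an arbitrary test point):

xtest = rand(2)
g1 = ReverseDiff.gradient!(g, tape, xtest)    # compiled-tape gradient
g2 = gradg(xtest)                             # AutoGrad gradient
isapprox(Float64.(g1), Float64.(g2))          # should be true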

So AutoGrad is much slower than ReverseDiff. I assume this is because ReverseDiff lets me precompile a tape, which makes repeated gradient calls faster.

Is it possible to get a similar level of performance using AutoGrad?
Thanks!

@CarloLucibello (Collaborator)

This has been brought up before (also by myself, see #10). I guess the answer is no: tape compilation for AutoGrad is not going to happen in the near future. The performance impact is not really relevant for deep learning scenarios (see also PyTorch).

We can close this as a duplicate of #10.

@denizyuret (Owner)

Please also check out Zygote.jl and Capstan.jl as possible alternatives. AutoGrad supports Knet, and its overhead is negligible for deep learning models, so there are no immediate plans for tape compilation. Memory is a big concern, though (GPU memory is very limited), and any suggestions for improving memory efficiency are welcome. I am planning to try some memoization to avoid creating records of repeated operations (e.g. indexing) when I get a chance.
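
Roughly, the idea would be to key each recorded operation on (op, args) and reuse the existing record on a repeat. A minimal sketch of that memoization pattern (the Record type, record! function, and global records vector are purely illustrative, not AutoGrad's actual internals):

struct Record
    op::Function
    args::Tuple
    result::Any
end

const records = Record[]              # operations recorded on the tape, in order
const seen = Dict{Any,Record}()       # memo table keyed by (op, args)

function record!(op, args...)
    key = (op, args)
    haskey(seen, key) && return seen[key].result   # repeated op: reuse the record
    result = op(args...)                           # run the primitive
    r = Record(op, args, result)
    push!(records, r)                              # record it only once
    seen[key] = r
    return result
end

v = [1.0, 2.0, 3.0]
record!(getindex, v, 1)
record!(getindex, v, 1)               # cache hit, no new record
length(records)                       # == 1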

@Goysa2 (Author) commented Aug 16, 2018

Thanks for the feedback! I am not really into ML or deep learning; I am more interested in AD and its applications in optimization. I have looked into as many AD packages as possible, and it seems that the machine learning community has a lot of interest in AD.

I was mainly looking for speed and nested AD, and it seems that nested AD isn't the most popular feature.
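
For reference, this is the kind of nested AD I mean: differentiating a function that is itself produced by grad. A minimal sketch with AutoGrad (this assumes grad can be composed for higher-order derivatives, which is worth double-checking against the installed AutoGrad version):

using AutoGrad

h(x) = sin(x) * x^2

dh  = grad(h)      # first derivative of h
d2h = grad(dh)     # nested AD: differentiate the derivative itself

dh(1.0)            # ≈ cos(1) + 2*sin(1)
d2h(1.0)           # second derivative of h at 1.0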
