Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First test scenarios for Flux gradients #352

Merged
merged 22 commits into from
Jul 17, 2024

Conversation

nialamarcotte
Copy link
Contributor

Work done during JuliaCon Hackathon; preparing test for neural networks

@gdalle
Copy link
Member

gdalle commented Jul 13, 2024

Thanks so much for your contribution! It may take a few days but I'll get back to you, give you some feedback on the code and take it from here if that's okay. I'm sorry that the task was so daunting, but very grateful that you agreed to give it a try :) And if you need any more resources on AD, let me know

@gdalle gdalle linked an issue Jul 15, 2024 that may be closed by this pull request
@codecov-commenter
Copy link

codecov-commenter commented Jul 16, 2024

Codecov Report

Attention: Patch coverage is 98.23009% with 2 lines in your changes missing coverage. Please review.

Project coverage is 96.58%. Comparing base (63fa856) to head (ceeb25f).
Report is 2 commits behind head on main.

Files Patch % Lines
DifferentiationInterfaceTest/src/utils/misc.jl 71.42% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #352      +/-   ##
==========================================
+ Coverage   96.57%   96.58%   +0.01%     
==========================================
  Files          98       99       +1     
  Lines        4813     4888      +75     
==========================================
+ Hits         4648     4721      +73     
- Misses        165      167       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gdalle gdalle changed the title Flux-Zygote : creating functions for test scenario First test scenarios for Flux gradients Jul 16, 2024
@gdalle
Copy link
Member

gdalle commented Jul 16, 2024

CI

  • Adjust groups

DIT source

  • Support custom isequal function in correctness tests, in addition to isapprox
  • Adjust mycopy_random and mysimilar for arbitrary structures with Functors

DIT extensions

DIT tests

  • Test new Flux scenarios with Zygote

DI extensions

  • In the Enzyme extension, support non-array inputs in reverse mode:
    • Relax type constraints
    • Avoid querying eltype(x), backpropagate dy=true instead

DI tests

  • Test new Flux scenarios with Zygote (Enzyme fails)

@gdalle
Copy link
Member

gdalle commented Jul 16, 2024

@CarloLucibello I got the first batch of tests working with Zygote. The syntax is a bit counter-intuitive because DI doesn't yet support multiple arguments, so I have to put the data x inside the loss closure, but it gets the job done for now.
The Enzyme tests error on two cases though, the ones involving Flux.Scale and Flux.SkipConnection. Any idea what could be different about those two? @wsmoses could this be one of the cases where we want to Duplicate the function f itself for the time being?
What other backends do we want to test here? Tracker?

@wsmoses
Copy link

wsmoses commented Jul 16, 2024

Actually this is probably a case of exactly the opposite in which doing so would be harmful. Marking the closure as differentiable would mean that Enzyme would also end up computing the derivative of the input [and then throwing it away].

@wsmoses
Copy link

wsmoses commented Jul 16, 2024

Per the error message, it looks like the closure is causing issues where it can't be compiled due to mixed activity [e.g. one element of the closure being differentiable the other not]. As the error message states, you can enable runtime activity to resolve at a slight [but not asymptotic performance loss]. But also using multiple arguments here in Enzyme directly would also presumably resolve this

@gdalle
Copy link
Member

gdalle commented Jul 16, 2024

Per the error message, it looks like the closure is causing issues where it can't be compiled due to mixed activity [e.g. one element of the closure being differentiable the other not]. As the error message states, you can enable runtime activity to resolve at a slight [but not asymptotic performance loss].

I tried enabling runtime activity but the stack traces just got angrier 😱

But also using multiple arguments here in Enzyme directly would also presumably resolve this

Agreed, but setting this up in the single-argument case also helped me resolve some other issues to pave the way for non-array testing, so it's a great first step. For the multi-argument case, what I need most of all is inspiration and ideas to improve my initial API proposal at #311 (comment). The idea is to be slightly different than Enzyme, and closer to the mathematical concepts that users may be familiar with.

@gdalle
Copy link
Member

gdalle commented Jul 17, 2024

Merging this without the Enzyme tests because there's already some useful stuff in there. Billy if you have a clue what the problem was in https://github.com/gdalle/DifferentiationInterface.jl/actions/runs/9963932557/job/27531092106 with runtime activity switched on, I'd love to hear your thoughts

@gdalle gdalle merged commit f4ba734 into JuliaDiff:main Jul 17, 2024
49 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add neural network gradient tests [Juliacon 2024 Hackathon]
4 participants