Add a macro to opt-in to fancy printing, and to everything else #1932
Conversation
Looks reasonable, so if you'd let me indulge in some light bikeshedding: how about
Not at all set on the name. Another question is whether there should be a matching way to disable recursion. Overloading
If this needs a public face, then perhaps it should be more like
We could also merge this PR now since it seems ready and expose that interface later
6370374 upgrades this to a replacement for `@functor`. Still WIP, but roughly works, I think.
src/layers/macro.jl
Outdated
* In fact you can provide an arbitrary keyword with this syntax, and it will
  overload this function à la `trainable`... that might be a terrible idea.
As a start, we could limit the allowable keyword list.
Yes. Nice to pick a notation that can grow if desired though.
src/layers/macro.jl
Outdated
```julia
@layer Dense
@layer :expand Chain
@layer BatchNorm trainable=(β,γ)
@layer Struct functor=(α,β) trainable=(β,)
```
While we're at it, perhaps `functor` could be aliased to something less obscure? `params` might be a name conflict but is the most familiar (perhaps `parameters`? Too cheeky?). `buffers` is close to PyTorch but still kind of obscure.
Why not `leaves` or `children`?
Could do. I guess I wanted to match the function name, and if this one is obscure, that's OK; you should almost never use it. `children` does match a function name.
`children` also has the benefit of having the same signature as `trainable`: `() -> Union{Tuple,NamedTuple,Array,...}` instead of `() -> Tuple{Union{Tuple,NamedTuple,Array,...}, Function}` for `functor`.
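For concreteness, a small sketch of the two return shapes (not from the PR; this is just the existing Functors.jl API applied to an example layer):

```julia
using Flux, Functors

d = Dense(2 => 3)

Functors.children(d)  # (weight = ..., bias = ..., σ = identity): just the child values
Functors.functor(d)   # ((weight = ..., bias = ..., σ = identity), re): children plus a rebuild function
```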
Latest version uses `children` as the keyword.
```julia
function _check_new_macro(x::T) where T
  Functors.isleaf(x) && return
  Base.depwarn("This type should probably now use `Flux.@layer` instead of `@functor`: $T", Symbol("@functor"))
end
```
I toned this warning down from `@warn`. I believe this makes the PR non-breaking, so it could be in a 0.13 release. Possibly 0.14 should always warn?
A related question: currently many people use `Flux.@functor`. We could define such a macro as a pass-through to `@layer`, which will print a depwarn once on the macro call. Should we?
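A minimal sketch of what such a pass-through might look like, assuming it lives in Flux and simply forwards its arguments (this is not the PR's code):

```julia
# Hypothetical pass-through: `@functor` warns once at macro-expansion time,
# then forwards its arguments unchanged to `@layer`.
macro functor(args...)
    Base.depwarn("`Flux.@functor` is deprecated, please use `Flux.@layer` instead.",
                 Symbol("@functor"))
    esc(:(Flux.@layer $(args...)))
end
```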
src/layers/macro.jl
Outdated
```julia
else
  remake(nt) = Base.typename(T).wrapper(map(f -> f in which ? getfield(nt, f) : getfield(x, f), fieldnames(T))...)
  NamedTuple{which}(map(s -> getfield(x, s), which)), remake
end
```
The path when you specify `children` isn't as fast. Maybe this doesn't matter for now, since no layer within Flux does so.
This may be why Functors.jl uses `@eval`. A generated block seems fine though.
For sure it can be done with a `@generated` block. I was just too lazy to actually write, check, and benchmark it.
Can this be done with `@generated`? It seems you can't define closures in generated functions, and we need to close over `x`.
I guess that's true. But perhaps it can instead return a struct like `Remake{(:some, :fields)}(x, constructor)` which gets some fields from `x` and some from what it's called on. But I didn't try.
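An untested sketch of that idea; the name `Remake`, its parameters, and the field-splitting rule follow the snippet above but are otherwise invented:

```julia
# `which` lists the fields that should come from the NamedTuple the object is
# called on; every other field is copied from the stored original `x`.
struct Remake{which,T}
    x::T
end
Remake{which}(x::T) where {which,T} = Remake{which,T}(x)

function (r::Remake{which})(nt) where {which}
    T = typeof(r.x)
    vals = map(f -> f in which ? getfield(nt, f) : getfield(r.x, f), fieldnames(T))
    Base.typename(T).wrapper(vals...)
end

# e.g. Remake{(:β, :γ)}(bn)((β = newβ, γ = newγ)) rebuilds a BatchNorm-like struct
```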
I'm also horrified by the number of ways to customise recursive behaviour we have, and I wonder if this one should just be forbidden.
Clever, let me try that.
Wait, isn't returning a `Remake` struct the same as returning a closure (in terms of performance, not whether it can be in a generated function)? I thought the benefit here was that you can manually write out the iteration accessing the fields.
Yes, I would guess so. If necessary, its call could itself be a generated thing, but quite likely some `ntuple` / `map` would just compile down.
Not precisely sure what list of fields this thing ought to be passed: the ones you get, or the ones you don't, or some mix such that `merge` sorts it out?
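For the `merge` option, the rough idea would presumably look like this (values purely illustrative):

```julia
# Keep every field's current value in one NamedTuple, and let the NamedTuple of
# overrides win for the fields it actually supplies; `merge` keeps later keys.
current  = (weight = [1.0 2.0], bias = [0.0], σ = identity)
override = (bias = [5.0],)
merge(current, override)  # (weight = [1.0 2.0], bias = [5.0], σ = identity)
```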
```diff
-@functor BatchNorm
-trainable(bn::BatchNorm) = hasaffine(bn) ? (β = bn.β, γ = bn.γ) : (;)
+@layer BatchNorm trainable=(β,γ)
```
Normalisation layers all have a permanent `trainable`, but the fields are `nothing` when not in use; should be safe?
Yes. Really, the affine params should be wrapped up in a `Scale` sublayer like LayerNorm has, but that's a PR for another day.
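As a rough mental model only (an assumption, not the macro's literal expansion), the `trainable=(β,γ)` keyword would generate something like:

```julia
# Hypothetical expansion: restrict the trainable children of BatchNorm to β and γ.
Flux.trainable(bn::BatchNorm) = (β = bn.β, γ = bn.γ)
```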
```diff
@@ -1,22 +1,30 @@
+@nospecialize # just for this file, for startup time
```
Sometimes `show` is visibly slow on first loading. This doesn't really help much: a test with `@time show(stdout, MIME"text/plain"(), model);` goes from 2.28s to 2.15s.
````diff
@@ -216,7 +216,7 @@ m(5) # => 26
 Flux provides a set of helpers for custom layers, which you can enable by calling

 ```julia
-Flux.@functor Affine
+Flux.@layer Affine
````
"set of helpers for custom layers" is more accurate after than before. Where else would docs need changing?
https://github.com/FluxML/Flux.jl/blob/master/docs/src/models/advanced.md is the big one, off the top of my head.
I had a go at editing that. It's quite a crazy scheme we have going, where `trainable`, `functor`, and `adapt` define 3 different recursions. Do we really need two ways to customise `functor`?
Doctest failure looks real, but I'm not sure why it's only showing up now.
Doc failure seems to be that this doesn't work on 1.6, where the first argument is a vector:
Downstream tests didn't run, not sure why.
Failures look unrelated:
So just the doctest fixup for 1.6?
Codecov Report

```
@@            Coverage Diff             @@
##           master    #1932       +/-  ##
===========================================
+ Coverage   42.55%   73.91%   +31.35%
===========================================
  Files          31       32        +1
  Lines        1786     1909      +123
===========================================
+ Hits          760     1411      +651
+ Misses       1026      498      -528
```

View full report in Codecov by Sentry.
Rebased in 2024. We should just do this, right? The biggest question is: should this support `children` at all? Or can we simply forbid that... do we really need 4 ways to customise things, à la #2385? Maybe if you want to hack things with Functors.jl directly you are welcome, but Flux.jl can pretend this doesn't exist.
I agree. Given that you almost never want to set
OK, this should be ready to merge, I think. As always there may be further cleaning up of the docs possible.
Let's do this.
Originally, `@big_show`

This adds a macro which lets you tell Flux to treat some container type much the same way as it treats `Chain`, starting the recursive printing walk over all children. Prompted by #1929. Closes #2044.
Metalhead also has `show` methods for top-level layers which enable the recursive printing, which could be replaced by this macro:
https://github.com/FluxML/Metalhead.jl/blob/aba6fb832093d88dc2d2b4d5b1d2d63a0f21eb9c/src/Metalhead.jl#L53-L57
https://github.com/FluxML/Metalhead.jl/blob/c8f0a88e4d24274c53b62c2b740822aa6a781709/src/utilities.jl#L49-L52
Searching for other uses on github, I see only two:
https://github.com/avik-pal/ExplicitFluxLayers.jl/blob/8e1cff447afda225bc12144ff27ae1370e8fc3da/src/show_layers.jl
https://github.com/maxfreu/SegmentationModels.jl/blob/7bfdbaa438910baf49543f03f2931de765dbd761/src/unet.jl#L99-L112
Later, `@layer`

Now aims to also replace `@functor`, in addition to handling `show`. Discussed at #2028 -- a layer supertype is another way to make opting into pretty printing easier. (We wouldn't do both, so this closes #2028.)
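As a usage sketch of the opt-in described above (the container type below is invented for illustration; only the `@layer :expand` syntax comes from this PR):

```julia
using Flux

# A custom two-branch container; nothing here is special to Flux yet.
struct TwoBranch{A,B}
    up::A
    down::B
end
(m::TwoBranch)(x) = m.up(x) .+ m.down(x)

# Opt in: treat TwoBranch like Chain, expanding its children when printed.
Flux.@layer :expand TwoBranch

model = TwoBranch(Dense(3 => 4, relu), Dense(3 => 4))
show(stdout, MIME"text/plain"(), model)  # prints the two Dense children on their own lines
```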
PR Checklist