From bca3347de845e7adf0102a8f2e03ef179fd6a4df Mon Sep 17 00:00:00 2001
From: Nicholas Bauer
Date: Wed, 27 Mar 2024 15:54:28 -0400
Subject: [PATCH 1/3] Make sure first example uses type parameter

---
 docs/src/models/advanced.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/src/models/advanced.md b/docs/src/models/advanced.md
index fb36553788..f5154e6274 100644
--- a/docs/src/models/advanced.md
+++ b/docs/src/models/advanced.md
@@ -7,8 +7,8 @@ Here we will try and describe usage of some more advanced features that Flux pro
 Here is a basic example of a custom model. It simply adds the input to the result from the neural network.
 
 ```julia
-struct CustomModel
-  chain::Chain
+struct CustomModel{T <: Chain} # Parameter to avoid type instability (see "Multiple inputs" section below)
+  chain::T
 end
 
 function (m::CustomModel)(x)

From fc3624a872d9067ff3820a8a992cd16d37a2748a Mon Sep 17 00:00:00 2001
From: Nicholas Bauer
Date: Fri, 29 Mar 2024 18:41:10 -0400
Subject: [PATCH 2/3] Move performance explanation up and expand a bit.

---
 docs/src/models/advanced.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/src/models/advanced.md b/docs/src/models/advanced.md
index f5154e6274..57c4ae0e90 100644
--- a/docs/src/models/advanced.md
+++ b/docs/src/models/advanced.md
@@ -7,7 +7,7 @@ Here we will try and describe usage of some more advanced features that Flux pro
 Here is a basic example of a custom model. It simply adds the input to the result from the neural network.
 
 ```julia
-struct CustomModel{T <: Chain} # Parameter to avoid type instability (see "Multiple inputs" section below)
+struct CustomModel{T <: Chain} # Parameter to avoid type instability
   chain::T
 end
 
@@ -21,6 +21,7 @@ end
 # Call @layer to allow for training. Described below in more detail.
 Flux.@layer CustomModel
 ```
+Notice that we parameterized the type of the `chain` field. This is necessary for fast Julia code, so that the struct field can be given a concrete type. `Chain`s have a type parameter fully specifying the types of the layers they contain. By using a type parameter, we are freeing Julia to determine the correct concrete type, so that we do not need to specify the full, possibly quite long, type ourselves.
 
 You can then use the model like:
 
@@ -140,7 +141,7 @@ end
 # allow Join(op, m1, m2, ...) as a constructor
 Join(combine, paths...) = Join(combine, paths)
 ```
-Notice that we parameterized the type of the `paths` field. This is necessary for fast Julia code; in general, `T` might be a `Tuple` or `Vector`, but we don't need to pay attention to what it specifically is. The same goes for the `combine` field.
+Notice again that we parameterized the type of the `paths` field. In addition to the performance considerations of concrete types, this allows either field to be `Vector`s, `Tuple`s, or one of each - we don't need to pay attention to which.
 
 The next step is to use [`Flux.@layer`](@ref) to make our struct behave like a Flux layer. This is important so that calling `Flux.setup` on a `Join` maps over the underlying trainable arrays on each path.
 ```julia
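Reviewer's note, not part of the patch series: the claim in PATCH 2/3 that the type parameter is needed "for fast Julia code" can be checked directly. Below is a minimal sketch under the assumption of a recent Flux release; `AbstractField` and `ConcreteField` are hypothetical names used only for this comparison.

```julia
# Hypothetical structs contrasting an abstractly-typed field with a
# parameterized one; only the latter gives the field a concrete type.
using Flux

struct AbstractField              # `Chain` here is a UnionAll, not a concrete type
    chain::Chain
end

struct ConcreteField{T <: Chain}  # the parameter lets Julia pin the concrete type
    chain::T
end

c = Chain(Dense(2 => 3, relu), Dense(3 => 1))

isconcretetype(fieldtype(AbstractField, :chain))             # false
isconcretetype(fieldtype(typeof(ConcreteField(c)), :chain))  # true
```

Because the field type of `AbstractField` is abstract, code reading its `chain` field cannot be inferred to a concrete type, which is the instability the comment in the patch refers to.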
From b8419967ca3d4b9d7f6df86ff3796444d6e4d2bc Mon Sep 17 00:00:00 2001
From: Nicholas Bauer
Date: Fri, 29 Mar 2024 18:42:19 -0400
Subject: [PATCH 3/3] tweak

---
 docs/src/models/advanced.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/src/models/advanced.md b/docs/src/models/advanced.md
index 57c4ae0e90..cf2d1fedb3 100644
--- a/docs/src/models/advanced.md
+++ b/docs/src/models/advanced.md
@@ -141,7 +141,7 @@ end
 # allow Join(op, m1, m2, ...) as a constructor
 Join(combine, paths...) = Join(combine, paths)
 ```
-Notice again that we parameterized the type of the `paths` field. In addition to the performance considerations of concrete types, this allows either field to be `Vector`s, `Tuple`s, or one of each - we don't need to pay attention to which.
+Notice again that we parameterized the type of the `combine` and `paths` fields. In addition to the performance considerations of concrete types, this allows either field to be `Vector`s, `Tuple`s, or one of each - we don't need to pay attention to which.
 
 The next step is to use [`Flux.@layer`](@ref) to make our struct behave like a Flux layer. This is important so that calling `Flux.setup` on a `Join` maps over the underlying trainable arrays on each path.
 ```julia
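Reviewer's note, not part of the patch series: the `Join` layer that the last two hunks touch is defined earlier on the same docs page. As a sanity check on the reworded sentence, here is a self-contained sketch, reconstructed from that section of the page (treat it as illustrative rather than authoritative), showing that the parameterized fields accept a `Tuple` of paths, a `Vector` of paths, or any combine function, with no code changes:

```julia
using Flux

# Sketch of the custom Join layer from docs/src/models/advanced.md.
struct Join{T, F}
    combine::F
    paths::T
end

# allow Join(op, m1, m2, ...) as a constructor
Join(combine, paths...) = Join(combine, paths)

Flux.@layer Join

(m::Join)(xs::Tuple) = m.combine(map((f, x) -> f(x), m.paths, xs)...)
(m::Join)(xs...) = m(xs)

# `paths` stored as a Tuple, via the splatting constructor:
j_tuple = Join(vcat, Dense(2 => 3), Dense(4 => 3))
# `paths` stored as a Vector; the same layer code works unchanged:
j_vector = Join(vcat, [Dense(2 => 3), Dense(4 => 3)])

j_tuple(rand(Float32, 2), rand(Float32, 4))   # 6-element Vector{Float32}
j_vector(rand(Float32, 2), rand(Float32, 4))  # same shape
```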