Cleanup for v0.14 release #2283

Merged · 11 commits · Jul 12, 2023
2 changes: 1 addition & 1 deletion README.md
@@ -18,7 +18,7 @@

Flux is an elegant approach to machine learning. It's a 100% pure-Julia stack, and provides lightweight abstractions on top of Julia's native GPU and AD support. Flux makes the easy things easy while remaining fully hackable.

Works best with [Julia 1.8](https://julialang.org/downloads/) or later. Here's a very short example to try it out:
Works best with [Julia 1.9](https://julialang.org/downloads/) or later. Here's a very short example to try it out:
```julia
using Flux, Plots
data = [([x], 2x-x^3) for x in -2:0.1f0:2]
13 changes: 12 additions & 1 deletion docs/src/gpu.md
@@ -1,11 +1,22 @@
# GPU Support

NVIDIA GPU support should work out of the box on systems with CUDA and CUDNN installed. For more details see the [CUDA.jl](https://github.com/JuliaGPU/CUDA.jl) readme.
Starting with v0.14, Flux no longer forces a specific GPU backend and its package dependencies on users.
Thanks to the [package extension mechanism](https://pkgdocs.julialang.org/v1/creating-packages/#Conditional-loading-of-code-in-packages-(Extensions)) introduced in Julia v1.9, Flux conditionally loads GPU-specific code once a GPU package is made available (e.g. through `using CUDA`).

NVIDIA GPU support requires the packages `CUDA.jl` and `cuDNN.jl` to be installed in the environment. In the Julia REPL, type `] add CUDA, cuDNN` to install them. For more details see the [CUDA.jl](https://github.com/JuliaGPU/CUDA.jl) readme.

AMD GPU support is available since Julia 1.9 on systems with ROCm and MIOpen installed. For more details refer to the [AMDGPU.jl](https://github.com/JuliaGPU/AMDGPU.jl) repository.

Metal GPU acceleration is available on Apple Silicon hardware. For more details refer to the [Metal.jl](https://github.com/JuliaGPU/Metal.jl) repository. Metal support in Flux is experimental and many features are not yet available.

To trigger GPU support in Flux, you need to call `using CUDA`, `using AMDGPU` or `using Metal`
in your code. Note that for CUDA you do not need to explicitly load `cuDNN` as well, but the package has to be installed in the environment.
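
As a rough sketch (the layer and array here are arbitrary placeholders, not part of the original page), loading a backend package and moving a model to the GPU looks like this:
```julia
using Flux, CUDA  # loading CUDA.jl activates Flux's CUDA support (cuDNN.jl must also be in the environment)

model = Dense(2 => 3) |> gpu       # parameters move to the GPU, if one is available
x = rand(Float32, 2, 5) |> gpu     # move the input data as well
y = model(x)                       # runs on the GPU; `y |> cpu` copies the result back
```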


!!! compat "Flux ≤ 0.13"
Old versions of Flux automatically installed CUDA.jl to provide GPU support. Starting from Flux v0.14, CUDA.jl is not a dependency anymore and has to be installed manually.

## Checking GPU Availability

By default, Flux will run the checks on your system to see if it can support GPU functionality. You can check if Flux identified a valid GPU setup by typing the following:
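
For instance, with the CUDA backend, a minimal check (a sketch; `CUDA.functional()` is provided by CUDA.jl, not Flux) might be:
```julia
using CUDA

CUDA.functional()   # true if a usable NVIDIA GPU was found
```
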
3 changes: 2 additions & 1 deletion docs/src/index.md
@@ -8,7 +8,8 @@ Flux is a library for machine learning. It comes "batteries-included" with many

### Installation

Download [Julia 1.9](https://julialang.org/downloads/) or later, preferably the current stable release. You can add Flux using Julia's package manager, by typing `] add Flux` in the Julia prompt. This will automatically install several other packages, including [CUDA.jl](https://github.com/JuliaGPU/CUDA.jl) for Nvidia GPU support.
Download [Julia 1.9](https://julialang.org/downloads/) or later, preferably the current stable release. You can add Flux using Julia's package manager, by typing `] add Flux` in the Julia prompt.
For Nvidia GPU support, you will also need to install the `CUDA` and `cuDNN` packages. For AMD GPU support, install the `AMDGPU` package. For acceleration on Apple Silicon, install the `Metal` package.
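
Equivalently, using the Pkg API directly (a sketch; add only the backend package you actually need):
```julia
using Pkg
Pkg.add("Flux")               # same as `] add Flux` at the REPL prompt
Pkg.add(["CUDA", "cuDNN"])    # only needed for Nvidia GPU support
# Pkg.add("AMDGPU")           # only needed for AMD GPU support
# Pkg.add("Metal")            # only needed for Apple Silicon acceleration
```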

### Learning Flux

4 changes: 2 additions & 2 deletions docs/src/models/advanced.md
@@ -69,9 +69,9 @@ However, doing this requires the `struct` to have a corresponding constructor th

When we do not want to include all the model parameters (e.g. for transfer learning), we can simply omit those layers from our call to `params`.

!!! compat "Flux ≤ 0.13"
!!! compat "Flux ≤ 0.14"
The mechanism described here is for Flux's old "implicit" training style.
When upgrading for Flux 0.14, it should be replaced by [`freeze!`](@ref Flux.freeze!) and `thaw!`.
When upgrading for Flux 0.15, it should be replaced by [`freeze!`](@ref Flux.freeze!) and `thaw!`.

Consider a simple multi-layer perceptron model where we want to avoid optimising the first two `Dense` layers. We can obtain
this using the slicing features `Chain` provides:
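
The concrete snippet is collapsed here, but the idea (a sketch in the old implicit style, with arbitrary layer sizes) is to index into the `Chain` before collecting parameters:
```julia
using Flux

model = Chain(Dense(4 => 8, relu), Dense(8 => 8, relu), Dense(8 => 2))

# Collect parameters from layer 3 onwards only; the first two Dense layers stay fixed.
ps = Flux.params(model[3:end])
```
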
2 changes: 1 addition & 1 deletion docs/src/models/layers.md
@@ -29,7 +29,7 @@ Perhaps `Scale` isn't quite fully connected, but it may be thought of as `Dense(

!!! compat "Flux ≤ 0.12"
Old versions of Flux accepted only `Dense(in, out, act)` and not `Dense(in => out, act)`.
This notation makes a `Pair` object. If you get an error like `MethodError: no method matching Dense(::Pair{Int64,Int64})`, this means that you should upgrade to Flux 0.13.
This notation makes a `Pair` object. If you get an error like `MethodError: no method matching Dense(::Pair{Int64,Int64})`, this means that you should upgrade to a newer version of Flux.
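
For instance (a small illustration, not part of the original page):
```julia
using Flux

Dense(2 => 3, relu)   # current notation: `2 => 3` is a `Pair`
Dense(2, 3, relu)     # older notation, still accepted but softly deprecated
```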


## Convolution Models
8 changes: 4 additions & 4 deletions docs/src/models/quickstart.md
@@ -5,8 +5,8 @@ If you have used neural networks before, then this simple example might be helpf
If you haven't, then you might prefer the [Fitting a Straight Line](overview.md) page.

```julia
# With Julia 1.7+, this will prompt if neccessary to install everything, including CUDA:
using Flux, Statistics, ProgressMeter
# This will prompt if necessary to install everything, including CUDA:
using Flux, CUDA, Statistics, ProgressMeter

# Generate some data for the XOR problem: vectors of length 2, as columns of a matrix:
noisy = rand(Float32, 2, 1000) # 2×1000 Matrix{Float32}
@@ -102,7 +102,7 @@ for epoch in 1:1_000
end
```

!!! compat "Implicit-style training, Flux ≤ 0.13"
!!! compat "Implicit-style training, Flux ≤ 0.14"
Until recently Flux's training worked a bit differently.
Any code which looks like
```
@@ -113,5 +113,5 @@
train!((x,y) -> loss(model, x, y), Flux.params(model), loader, opt)
```
(with `Flux.params`) is in the old "implicit" style.
This still works on Flux 0.13, but will be removed from Flux 0.14.
This still works on Flux 0.14, but will be removed from Flux 0.15.
See the [training section](@ref man-training) for more details.
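
Its explicit replacement, sketched here with an illustrative loss (`model` and `loader` as in the example above), looks like:
```julia
opt_state = Flux.setup(Flux.Adam(0.01), model)
Flux.train!(model, loader, opt_state) do m, x, y
    Flux.logitcrossentropy(m(x), y)   # the loss now receives the model explicitly
end
```
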
6 changes: 3 additions & 3 deletions docs/src/training/reference.md
@@ -10,7 +10,7 @@ Because of this:
* Flux defines its own version of `setup` which checks this assumption.
(Using instead `Optimisers.setup` will also work, they return the same thing.)

The new implementation of rules such as Adam in the Optimisers is quite different from the old one in `Flux.Optimise`. In Flux 0.13, `Flux.Adam()` returns the old one, with supertype `Flux.Optimise.AbstractOptimiser`, but `setup` will silently translate it to its new counterpart.
The new implementation of rules such as Adam in the Optimisers is quite different from the old one in `Flux.Optimise`. In Flux 0.14, `Flux.Adam()` returns the old one, with supertype `Flux.Optimise.AbstractOptimiser`, but `setup` will silently translate it to its new counterpart.
The available rules are listed on the [optimisation rules](@ref man-optimisers) page;
see the [Optimisers documentation](https://fluxml.ai/Optimisers.jl/dev/) for details on how the new rules work.
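
A small sketch of what this looks like in practice (the model and learning rate are arbitrary):
```julia
using Flux

model = Chain(Dense(2 => 1))
opt_state = Flux.setup(Flux.Adam(0.001), model)   # the old-style `Flux.Adam` is translated to its Optimisers.jl counterpart
```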

@@ -37,11 +37,11 @@ Optimisers.freeze!
Optimisers.thaw!
```

## Implicit style (Flux ≤ 0.13)
## Implicit style (Flux ≤ 0.14)

Flux used to handle gradients, training, and optimisation rules quite differently.
The new style described above is called "explicit" by Zygote, and the old style "implicit".
Flux 0.13 is the transitional version which supports both; Flux 0.14 will remove the old.
Flux 0.13 and 0.14 are the transitional versions which support both; Flux 0.15 will remove the old.

!!! compat "How to upgrade"
The blue-green boxes in the [training section](@ref man-training) describe
14 changes: 7 additions & 7 deletions docs/src/training/training.md
@@ -65,14 +65,14 @@ It is also important that every `update!` step receives a newly gradient compute
as this will change whenever the model's parameters are changed, and for each new data point.

!!! compat "Implicit gradients"
Flux ≤ 0.13 used Zygote's "implicit" mode, in which `gradient` takes a zero-argument function.
Flux ≤ 0.14 used Zygote's "implicit" mode, in which `gradient` takes a zero-argument function.
It looks like this:
```
pars = Flux.params(model)
grad = gradient(() -> loss(model(input), label), pars)
```
Here `pars::Params` and `grad::Grads` are two dictionary-like structures.
Support for this will be removed from Flux 0.14, and these blue (teal?) boxes
Support for this will be removed from Flux 0.15, and these blue (teal?) boxes
explain what needs to change.
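
The explicit counterpart, for comparison (a sketch assuming `loss`, `model`, `input` and `label` as above), takes the model itself as the differentiated argument:
```julia
grad = Flux.gradient(m -> loss(m(input), label), model)[1]   # a NamedTuple mirroring the model's structure
```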

## Loss Functions
@@ -90,7 +90,7 @@ like [`mse`](@ref Flux.Losses.mse) for mean-squared error or [`crossentropy`](@r
are available from the [`Flux.Losses`](../models/losses.md) module.

!!! compat "Implicit-style loss functions"
Flux ≤ 0.13 needed a loss function which closed over a reference to the model,
Flux ≤ 0.14 needed a loss function which closed over a reference to the model,
instead of being a pure function. Thus in old code you may see something like
```
loss(x, y) = sum((model(x) .- y).^2)
@@ -211,7 +211,7 @@ Or explicitly writing the anonymous function which this `do` block creates,
!!! compat "Implicit-style `train!`"
This is a new method of `train!`, which takes the result of `setup` as its 4th argument.
The 1st argument is a function which accepts the model itself.
Flux versions ≤ 0.13 provided a method of `train!` for "implicit" parameters,
Flux versions ≤ 0.14 provided a method of `train!` for "implicit" parameters,
which works like this:
```
train!((x,y) -> loss(model(x), y), Flux.params(model), train_set, Adam())
@@ -342,7 +342,7 @@ for epoch in 1:1000
end
```

!!! compat "Flux ≤ 0.13"
!!! compat "Flux ≤ 0.14"
With the old "implicit" optimiser, `opt = Adam(0.1)`, the equivalent was to
directly mutate the `Adam` struct, `opt.eta = 0.001`.

@@ -374,7 +374,7 @@ train!(loss, bimodel, data, opt_state)
Flux.thaw!(opt_state)
```

!!! compat "Flux ≤ 0.13"
!!! compat "Flux ≤ 0.14"
The earlier "implicit" equivalent was to pass to `gradient` an object referencing only
part of the model, such as `Flux.params(bimodel.layers.enc)`.
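
A sketch of the explicit pattern (assuming `bimodel` is a `Chain` with named `enc` and `dec` layers, matching the `Flux.params(bimodel.layers.enc)` path used above):
```julia
Flux.freeze!(opt_state.layers.enc)       # exclude the encoder's parameters from updates
train!(loss, bimodel, data, opt_state)   # only the remaining parameters are trained
Flux.thaw!(opt_state)                    # re-enable training of every parameter
```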

@@ -383,7 +383,7 @@

Flux used to handle gradients, training, and optimisation rules quite differently.
The new style described above is called "explicit" by Zygote, and the old style "implicit".
Flux 0.13 is the transitional version which supports both.
Flux 0.13 and 0.14 are the transitional versions which support both.

The blue-green boxes above describe the changes.
For more details on training in the implicit style, see [Flux 0.13.6 documentation](https://fluxml.ai/Flux.jl/v0.13.6/training/training/).
4 changes: 2 additions & 2 deletions docs/src/training/zygote.md
@@ -18,10 +18,10 @@ Zygote.hessian_reverse
Zygote.diaghessian
```

## Implicit style (Flux ≤ 0.13)
## Implicit style (Flux ≤ 0.14)

Flux used to use what Zygote calls "implicit" gradients, [described here](https://fluxml.ai/Zygote.jl/dev/#Explicit-and-Implicit-Parameters-1) in its documentation.
However, support for this will be removed from Flux 0.14.
However, support for this will be removed from Flux 0.15.

!!! compat "Training"
The blue-green boxes in the [training section](@ref man-training) describe
1 change: 0 additions & 1 deletion docs/src/utilities.md
@@ -49,7 +49,6 @@ These functions call:

```@docs
Flux.rng_from_array
Flux.default_rng_value
Flux.nfan
```

2 changes: 1 addition & 1 deletion src/Flux.jl
@@ -10,7 +10,7 @@ using MacroTools: @forward
using MLUtils
import Optimisers: Optimisers, trainable, destructure # before v0.13, Flux owned these functions
using Optimisers: freeze!, thaw!, adjust!

using Random: default_rng
using Zygote, ChainRulesCore
using Zygote: Params, @adjoint, gradient, pullback
using Zygote.ForwardDiff: value
28 changes: 8 additions & 20 deletions src/deprecations.jl
@@ -1,19 +1,3 @@
# v0.12 deprecations

function ones(dims...)
Base.depwarn("Flux.ones(size...) is deprecated, please use Flux.ones32(size...) or Base.ones(Float32, size...)", :ones, force=true)
Base.ones(Float32, dims...)
end
ones(T::Type, dims...) = Base.ones(T, dims...)

function zeros(dims...)
Base.depwarn("Flux.zeros(size...) is deprecated, please use Flux.zeros32(size...) or Base.zeros(Float32, size...)", :zeros, force=true)
Base.zeros(Float32, dims...)
end
zeros(T::Type, dims...) = Base.zeros(T, dims...)

ones32(::Type, dims...) = throw(ArgumentError("Flux.ones32 is always Float32, use Base.ones to specify the element type"))
zeros32(::Type, dims...) = throw(ArgumentError("Flux.zeros32 is always Float32, use Base.zeros to specify the element type"))

# v0.13 deprecations

@@ -59,7 +43,7 @@ function loadparams!(m, xs)
end

# Channel notation: Changed to match Conv, but very softly deprecated!
# Perhaps change to @deprecate for v0.14, but there is no plan to remove these.
# Perhaps change to @deprecate for v0.15, but there is no plan to remove these.
Dense(in::Integer, out::Integer, σ = identity; kw...) =
Dense(in => out, σ; kw...)
Bilinear(in1::Integer, in2::Integer, out::Integer, σ = identity; kw...) =
@@ -86,7 +70,7 @@ Base.@deprecate_binding Data Flux false "Sub-module Flux.Data has been removed.

@deprecate paramtype(T,m) _paramtype(T,m) false # internal method, renamed to make this clear

@deprecate rng_from_array() default_rng_value()
@deprecate rng_from_array() Random.default_rng()

function istraining()
Base.depwarn("Flux.istraining() is deprecated, use NNlib.within_gradient(x) instead", :istraining)
@@ -216,13 +200,17 @@ ChainRulesCore.@non_differentiable _greek_ascii_depwarn(::Any...)


# v0.14 deprecations
@deprecate default_rng_value() Random.default_rng()


# v0.15 deprecations

# Enable these when 0.14 is released, and delete const ClipGrad = Optimise.ClipValue etc:
# Enable these when 0.15 is released, and delete const ClipGrad = Optimise.ClipValue etc:
# Base.@deprecate_binding Optimiser OptimiserChain
# Base.@deprecate_binding ClipValue ClipGrad

# train!(loss::Function, ps::Zygote.Params, data, opt) = throw(ArgumentError(
# """On Flux 0.14, `train!` no longer accepts implicit `Zygote.Params`.
# """On Flux 0.15, `train!` no longer accepts implicit `Zygote.Params`.
# Instead of `train!(loss_xy, Flux.params(model), data, Adam())`
# it now needs `opt = Flux.setup(Adam(), model); train!(loss_mxy, model, data, opt)`
# where `loss_mxy` accepts the model as its first argument.
10 changes: 5 additions & 5 deletions src/layers/normalise.jl
@@ -71,9 +71,9 @@ mutable struct Dropout{F<:Real,D,R<:AbstractRNG}
active::Union{Bool, Nothing}
rng::R
end
Dropout(p::Real, dims, active) = Dropout(p, dims, active, default_rng_value())
Dropout(p::Real, dims, active) = Dropout(p, dims, active, default_rng())

function Dropout(p::Real; dims=:, active::Union{Bool,Nothing} = nothing, rng = default_rng_value())
function Dropout(p::Real; dims=:, active::Union{Bool,Nothing} = nothing, rng = default_rng())
0 ≤ p ≤ 1 || throw(ArgumentError("Dropout expects 0 ≤ p ≤ 1, got p = $p"))
Dropout(p, dims, active, rng)
end
@@ -125,8 +125,8 @@ mutable struct AlphaDropout{F,R<:AbstractRNG}
rng::R
end

AlphaDropout(p, active) = AlphaDropout(p, active, default_rng_value())
function AlphaDropout(p; rng = default_rng_value(), active::Union{Bool,Nothing} = nothing)
AlphaDropout(p, active) = AlphaDropout(p, active, default_rng())
function AlphaDropout(p; rng = default_rng(), active::Union{Bool,Nothing} = nothing)
0 ≤ p ≤ 1 || throw(ArgumentError("AlphaDropout expects 0 ≤ p ≤ 1, got p = $p"))
AlphaDropout(p, active, rng)
end
@@ -520,7 +520,7 @@ function GroupNorm(chs::Int, G::Int, λ=identity;
eps::Real=1f-5, momentum::Real=0.1f0, ϵ=nothing)

if track_stats
Base.depwarn("`track_stats=true` will be removed from GroupNorm in Flux 0.14. The default value is `track_stats=false`, which will work as before.", :GroupNorm)
Base.depwarn("`track_stats=true` will be removed from GroupNorm in Flux 0.15. The default value is `track_stats=false`, which will work as before.", :GroupNorm)
end
ε = _greek_ascii_depwarn(ϵ => eps, :GroupNorm, "ϵ" => "eps")

4 changes: 2 additions & 2 deletions src/optimise/optimisers.jl
@@ -566,7 +566,7 @@ that will be fed into the next, and this is finally applied to the parameter as
usual.

!!! note
This will be replaced by `Optimisers.OptimiserChain` in Flux 0.14.
This will be replaced by `Optimisers.OptimiserChain` in Flux 0.15.
"""
mutable struct Optimiser <: AbstractOptimiser
os::Vector{Any}
@@ -704,7 +704,7 @@ end
Clip gradients when their absolute value exceeds `thresh`.

!!! note
This will be replaced by `Optimisers.ClipGrad` in Flux 0.14.
This will be replaced by `Optimisers.ClipGrad` in Flux 0.15.
"""
mutable struct ClipValue{T} <: AbstractOptimiser
thresh::T
16 changes: 8 additions & 8 deletions src/optimise/train.jl
@@ -16,7 +16,7 @@ As a result, the parameters are mutated and the optimiser's internal state may c
The gradient could be mutated as well.

!!! compat "Deprecated"
This method for implicit `Params` (and `AbstractOptimiser`) will be removed from Flux 0.14.
This method for implicit `Params` (and `AbstractOptimiser`) will be removed from Flux 0.15.
The explicit method `update!(opt, model, grad)` from Optimisers.jl will remain.
"""
function update!(opt::AbstractOptimiser, x::AbstractArray, x̄)
@@ -46,7 +46,7 @@ Call `Flux.skip()` in a callback to indicate when a callback condition is met.
This will trigger the train loop to skip the current data point and not update with the calculated gradient.

!!! note
`Flux.skip()` will be removed from Flux 0.14
`Flux.skip()` will be removed from Flux 0.15

# Examples
```julia
@@ -56,7 +56,7 @@
```
"""
function skip()
Base.depwarn("""Flux.skip() will be removed from Flux 0.14.
Base.depwarn("""Flux.skip() will be removed from Flux 0.15.
and should be replaced with `continue` in an ordinary `for` loop.""", :skip)
throw(SkipException())
end
@@ -71,7 +71,7 @@ Call `Flux.stop()` in a callback to indicate when a callback condition is met.
This will trigger the train loop to stop and exit.

!!! note
`Flux.stop()` will be removed from Flux 0.14. It should be replaced with `break` in an ordinary `for` loop.
`Flux.stop()` will be removed from Flux 0.15. It should be replaced with `break` in an ordinary `for` loop.

# Examples
```julia
@@ -81,7 +81,7 @@
```
"""
function stop()
Base.depwarn("""Flux.stop() will be removed from Flux 0.14.
Base.depwarn("""Flux.stop() will be removed from Flux 0.15.
It should be replaced with `break` in an ordinary `for` loop.""", :stop)
throw(StopException())
end
@@ -96,7 +96,7 @@ Uses a `loss` function and training `data` to improve the
model's parameters according to a particular optimisation rule `opt`.

!!! compat "Deprecated"
This method with implicit `Params` will be removed from Flux 0.14.
This method with implicit `Params` will be removed from Flux 0.15.
It should be replaced with the explicit method `train!(loss, model, data, opt)`.

For each `d in data`, first the gradient of the `loss` is computed like this:
@@ -167,7 +167,7 @@ Run `body` `N` times. Mainly useful for quickly doing multiple epochs of
training in a REPL.

!!! note
The macro `@epochs` will be removed from Flux 0.14. Please just write an ordinary `for` loop.
The macro `@epochs` will be removed from Flux 0.15. Please just write an ordinary `for` loop.

# Examples
```julia
Expand All @@ -179,7 +179,7 @@ hello
```
"""
macro epochs(n, ex)
Base.depwarn("""The macro `@epochs` will be removed from Flux 0.14.
Base.depwarn("""The macro `@epochs` will be removed from Flux 0.15.
As an alternative, you can write a simple `for i in 1:epochs` loop.""", Symbol("@epochs"), force=true)
:(@progress for i = 1:$(esc(n))
@info "Epoch $i"