Fix some typos in docs #2418

Merged (1 commit, Mar 31, 2024)
docs/src/models/recurrence.md: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ To introduce Flux's recurrence functionalities, we will consider the following v…

In the above, we have a sequence of length 3, where `x1` to `x3` represent the input at each step (could be a timestamp or a word in a sentence), and `y1` to `y3` are their respective outputs.

- An aspect to recognize is that in such a model, the recurrent cells `A` all refer to the same structure. What distinguishes it from a simple dense layer is that the cell `A` is fed, in addition to an input `x`, with information from the previous state of the model (hidden state denoted as `h1` & `h2` in the diagram).
+ An aspect to recognise is that in such a model, the recurrent cells `A` all refer to the same structure. What distinguishes it from a simple dense layer is that the cell `A` is fed, in addition to an input `x`, with information from the previous state of the model (hidden state denoted as `h1` & `h2` in the diagram).
Member:

Sorry, didn't get around to this in time. In case this comes up again, most of the Flux core team uses the American spelling. Not worth a whole change now, but I wanted to have that known if anyone sees this change and wants to make a similar one.

Contributor Author:

I had the impression that mostly British spelling is used (e.g. it is called Optimiser.jl and not Optimizer.jl).

Member:

The original author was from the UK. Unfortunately we can't change package names so easily, but everything else uses the non-British spelling.


In the most basic RNN case, cell A could be defined by the following:

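The code itself is collapsed in this diff; as a rough sketch (sizes and initialisation chosen arbitrarily for illustration, not Flux's actual implementation), such a cell might look like:

```julia
# Hidden state `h` is carried between steps; `x` is the input at the current step.
Wxh = randn(Float32, 5, 2)   # input-to-hidden weights (arbitrary sizes)
Whh = randn(Float32, 5, 5)   # hidden-to-hidden weights
b   = zeros(Float32, 5)

function rnn_cell(h, x)
    h = tanh.(Wxh * x .+ Whh * h .+ b)
    return h, h   # the new state is also the output at this step
end

h0 = zeros(Float32, 5)       # initial hidden state
x1 = rand(Float32, 2)        # input at the first step
h1, y1 = rnn_cell(h0, x1)
```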
docs/src/performance.md: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
- # [Performance Tips]((@id man-performance-tips))
+ # [Performance Tips](@id man-performance-tips)

All the usual [Julia performance tips apply](https://docs.julialang.org/en/v1/manual/performance-tips/).
As always, [profiling your code](https://docs.julialang.org/en/v1/manual/profile/#Profiling-1) is generally a useful way of finding bottlenecks.
@@ -44,7 +44,7 @@ While one could change the activation function (e.g. to use `0.01f0*x`), the idi…
```julia
leaky_tanh(x) = oftype(x/1, 0.01)*x + tanh(x)
```
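A quick check (added here, not part of the original page) that this stays type-stable for `Float32` inputs:

```julia
julia> leaky_tanh(0.5f0) isa Float32   # `oftype(x/1, 0.01)` converts the constant to match x
true
```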

- ## Evaluate batches as Matrices of features
+ ## Evaluate batches as matrices of features

While it can sometimes be tempting to process your observations (feature vectors) one at a time, e.g.
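The examples themselves are collapsed in this diff; the contrast is roughly the following sketch (hypothetical `Dense` model, made-up sizes):

```julia
using Flux

model = Dense(10 => 2)                     # hypothetical model
xs = [rand(Float32, 10) for _ in 1:100]    # 100 feature vectors

# One at a time: 100 small matrix-vector products.
ys = [model(x) for x in xs]

# Batched: one 10×100 matrix, a single matrix-matrix product,
# which is far friendlier to BLAS and to GPUs.
X = reduce(hcat, xs)
Y = model(X)                               # 2×100 output
```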
docs/src/saving.md: 1 addition & 1 deletion
@@ -106,7 +106,7 @@ jldsave("checkpoint_epoch=42.jld2"; model_state, opt_state)
Models are just normal Julia structs, so it's fine to use any Julia storage
format to save the struct as it is instead of saving the state returned by [`Flux.state`](@ref).
[BSON.jl](https://github.com/JuliaIO/BSON.jl) is particularly convenient for this,
- since it can also save anynomous functions, which are sometimes part of a model definition.
+ since it can also save anonymous functions, which are sometimes part of a model definition.

Save a model:

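The example itself is collapsed below; a minimal sketch (with a hypothetical two-layer model) would be:

```julia
using Flux, BSON

model = Chain(Dense(10 => 5, relu), Dense(5 => 2))   # hypothetical model
BSON.@save "mymodel.bson" model
```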
docs/src/training/training.md: 5 additions & 5 deletions
@@ -33,7 +33,7 @@ end
```

This loop can also be written using the function [`train!`](@ref Flux.Train.train!),
- but it's helpful to undersand the pieces first:
+ but it's helpful to understand the pieces first:

```julia
train!(model, train_set, opt_state) do m, x, y
  loss(m(x), y)   # assuming a `loss` function as defined earlier on the page
end
```

@@ -43,7 +43,7 @@

## Model Gradients

- First recall from the section on [taking gradients](@ref man-training) that
+ First recall from the section on [taking gradients](@ref man-taking-gradients) that
`Flux.gradient(f, a, b)` always calls `f(a, b)`, and returns a tuple `(∂f_∂a, ∂f_∂b)`.
In the code above, the function `f` passed to `gradient` is an anonymous function with
one argument, created by the `do` block, hence `grads` is a tuple with one element.
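As a concrete illustration (added here, not part of the page): with two arguments the tuple has two elements, and with one argument it has one:

```julia
julia> Flux.gradient((a, b) -> a*b + a^2, 3.0, 4.0)
(10.0, 3.0)

julia> Flux.gradient(a -> a*4.0 + a^2, 3.0)   # one argument, so a 1-tuple
(10.0,)
```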
@@ -275,10 +275,10 @@ end
The term *regularisation* covers a wide variety of techniques aiming to improve the
result of training. This is often done to avoid overfitting.

- Some of these are can be implemented by simply modifying the loss function.
+ Some of these can be implemented by simply modifying the loss function.
*L₂ regularisation* (sometimes called ridge regression) adds to the loss a penalty
proportional to `θ^2` for every scalar parameter.
- For a very simple model could be implemented as follows:
+ A very simple model could be implemented as follows:

```julia
grads = Flux.gradient(densemodel) do m
    result = m(input)                        # `input` and `label` are assumed from earlier
    penalty = sum(abs2, m.weight)/2 + sum(abs2, m.bias)/2
    my_loss(result, label) + 0.42 * penalty  # hypothetical loss function
end
```

@@ -318,7 +318,7 @@
decay_opt_state = Flux.setup(OptimiserChain(WeightDecay(0.42), Adam(0.1)), model)

Flux's optimisers are really modifications applied to the gradient before using it to update
the parameters, and `OptimiserChain` applies two such modifications.
- The first, [`WeightDecay`](@ref Flux.WeightDecay) adds `0.42` times original parameter to the gradient,
+ The first, [`WeightDecay`](@ref Flux.WeightDecay) adds `0.42` times the original parameter to the gradient,
matching the gradient of the penalty above (with the same, unrealistically large, constant).
After that, in either case, [`Adam`](@ref Flux.Adam) computes the final update.

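To make that concrete, here is a sketch (added here; it uses Optimisers.jl directly, with a made-up parameter array) of what `WeightDecay(0.42)` does to an incoming gradient:

```julia
using Optimisers

w = [1.0, 2.0]                         # made-up parameter array
g = [0.1, 0.1]                         # some incoming gradient
state = Optimisers.setup(WeightDecay(0.42), w)
state, w2 = Optimisers.update(state, w, g)
w2 ≈ w .- (g .+ 0.42 .* w)             # true: the decay term is added to the gradient first
```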
docs/src/tutorials/logistic_regression.md: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ julia> x |> summary

The `y` values here correspond to a type of iris plant, with a total of 150 data points. The `x` values depict the sepal length, sepal width, petal length, and petal width (all in `cm`) of 150 iris plants (hence the matrix size `4×150`). Different types of iris plants have different lengths and widths of sepals and petals associated with them, and there is a definitive pattern for this in nature. We can leverage this to train a simple classifier that outputs the type of iris plant using the length and width of sepals and petals as inputs.

- Our next step would be to convert this data into a form that can be fed to a machine learning model. The `x` values are arranged in a matrix and should ideally be converted to `Float32` type (see [Performance tips](@ref id-man-performance-tips)), but the labels must be one hot encoded. [Here](https://discourse.julialang.org/t/all-the-ways-to-do-one-hot-encoding/64807) is a great discourse thread on different techniques that can be used to one hot encode data with or without using any external Julia package.
+ Our next step would be to convert this data into a form that can be fed to a machine learning model. The `x` values are arranged in a matrix and should ideally be converted to `Float32` type (see [Performance tips](@ref man-performance-tips)), but the labels must be one hot encoded. [Here](https://discourse.julialang.org/t/all-the-ways-to-do-one-hot-encoding/64807) is a great discourse thread on different techniques that can be used to one hot encode data with or without using any external Julia package.

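As a quick illustration (a sketch added here, not part of the tutorial) of one approach using Flux's own utilities, assuming the label vector `y` from above:

```julia
using Flux: onehotbatch

y_onehot = onehotbatch(y, unique(y))   # one column per observation, one row per class
```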
```jldoctest logistic_regression
julia> x = Float32.(x);
```