diff --git a/docs/src/training/training.md b/docs/src/training/training.md
index 3070494188..623b4788fc 100644
--- a/docs/src/training/training.md
+++ b/docs/src/training/training.md
@@ -61,8 +61,8 @@ then the derivative of the loss with respect to it is `∂f_∂θ = grads[1].lay
 It is important that the execution of the model takes place inside the call to `gradient`,
 in order for the influence of the model's parameters to be observed by Zygote.
 
-It is also important that every `update!` step receives a newly gradient computed gradient,
-as this will be change whenever the model's parameters are changed, and for each new data point.
+It is also important that every `update!` step receives a newly computed gradient,
+as it will change whenever the model's parameters are changed, and for each new data point.
 
 !!! compat "Implicit gradients"
     Flux ≤ 0.14 used Zygote's "implicit" mode, in which `gradient` takes a zero-argument function.
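
For context, a minimal sketch of the training step the edited passage describes, assuming Flux's explicit-gradient API (`Flux.setup`, `gradient`, `update!`); the tiny `Dense` model and random data are purely illustrative, not from the docs:

```julia
using Flux

# Hypothetical toy model and data, just to make the loop runnable.
model = Dense(2 => 1)
opt_state = Flux.setup(Adam(), model)
x, y = rand(Float32, 2, 10), rand(Float32, 1, 10)

for epoch in 1:100
    # The model must be executed *inside* `gradient`, so that Zygote
    # can observe how the loss depends on the model's parameters.
    grads = Flux.gradient(model) do m
        Flux.mse(m(x), y)
    end
    # Each `update!` step receives a freshly computed gradient: the
    # gradient changes whenever the parameters (or the data) change.
    Flux.update!(opt_state, model, grads[1])
end
```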