blue-yonder · nils-braun · Jul 17, 2024 · Jul 7, 2024
diff --git a/notebooks/02 sklearn Pipeline.ipynb b/notebooks/02 sklearn Pipeline.ipynb
@@ -116,9 +116,9 @@
     "    \n",
     "Here comes the tricky part!\n",
     "    \n",
-    "The input to the pipeline will be our dataframe `X`, which one row per identifier.\n",
+    "The input to the pipeline will be our dataframe `X`, with one row per identifier.\n",
     "It is currently empty.\n",
-    "But which time series data should the `RelevantFeatureAugmenter` to actually extract the features from?\n",
+    "But which time series data should the `RelevantFeatureAugmenter` use to actually extract the features from?\n",
     "\n",
     "We need to pass the time series data (stored in `df_ts`) to the transformer.\n",
     "    \n",
@@ -179,7 +179,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "During interference, the augmentor does only extract the relevant features it has found out in the training phase and the classifier predicts the target using these features."
+    "During inference, the augmenter only extracts the features that it found to be relevant during the training phase. The classifier predicts the target using these features."
    ]
   },
   {
@@ -211,7 +211,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "You can also find out, which columns the augmenter has selected"
+    "You can also find out which columns the augmenter has selected"
    ]
   },
   {
@@ -248,11 +248,11 @@
    "metadata": {},
    "source": [
     "In the example above we passed in a single `df_ts` into the `RelevantFeatureAugmenter`, which was used both for training and predicting.\n",
-    "During training, only the data with the `id`s from `X_train` where extracted and during prediction the rest.\n",
+    "During training, only the data with the `id`s from `X_train` were extracted. The rest of the data were extracted during prediction.\n",
     "\n",
     "However, it is perfectly fine to call `set_params` twice: once before training and once before prediction. \n",
     "This can be handy if you for example dump the trained pipeline to disk and re-use it only later for prediction.\n",
-    "You only need to make sure that the `id`s of the enteties you use during training/prediction are actually present in the passed time series data."
+    "You only need to make sure that the `id`s of the entities you use during training/prediction are actually present in the passed time series data."
    ]
   },
   {