diff --git a/doc/under_sampling.rst b/doc/under_sampling.rst index 9f2795430..38b87540d 100644 --- a/doc/under_sampling.rst +++ b/doc/under_sampling.rst @@ -125,9 +125,20 @@ It would also work with pandas dataframe:: >>> df_resampled, y_resampled = rus.fit_resample(df_adult, y_adult) >>> df_resampled.head() # doctest: +SKIP -:class:`NearMiss` adds some heuristic rules to select samples -:cite:`mani2003knn`. :class:`NearMiss` implements 3 different types of -heuristic which can be selected with the parameter ``version``:: +NearMiss +^^^^^^^^ + +:class:`NearMiss` is another controlled under-sampling technique. It aims to balance +the class distribution by eliminating samples from the targeted classes. But these +samples are not removed at random. Instead, :class:`NearMiss` removes the instances of +the target class(es) that are farthest from the minority class, thereby reducing the +"space", or separation, between the target class and the minority class. In other +words, :class:`NearMiss` retains the observations from the target class that are +closest to the boundary they form with the minority class samples. + +To find out which samples are closest to the boundary with the minority class, +:class:`NearMiss` uses the K-Nearest Neighbours algorithm. :class:`NearMiss` implements +3 different heuristics, which can be selected with the parameter ``version`` and which +we explain in the coming paragraphs. We can perform this under-sampling as follows:: >>> from imblearn.under_sampling import NearMiss >>> nm1 = NearMiss(version=1) @@ -135,65 +146,75 @@ heuristic which can be selected with the parameter ``version``:: >>> print(sorted(Counter(y_resampled).items())) [(0, 64), (1, 64), (2, 64)] -As later stated in the next section, :class:`NearMiss` heuristic rules are -based on nearest neighbors algorithm. Therefore, the parameters ``n_neighbors`` -and ``n_neighbors_ver3`` accept classifier derived from ``KNeighborsMixin`` -from scikit-learn. 
The former parameter is used to compute the average distance -to the neighbors while the latter is used for the pre-selection of the samples -of interest. Mathematical formulation -^^^^^^^^^^^^^^^^^^^^^^^^ +~~~~~~~~~~~~~~~~~~~~~~~~ + +:class:`NearMiss` uses the K-Nearest Neighbours algorithm to identify the samples of the +target class(es) that are closest to the minority class, as well as the distance that +separates them. -Let *positive samples* be the samples belonging to the targeted class to be -under-sampled. *Negative sample* refers to the samples from the minority class -(i.e., the most under-represented class). +Let *positive samples* be the samples from the class to be under-sampled, and +*negative samples* the samples from the minority class (i.e., the most +under-represented class). -NearMiss-1 selects the positive samples for which the average distance -to the :math:`N` closest samples of the negative class is the smallest. +**NearMiss-1** selects the positive samples whose average distance to the :math:`K` +closest samples of the negative class is the smallest (:math:`K` is the number of +neighbours in the K-Nearest Neighbours algorithm). The following image illustrates the +logic: .. image:: ./auto_examples/under-sampling/images/sphx_glr_plot_illustration_nearmiss_001.png :target: ./auto_examples/under-sampling/plot_illustration_nearmiss.html :scale: 60 :align: center -NearMiss-2 selects the positive samples for which the average distance to the -:math:`N` farthest samples of the negative class is the smallest. +**NearMiss-2** selects the positive samples whose average distance to the +:math:`K` farthest samples of the negative class is the smallest. The following image +illustrates the logic: .. image:: ./auto_examples/under-sampling/images/sphx_glr_plot_illustration_nearmiss_002.png :target: ./auto_examples/under-sampling/plot_illustration_nearmiss.html :scale: 60 :align: center -NearMiss-3 is a 2-steps algorithm. 
First, for each negative sample, their -:math:`M` nearest-neighbors will be kept. Then, the positive samples selected -are the one for which the average distance to the :math:`N` nearest-neighbors -is the largest. +**NearMiss-3** is a 2-step algorithm: + +First, for each negative sample, that is, for each observation of the minority class, +it selects the :math:`M` nearest-neighbors from the positive class (the target class). +This ensures that all observations from the minority class have at least some +neighbours from the target class. + +Next, it selects the positive samples whose average distance to the :math:`K` +nearest-neighbors of the minority class is the largest. + +The following image illustrates the logic: .. image:: ./auto_examples/under-sampling/images/sphx_glr_plot_illustration_nearmiss_003.png :target: ./auto_examples/under-sampling/plot_illustration_nearmiss.html :scale: 60 :align: center -In the next example, the different :class:`NearMiss` variant are applied on the -previous toy example. It can be seen that the decision functions obtained in -each case are different. - -When under-sampling a specific class, NearMiss-1 can be altered by the presence -of noise. In fact, it will implied that samples of the targeted class will be -selected around these samples as it is the case in the illustration below for -the yellow class. However, in the normal case, samples next to the boundaries -will be selected. NearMiss-2 will not have this effect since it does not focus -on the nearest samples but rather on the farthest samples. We can imagine that -the presence of noise can also altered the sampling mainly in the presence of -marginal outliers. NearMiss-3 is probably the version which will be less -affected by noise due to the first step sample selection. +In the following example, we apply the different :class:`NearMiss` variants to a toy +dataset. Note how the decision functions obtained in each case are different (left +plots): .. 
image:: ./auto_examples/under-sampling/images/sphx_glr_plot_comparison_under_sampling_003.png :target: ./auto_examples/under-sampling/plot_comparison_under_sampling.html :scale: 60 :align: center +NearMiss-1 is sensitive to noise. In fact, the observations from the target class that +are closest to the minority class samples could well be noise. NearMiss-1 will +nevertheless select those observations, as shown in the first row of the previous +illustration (check the yellow class). + +NearMiss-2 will be less sensitive to noise, since it does not consider the nearest +samples of the target class, but rather the farthest ones. + +NearMiss-3 is probably the version least sensitive to noise, due to the first sample +selection step. + + Cleaning under-sampling techniques ---------------------------------- diff --git a/imblearn/under_sampling/_prototype_selection/_nearmiss.py b/imblearn/under_sampling/_prototype_selection/_nearmiss.py index 70f647fa5..7073a8cf2 100644 --- a/imblearn/under_sampling/_prototype_selection/_nearmiss.py +++ b/imblearn/under_sampling/_prototype_selection/_nearmiss.py @@ -35,20 +35,17 @@ class NearMiss(BaseUnderSampler): n_neighbors : int or estimator object, default=3 If ``int``, size of the neighbourhood to consider to compute the - average distance to the minority point samples. If object, an + average distance to the minority samples. If object, an estimator that inherits from :class:`~sklearn.neighbors.base.KNeighborsMixin` that will be used to find the k_neighbors. - By default, it will be a 3-NN. n_neighbors_ver3 : int or estimator object, default=3 - If ``int``, NearMiss-3 algorithm start by a phase of re-sampling. This - parameter correspond to the number of neighbours selected create the - subset in which the selection will be performed. If object, an - estimator that inherits from + Only used if `version=3`. 
If ``int``, the number of target class samples + closest to each minority sample that will be retained in the first subsampling + step. If object, an estimator that inherits from :class:`~sklearn.neighbors.base.KNeighborsMixin` that will be used to find the k_neighbors. - By default, it will be a 3-NN. {n_jobs} @@ -56,7 +53,7 @@ class NearMiss(BaseUnderSampler): ---------- sampling_strategy_ : dict Dictionary containing the information to sample the dataset. The keys - corresponds to the class labels from which to sample and the values + correspond to the class labels from which to sample and the values are the number of samples to sample. nn_ : estimator object @@ -144,7 +141,7 @@ def __init__( def _selection_dist_based( self, X, y, dist_vec, num_samples, key, sel_strategy="nearest" ): - """Select the appropriate samples depending on the selected strategy. + Parameters ----------
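The selection this helper performs can be sketched in plain NumPy. The following is a simplified illustration, not the library's actual implementation; the function name and array shapes are hypothetical:

```python
import numpy as np

def select_by_avg_distance(dist_vec, num_samples, strategy="nearest"):
    # Simplified sketch of NearMiss' distance-based selection.
    # dist_vec: (n_positive, k) distances from each positive (target class)
    # sample to the k negative (minority class) neighbours considered.
    # Returns the indices of the num_samples positive samples to keep.
    avg_dist = dist_vec.mean(axis=1)
    order = np.argsort(avg_dist)      # ascending average distance
    if strategy == "nearest":
        return order[:num_samples]    # keep the smallest average distances
    if strategy == "farthest":
        return order[-num_samples:]   # keep the largest average distances
    raise ValueError(f"unknown strategy: {strategy}")
```

NearMiss-1 and NearMiss-2 keep the positives with the smallest average distance (``"nearest"``), while the second step of NearMiss-3 keeps those with the largest (``"farthest"``).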