-
The ball k-means algorithm is described in detail in https://ieeexplore.ieee.org/document/9139397.
-
The C++ implementation of the ball k-means algorithm can be found in the "C++Version" folder.
-
The Python implementation of the ball k-means algorithm can be found in the "PythonVersion" folder.
-
All data used in the paper is in the compressed file "data+centers(1).zip".
-
The implementations of the ball k-means algorithm are "ball_k_means_Xf.cpp"/"ball_k_means_Xf.py" and "ball_k_means_Xd.cpp"/"ball_k_means_Xd.py", which are the "float" and "double" versions respectively.
-
The parameter "isRing" switches between the ring version and the no-ring version of the algorithm.
-
In our experience, the "Xd" version gives more accurate results but runs slightly slower than "Xf"; the "Xf" version has the fastest running time, but its lower precision may reduce accuracy on data with many decimal places.
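
The precision trade-off between the "float" and "double" versions can be illustrated with a minimal sketch. It does not use the ball k-means code; it only rounds values to 32-bit floats with Python's standard `struct` module. The helper names `to_float32` and `squared_distance` and the sample points are illustrative assumptions.

```python
import struct

def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

# Two distinct points that differ only around the 9th decimal place
# (illustrative values, not taken from the paper's data).
p = [1.000000001, 2.000000001]
q = [1.000000002, 2.000000003]

# "double" arithmetic keeps the two points apart.
d64 = squared_distance(p, q)

# After rounding to "float", both points collapse onto (1.0, 2.0).
d32 = squared_distance([to_float32(v) for v in p],
                       [to_float32(v) for v in q])

print(d64 > 0.0)   # the double-precision distance is non-zero
print(d32 == 0.0)  # the single-precision distance vanishes
```

When coordinates differ only beyond roughly the seventh significant digit, single precision can no longer separate them, which is one way distance comparisons may become less reliable in the "Xf" version.
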
-
C++ compiler supporting C++11
-
Linux operating system or Windows operating system
-
Eigen 3 template library
-
BLAS implementation, we recommend this one: http://www.openblas.net/
-
Intel MKL implementation, we recommend this one: https://software.intel.com/en-us/mkl
- Only the DLL files in the "PythonVersion" folder are required.
-
Eigen 3: To use Eigen, you only need to download and extract Eigen's source code: http://eigen.tuxfamily.org/index.php?title=Main_Page#Download
-
ball_k_means_noRingVersion.cpp and ball_k_means_RingVersion.cpp can both be executed directly; they only need to include the Eigen library.
-
dataset: the data to be clustered, as an Eigen matrix.
-
centroids: the initial center points, as an Eigen matrix.
-
isRing: bool, optional. Switches between the ring version and the no-ring version of the algorithm: "true" selects the ring version and "false" selects the no-ring version. The default is false.
-
detail: bool, optional. "true" outputs detailed information (including the k value, the number of distance calculations, the running time, etc.); "false" outputs no detailed information. The default is false.
- labels: the cluster labels of the data, as an Eigen matrix.
-
isRing: bool, optional. Switches between the ring version and the no-ring version of the algorithm: "true" selects the ring version and "false" selects the no-ring version. The default is false.
-
detail: bool, optional. "true" outputs detailed information (including the k value, the number of distance calculations, the running time, etc.); "false" outputs no detailed information. The default is false.
-
dataset: absolute path of the CSV file containing the data to be clustered.
-
centroids: absolute path of the CSV file containing the initial center points.
- labels: the cluster labels of the data, as a numpy matrix.
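
Because the Python version takes absolute CSV paths for "dataset" and "centroids", the input files might be prepared as in the sketch below. The CSV layout (comma-separated, no header, one point per row), the file names, and the helper `write_csv` are assumptions for illustration; check them against the files in "data+centers(1).zip". The sketch stops before calling the ball k-means function itself.

```python
import csv
import os
import tempfile

def write_csv(rows, path):
    """Write rows of numbers to a headerless CSV file; return its absolute path."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return os.path.abspath(path)

# Toy data: four 2-D points and two initial center points (illustrative values).
points = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
initial_centroids = [[0.0, 0.0], [5.0, 5.0]]

# Hypothetical file names inside a temporary directory.
out_dir = tempfile.mkdtemp()
dataset_path = write_csv(points, os.path.join(out_dir, "dataset.csv"))
centroids_path = write_csv(initial_centroids, os.path.join(out_dir, "centroids.csv"))

# These absolute paths are the kind of value the "dataset" and
# "centroids" parameters expect.
print(dataset_path)
print(centroids_path)
```
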
- Please contact Yong Zheng at [email protected]