gmeans

fun gmeans(data: Array<DoubleArray>, k: Int = 100, maxIter: Int = 100): CentroidClustering<DoubleArray, DoubleArray>

G-Means clustering algorithm, an extension of K-Means that tries to determine the number of clusters automatically using a normality test. G-Means is based on a statistical test of the hypothesis that a subset of the data follows a Gaussian distribution. It runs k-means with increasing k in a hierarchical fashion until the test accepts the hypothesis that the data assigned to each k-means center are Gaussian.

References

G. Hamerly and C. Elkan. Learning the k in k-means. NIPS, 2003.

Parameters

data

the data set.

k

the maximum number of clusters.

maxIter

the maximum number of iterations for k-means.
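
Example

The following is a minimal usage sketch, not taken from the library's documentation. It assumes the Smile Kotlin artifact is on the classpath and that gmeans is importable from the smile.clustering package. The blob helper and the synthetic data are hypothetical and exist only for illustration. Because the members of the returned CentroidClustering may differ between Smile versions, the sketch only prints the result rather than reading specific fields.

```kotlin
import java.util.Random
import smile.clustering.gmeans

// Hypothetical helper (not part of Smile): draws n 2-D points
// around (cx, cy) with unit-variance Gaussian noise.
fun blob(n: Int, cx: Double, cy: Double, rng: Random): Array<DoubleArray> =
    Array(n) { doubleArrayOf(cx + rng.nextGaussian(), cy + rng.nextGaussian()) }

fun main() {
    val rng = Random(42)

    // Two well-separated synthetic Gaussian blobs, 100 points each.
    val data = blob(100, 0.0, 0.0, rng) + blob(100, 10.0, 10.0, rng)

    // k is an upper bound on the number of clusters, not the number requested;
    // maxIter bounds the iterations of each k-means run.
    val model = gmeans(data, k = 10, maxIter = 100)

    // Print the model's default string representation.
    println(model)
}
```

Passing a small k, as here, keeps the search cheap when the data are expected to contain only a few clusters. The default of 100 leaves the algorithm more room to keep splitting.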