Package smile.clustering
Class SpectralClustering
java.lang.Object
smile.clustering.SpectralClustering
Spectral Clustering. Given a set of data points, the similarity matrix may
be defined as a matrix S where Sij represents a measure of the
similarity between points i and j. Spectral clustering techniques use the
spectrum (eigenvalues and eigenvectors) of the similarity matrix to reduce
the dimensionality of the data before clustering. The clustering is then
performed in the dimension-reduced space, in which clusters of non-convex
shape may become tight. There are some intriguing similarities between
spectral clustering methods and kernel PCA, which has been empirically
observed to perform clustering.
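For reference, the normalized formulation from Ng, Jordan, and Weiss (the first entry under References) can be written out as follows. This is a sketch of that paper's algorithm only; whether this class uses the same kernel, diagonal convention, or row normalization is not stated on this page and should be treated as an assumption. The symbols A, D, L, X, and Y are notation introduced for this illustration.

```latex
\begin{aligned}
A_{ij} &= \exp\!\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right) \quad (i \neq j), \qquad A_{ii} = 0, \\
D_{ii} &= \sum_{j} A_{ij}, \qquad L = D^{-1/2} A D^{-1/2}, \\
X &= [\, v_1, \dots, v_d \,] \in \mathbb{R}^{n \times d}
   \quad \text{(eigenvectors of the $d$ largest eigenvalues of $L$)}, \\
Y_{ij} &= \frac{X_{ij}}{\bigl(\sum_{k} X_{ik}^2\bigr)^{1/2}}.
\end{aligned}
```

The rows of Y are then clustered, for example with k-means, and point i is assigned to the cluster of row i.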
References
- A.Y. Ng, M.I. Jordan, and Y. Weiss. On Spectral Clustering: Analysis and an algorithm. NIPS, 2001.
- Marina Meila and Jianbo Shi. Learning segmentation by random walks. NIPS, 2000.
- Deepak Verma and Marina Meila. A Comparison of Spectral Clustering Algorithms. 2003.
- Kai Zhang, Nathan R. Zemke, Ethan J. Armand and Bing Ren. A fast, scalable and versatile tool for analysis of single-cell omics data. 2024.
Nested Class Summary
Nested Classes
- static final record SpectralClustering.Options: Spectral clustering hyperparameters.
Method Summary
- static double[][] embed(double[][] data, int d, double sigma): Returns the embedding for spectral clustering.
- static double[][] embed: Returns the embedding for spectral clustering.
- static double[][] embed(SparseIntArray[] data, int p, int d): Returns the embedding for the nonnegative count data with cosine similarity.
- static CentroidClustering<double[], double[]> fit(double[][] data, SpectralClustering.Options options): Performs spectral clustering on the data.
- static CentroidClustering<double[], double[]> fit(SparseIntArray[] data, int p, Clustering.Options options): Performs spectral clustering on nonnegative count data with cosine similarity.
- static CentroidClustering<double[], double[]> nystrom(double[][] data, SpectralClustering.Options options): Performs spectral clustering with the Nystrom approximation.
Method Details
fit
public static CentroidClustering<double[],double[]> fit(SparseIntArray[] data, int p, Clustering.Options options)
Performs spectral clustering on nonnegative count data with cosine similarity.
Parameters:
data - the nonnegative count matrix.
p - the number of features.
options - the hyperparameters.
Returns:
the model.
-
fit
public static CentroidClustering<double[],double[]> fit(double[][] data, SpectralClustering.Options options)
Performs spectral clustering on the data.
Parameters:
data - the input data, in which each row is an observation.
options - the hyperparameters.
Returns:
the model.
-
nystrom
public static CentroidClustering<double[],double[]> nystrom(double[][] data, SpectralClustering.Options options)
Performs spectral clustering with the Nystrom approximation.
Parameters:
data - the input data, in which each row is an observation.
options - the hyperparameters.
Returns:
the model.
-
embed
Returns the embedding for spectral clustering.
Parameters:
W - the adjacency matrix of the graph, which will be modified.
d - the dimension of the feature space.
Returns:
the embedding.
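To illustrate why an adjacency matrix passed to an embedding routine might be modified, here is a minimal, standalone sketch that normalizes a dense adjacency matrix in place and approximates its leading eigenvector by power iteration. `LaplacianSketch` and its methods are hypothetical names introduced for this example; they are not part of smile, and the library's actual normalization and eigen solver are not described on this page. A real implementation would compute the d leading eigenvectors with a proper eigen solver rather than one vector by power iteration.

```java
// Illustrative sketch only; LaplacianSketch is a hypothetical helper, not part of smile.
final class LaplacianSketch {
    /** Overwrites w with D^{-1/2} W D^{-1/2}, where D holds the row sums of w. */
    static void normalizeInPlace(double[][] w) {
        int n = w.length;
        double[] scale = new double[n];
        for (int i = 0; i < n; i++) {
            double degree = 0.0;
            for (double v : w[i]) degree += v;
            // Isolated vertices (degree 0) keep zero rows and columns.
            scale[i] = degree > 0.0 ? 1.0 / Math.sqrt(degree) : 0.0;
        }
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                w[i][j] *= scale[i] * scale[j];
            }
        }
    }

    /** Power iteration: approximates the eigenvector of the largest-magnitude eigenvalue of a. */
    static double[] leadingEigenvector(double[][] a, int iterations) {
        int n = a.length;
        double[] x = new double[n];
        java.util.Arrays.fill(x, 1.0 / Math.sqrt(n));
        for (int it = 0; it < iterations; it++) {
            double[] y = new double[n];
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    y[i] += a[i][j] * x[j];
                }
            }
            double norm = 0.0;
            for (double v : y) norm += v * v;
            norm = Math.sqrt(norm);
            if (norm == 0.0) return x; // a maps x to zero; nothing more to refine
            for (int i = 0; i < n; i++) y[i] /= norm;
            x = y;
        }
        return x;
    }
}
```

For a connected graph with positive weights, the leading eigenvector of the normalized matrix is proportional to the square roots of the vertex degrees, which gives a quick sanity check for the normalization step. Callers who still need the original weights should copy the matrix before normalizing it.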
-
embed
public static double[][] embed(double[][] data, int d, double sigma)
Returns the embedding for spectral clustering.
Parameters:
data - the input data, in which each row is an observation.
d - the dimension of the feature space.
sigma - the smoothing (width) parameter of the Gaussian kernel.
Returns:
the embedding.
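The role of sigma can be made concrete with a small standalone sketch that builds a Gaussian-kernel affinity matrix from row-wise observations. `GaussianAffinity` is a hypothetical helper introduced for this example, not part of smile. The kernel form exp(-||xi - xj||^2 / (2 sigma^2)) and the zero diagonal are assumptions borrowed from the Ng, Jordan, and Weiss formulation; this page does not state the library's exact conventions.

```java
// Illustrative sketch only; GaussianAffinity is a hypothetical helper, not part of smile.
final class GaussianAffinity {
    /** Squared Euclidean distance between two vectors of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /**
     * Returns the symmetric matrix W with
     * W[i][j] = exp(-||x_i - x_j||^2 / (2 sigma^2)) for i != j
     * and a zero diagonal (assumed convention).
     */
    static double[][] affinity(double[][] data, double sigma) {
        int n = data.length;
        double gamma = -0.5 / (sigma * sigma);
        double[][] w = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < i; j++) {
                double v = Math.exp(gamma * squaredDistance(data[i], data[j]));
                w[i][j] = v;
                w[j][i] = v;
            }
        }
        return w;
    }
}
```

A small sigma makes the affinity decay quickly with distance, so only close neighbors are connected; a large sigma makes distant points look similar. In practice sigma is usually chosen relative to the typical distance between neighboring points.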
-
embed
Returns the embedding for the nonnegative count data with cosine similarity.
Parameters:
data - the nonnegative count matrix.
p - the number of features.
d - the dimension of the feature space.
Returns:
the embedding.
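As a reminder of the similarity measure named above, the following standalone sketch computes the cosine similarity between two sparse count vectors. `CosineSketch` is a hypothetical helper, and representing each row as sorted feature indices with a parallel array of counts is an assumption made for this example; it is not the `SparseIntArray` API. How the library evaluates the similarities internally (for instance, whether it ever forms all pairs explicitly) is not stated on this page.

```java
// Illustrative sketch only; CosineSketch is a hypothetical helper, not part of smile.
final class CosineSketch {
    /** Dot product of two sparse vectors given as sorted feature indices with parallel counts. */
    static double dot(int[] idxA, int[] valA, int[] idxB, int[] valB) {
        double sum = 0.0;
        int i = 0;
        int j = 0;
        // Merge-style walk over both sorted index arrays.
        while (i < idxA.length && j < idxB.length) {
            if (idxA[i] == idxB[j]) {
                sum += (double) valA[i] * valB[j];
                i++;
                j++;
            } else if (idxA[i] < idxB[j]) {
                i++;
            } else {
                j++;
            }
        }
        return sum;
    }

    /** Cosine similarity; returns 0 when either vector is all zeros. */
    static double cosine(int[] idxA, int[] valA, int[] idxB, int[] valB) {
        double normA = Math.sqrt(dot(idxA, valA, idxA, valA));
        double normB = Math.sqrt(dot(idxB, valB, idxB, valB));
        if (normA == 0.0 || normB == 0.0) return 0.0;
        return dot(idxA, valA, idxB, valB) / (normA * normB);
    }
}
```

Because counts are nonnegative, the resulting similarity always lies between 0 and 1, so it can serve directly as an edge weight in the similarity graph.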
-