smile.clustering.DENCLUE

All Implemented Interfaces: Serializable

public class DENCLUE extends Partitioning

DENsity CLUstering. The DENCLUE algorithm employs a cluster model based on kernel density estimation. A cluster is defined by a local maximum of the estimated density function, called a density attractor. Observations whose hill climbing converges to the same local maximum are assigned to the same cluster.

DENCLUE does not work on uniformly distributed data, because the estimated density then has no distinct local maxima to separate clusters. In high-dimensional spaces, data tend to look uniformly distributed because of the curse of dimensionality, so DENCLUE generally does not work well on high-dimensional data.
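
The cluster model described above can be illustrated with a minimal, self-contained sketch. It is not Smile's implementation: the class name DensityAttractorSketch, the methods climb and cluster, the iteration cap, the step-size stopping rule, and the mergeDist threshold are all assumptions made for illustration. Each observation climbs the Gaussian kernel density estimate with a fixed-point (mean-shift style) update, and observations whose end points coincide within mergeDist receive the same label.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; not Smile's implementation. */
class DensityAttractorSketch {
    /** Squared Euclidean distance between two vectors of equal length. */
    static double sqDist(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    /** Climbs from x toward a local maximum of the Gaussian kernel density of data. */
    static double[] climb(double[] x, double[][] data, double sigma, double tol, int maxIter) {
        double[] current = x.clone();
        double gamma = 1.0 / (2.0 * sigma * sigma);
        for (int iter = 0; iter < maxIter; iter++) {
            double[] next = new double[current.length];
            double weightSum = 0.0;
            for (double[] xi : data) {
                double w = Math.exp(-gamma * sqDist(current, xi));
                weightSum += w;
                for (int j = 0; j < next.length; j++) next[j] += w * xi[j];
            }
            if (weightSum == 0.0) return current; // all kernel weights underflowed
            for (int j = 0; j < next.length; j++) next[j] /= weightSum;
            double step = Math.sqrt(sqDist(current, next));
            current = next;
            if (step < tol) break; // assumed stopping rule: small move
        }
        return current;
    }

    /** Labels each observation by the attractor it reaches; nearby attractors are merged. */
    static int[] cluster(double[][] data, double sigma, double tol, double mergeDist) {
        List<double[]> attractors = new ArrayList<>();
        int[] labels = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            double[] a = climb(data[i], data, sigma, tol, 1000);
            int label = -1;
            for (int c = 0; c < attractors.size(); c++) {
                if (Math.sqrt(sqDist(a, attractors.get(c))) <= mergeDist) {
                    label = c;
                    break;
                }
            }
            if (label < 0) {
                attractors.add(a);
                label = attractors.size() - 1;
            }
            labels[i] = label;
        }
        return labels;
    }

    public static void main(String[] args) {
        // Two well-separated groups of three observations each.
        double[][] data = {
            {0.0, 0.0}, {0.2, 0.1}, {0.1, 0.3},
            {5.0, 5.0}, {5.2, 4.9}, {4.9, 5.1}
        };
        System.out.println(Arrays.toString(cluster(data, 0.5, 1e-6, 0.1)));
    }
}
```

With a bandwidth that is small relative to the gap between the two groups, each group has its own density maximum, so the labels separate the groups. On uniformly spread data the same procedure has no such distinct maxima to find.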

References

A. Hinneburg and D. A. Keim. A general approach to clustering in large databases with noise. Knowledge and Information Systems, 5(4):387-415, 2003.
A. Hinneburg and H.-H. Gabriel. DENCLUE 2.0: Fast clustering based on kernel density estimation. IDA, 2007.

Nested Class Summary

Nested Classes

| Modifier and Type | Class | Description |
| --- | --- | --- |
| static final record | DENCLUE.Options | DENCLUE hyperparameters. |

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| DENCLUE(int k, int[] group, double[][] attractors, double[] radius, double[][] samples, double sigma, double tol) | Constructor. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| double[][] | attractors() | Returns the density attractor of each data point. |
| static DENCLUE | fit(double[][] data, double sigma, int m) | Clusters the data. |
| static DENCLUE | fit(double[][] data, DENCLUE.Options options) | Clusters the data. |
| int | predict(double[] x) | Classifies a new observation. |
| double[] | radius() | Returns the radius of each density attractor. |
| double | sigma() | Returns the smoothing parameter of the Gaussian kernel. |
| double | tolerance() | Returns the tolerance of the hill-climbing procedure. |

Methods inherited from class smile.clustering.Partitioning
group, group, k, size, size, toString

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Constructor Details
- DENCLUE
  
  public DENCLUE(int k, int[] group, double[][] attractors, double[] radius, double[][] samples, double sigma, double tol)
  
  Constructor.
  
  Parameters:
  
  k - the number of clusters.
  
  group - the cluster labels.
  
  attractors - the density attractor of each data point.
  
  radius - the radius of each density attractor.
  
  samples - the samples used in the hill-climbing iterations.
  
  sigma - the smoothing parameter of the Gaussian kernel. Choose sigma such that the number of density attractors stays constant over a long interval of sigma values.
  
  tol - the tolerance of the hill-climbing procedure.
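
The advice on choosing sigma can be sketched as a scan over candidate values: count the distinct density attractors for each candidate, then look for the longest run of consecutive candidates that yield the same count. This is a hedged illustration, not part of Smile. The class SigmaScanSketch, its methods, the candidate grid, and the mergeDist threshold are hypothetical, and the plateau is measured in grid steps rather than in sigma units.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; not part of Smile. */
class SigmaScanSketch {
    static double sqDist(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    /** Fixed-point hill climbing on the Gaussian kernel density estimate. */
    static double[] climb(double[] x, double[][] data, double sigma) {
        double[] cur = x.clone();
        double gamma = 1.0 / (2.0 * sigma * sigma);
        for (int iter = 0; iter < 1000; iter++) {
            double[] next = new double[cur.length];
            double sum = 0.0;
            for (double[] xi : data) {
                double w = Math.exp(-gamma * sqDist(cur, xi));
                sum += w;
                for (int j = 0; j < next.length; j++) next[j] += w * xi[j];
            }
            if (sum == 0.0) break; // all kernel weights underflowed
            for (int j = 0; j < next.length; j++) next[j] /= sum;
            double step = Math.sqrt(sqDist(cur, next));
            cur = next;
            if (step < 1e-9) break;
        }
        return cur;
    }

    /** Counts attractors, treating end points closer than mergeDist as one. */
    static int countAttractors(double[][] data, double sigma, double mergeDist) {
        List<double[]> found = new ArrayList<>();
        for (double[] x : data) {
            double[] a = climb(x, data, sigma);
            boolean known = false;
            for (double[] f : found) {
                if (Math.sqrt(sqDist(a, f)) <= mergeDist) {
                    known = true;
                    break;
                }
            }
            if (!known) found.add(a);
        }
        return found.size();
    }

    /** Returns {first, last} indices (inclusive) of the longest run of equal counts. */
    static int[] longestPlateau(int[] counts) {
        int bestStart = 0, bestLen = 0, start = 0;
        for (int i = 1; i <= counts.length; i++) {
            if (i == counts.length || counts[i] != counts[start]) {
                if (i - start > bestLen) {
                    bestLen = i - start;
                    bestStart = start;
                }
                start = i;
            }
        }
        return new int[] {bestStart, bestStart + bestLen - 1};
    }

    public static void main(String[] args) {
        double[][] data = {{0.0}, {0.1}, {0.2}, {5.0}, {5.1}, {5.2}};
        double[] sigmas = {0.02, 0.3, 0.5, 1.0, 5.0}; // hypothetical candidate grid
        int[] counts = new int[sigmas.length];
        for (int i = 0; i < sigmas.length; i++) {
            counts[i] = countAttractors(data, sigmas[i], 0.01);
        }
        int[] p = longestPlateau(counts);
        System.out.println("attractor counts: " + Arrays.toString(counts));
        System.out.println("stable sigma range: " + sigmas[p[0]] + " to " + sigmas[p[1]]);
    }
}
```

A value from the middle of the stable range is a reasonable starting point. A finer or logarithmic grid may be needed for real data.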
Method Details
- sigma
  
  public double sigma()
  
  Returns the smoothing parameter of the Gaussian kernel.
  
  Returns:
  
  the smoothing parameter of the Gaussian kernel.
- tolerance
  
  public double tolerance()
  
  Returns the tolerance of the hill-climbing procedure.
  
  Returns:
  
  the tolerance of the hill-climbing procedure.
- radius
  
  public double[] radius()
  
  Returns the radius of each density attractor.
  
  Returns:
  
  the radius of each density attractor.
- attractors
  
  public double[][] attractors()
  
  Returns the density attractor of each data point.
  
  Returns:
  
  the density attractor of each data point.
- fit
  
  public static DENCLUE fit(double[][] data, double sigma, int m)
  
  Clusters the data.
  
  Parameters:
  
  data - the input data of which each row is an observation.
  
  sigma - the smoothing parameter of the Gaussian kernel. Choose sigma such that the number of density attractors stays constant over a long interval of sigma values.
  
  m - the number of samples selected for use in the iterations. It should be much smaller than the number of observations to speed up the algorithm, yet large enough to capture the underlying distribution.
  
  Returns:
  
  the model.
- fit
  
  public static DENCLUE fit(double[][] data, DENCLUE.Options options)
  
  Clusters the data.
  
  Parameters:
  
  data - the input data of which each row is an observation.
  
  options - the hyperparameters.
  
  Returns:
  
  the model.
- predict
  
  public int predict(double[] x)
  
  Classifies a new observation.
  
  Parameters:
  
  x - a new observation.
  
  Returns:
  
  the cluster label. Note that it may be Clustering.OUTLIER.
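
This page does not say how predict decides between a cluster label and Clustering.OUTLIER. One plausible reading, sketched here purely as an assumption, is that the new observation climbs the density estimate built from the stored samples and is labeled by the nearest attractor if it ends within that attractor's radius, and as an outlier otherwise. The class PredictSketch, the one-attractor-per-cluster layout, and the OUTLIER stand-in value are hypothetical and should be checked against the library.

```java
/** Illustrative sketch only; the actual predict logic may differ. */
class PredictSketch {
    /** Stand-in sentinel for this sketch; use Clustering.OUTLIER with the real library. */
    static final int OUTLIER = Integer.MAX_VALUE;

    static double sqDist(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    /** Fixed-point hill climbing on the Gaussian kernel density of the samples. */
    static double[] climb(double[] x, double[][] samples, double sigma, double tol) {
        double[] cur = x.clone();
        double gamma = 1.0 / (2.0 * sigma * sigma);
        for (int iter = 0; iter < 1000; iter++) {
            double[] next = new double[cur.length];
            double sum = 0.0;
            for (double[] s : samples) {
                double w = Math.exp(-gamma * sqDist(cur, s));
                sum += w;
                for (int j = 0; j < next.length; j++) next[j] += w * s[j];
            }
            if (sum == 0.0) break; // too far from every sample to move
            for (int j = 0; j < next.length; j++) next[j] /= sum;
            double step = Math.sqrt(sqDist(cur, next));
            cur = next;
            if (step < tol) break;
        }
        return cur;
    }

    /** Assumed rule: nearest attractor within its radius wins, otherwise OUTLIER. */
    static int predict(double[] x, double[][] samples, double[][] attractors,
                       double[] radius, double sigma, double tol) {
        double[] end = climb(x, samples, sigma, tol);
        int best = OUTLIER;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < attractors.length; c++) {
            double d = Math.sqrt(sqDist(end, attractors[c]));
            if (d <= radius[c] && d < bestDist) {
                best = c;
                bestDist = d;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] samples = {
            {0.0, 0.0}, {0.2, 0.1}, {0.1, 0.3},
            {5.0, 5.0}, {5.2, 4.9}, {4.9, 5.1}
        };
        double[][] attractors = {{0.1, 0.13}, {5.03, 5.0}}; // hypothetical, one per cluster
        double[] radius = {0.5, 0.5};
        int label = predict(new double[] {100.0, 100.0}, samples, attractors, radius, 0.5, 1e-6);
        System.out.println(label == OUTLIER ? "outlier" : "cluster " + label);
    }
}
```

Whatever the actual rule, callers should compare the returned label against Clustering.OUTLIER before using it as an index.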
