smile.manifold.UMAP

public class UMAP extends Object

Uniform Manifold Approximation and Projection. UMAP is a dimension reduction technique that can be used for visualization similarly to t-SNE, but also for general non-linear dimension reduction. The algorithm is founded on three assumptions about the data:

- The data is uniformly distributed on a Riemannian manifold;
- The Riemannian metric is locally constant (or can be approximated as such);
- The manifold is locally connected.

From these assumptions it is possible to model the manifold with a fuzzy topological structure. The embedding is found by searching for a low dimensional projection of the data that has the closest possible equivalent fuzzy topological structure.
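To make the "fuzzy topological structure" more concrete, the following is a minimal sketch of how the UMAP paper cited below turns nearest-neighbor distances into fuzzy membership strengths. It is not Smile's implementation. The class and method names (`FuzzyMembershipSketch`, `membership`, `symmetrize`) are hypothetical, and the tolerance and iteration cap are arbitrary choices made for illustration.

```java
import java.util.Arrays;

class FuzzyMembershipSketch {
    /**
     * Membership strength of each of the k nearest neighbors of one point,
     * given the distances to those neighbors. Uses
     * w_j = exp(-max(0, d_j - rho) / sigma), where rho is the distance to the
     * nearest neighbor and sigma is chosen so that the weights sum to log2(k).
     */
    public static double[] membership(double[] knnDist) {
        int k = knnDist.length;
        // rho: smallest positive distance, which encodes local connectivity.
        double rho = Arrays.stream(knnDist).filter(d -> d > 0).min().orElse(0.0);
        double target = Math.log(k) / Math.log(2);

        // Binary search for sigma; the sum of weights grows with sigma.
        double lo = 0.0, hi = Double.POSITIVE_INFINITY, sigma = 1.0;
        for (int iter = 0; iter < 64; iter++) {
            double sum = 0.0;
            for (double d : knnDist) {
                sum += Math.exp(-Math.max(0.0, d - rho) / sigma);
            }
            if (Math.abs(sum - target) < 1e-5) break;
            if (sum > target) {
                hi = sigma;
                sigma = (lo + hi) / 2;
            } else {
                lo = sigma;
                sigma = Double.isInfinite(hi) ? sigma * 2 : (lo + hi) / 2;
            }
        }

        double[] w = new double[k];
        for (int j = 0; j < k; j++) {
            w[j] = Math.exp(-Math.max(0.0, knnDist[j] - rho) / sigma);
        }
        return w;
    }

    /** Fuzzy union of the two directed strengths between a pair of points. */
    public static double symmetrize(double a, double b) {
        return a + b - a * b;
    }

    public static void main(String[] args) {
        double[] w = membership(new double[] {1.0, 2.0, 3.0});
        System.out.println(Arrays.toString(w));
        System.out.println(symmetrize(0.5, 0.5));
    }
}
```

The nearest neighbor always receives strength 1, which reflects the local connectivity assumption; farther neighbors decay at a rate set per point, which reflects the locally constant metric assumption.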

References

McInnes, L, Healy, J, UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction, ArXiv e-prints 1802.03426, 2018
How UMAP Works

Nested Class Summary

Nested Classes

Modifier and Type

Class

Description

static final record

UMAP.Options

The UMAP hyperparameters.
Method Summary

Modifier and Type

Method

Description

static double[][]

fit(double[][] data, UMAP.Options options)

Runs the UMAP algorithm with Euclidean distance.

static <T> double[][]

fit(T[] data, NearestNeighborGraph nng, UMAP.Options options)

Runs the UMAP algorithm on a precomputed k-nearest neighbor graph.

static <T> double[][]

fit(T[] data, Metric<T> distance, UMAP.Options options)

Runs the UMAP algorithm with the given distance function.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Method Details
- fit
  
  public static double[][] fit(double[][] data, UMAP.Options options)
  
  Runs the UMAP algorithm with Euclidean distance.
  
  Parameters:
  
  data - the input data.
  
  options - the hyperparameters.
  
  Returns:
  
  the embedding coordinates.
- fit
  
  public static <T> double[][] fit(T[] data, Metric<T> distance, UMAP.Options options)
  
  Runs the UMAP algorithm with the given distance function.
  
  Type Parameters:
  
  T - the data type of points.
  
  Parameters:
  
  data - the input data.
  
  distance - the distance function.
  
  options - the hyperparameters.
  
  Returns:
  
  the embedding coordinates.
- fit
  
  public static <T> double[][] fit(T[] data, NearestNeighborGraph nng, UMAP.Options options)
  
  Runs the UMAP algorithm on a precomputed k-nearest neighbor graph.
  
  Type Parameters:
  
  T - the data type of points.
  
  Parameters:
  
  data - the input data.
  
  nng - the k-nearest neighbor graph.
  
  options - the hyperparameters.
  
  Returns:
  
  the embedding coordinates.
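
The second and third overloads accept either a distance function over arbitrary objects or a precomputed k-nearest neighbor graph. The sketch below illustrates, in plain Java, what those two inputs represent. It does not call Smile: `ToDoubleBiFunction` stands in for `Metric<T>`, a plain `int[][]` stands in for `NearestNeighborGraph`, and the names `KnnGraphSketch`, `neighbors`, and `hamming` are hypothetical. The brute-force search is quadratic in the number of points and is meant only for illustration.

```java
import java.util.Comparator;
import java.util.function.ToDoubleBiFunction;
import java.util.stream.IntStream;

class KnnGraphSketch {
    /**
     * For each point, returns the indices of its k nearest neighbors
     * (excluding the point itself), nearest first, under the given distance.
     */
    public static <T> int[][] neighbors(T[] data, ToDoubleBiFunction<T, T> distance, int k) {
        int n = data.length;
        int[][] graph = new int[n][];
        for (int i = 0; i < n; i++) {
            final int self = i;
            final T xi = data[i];
            graph[i] = IntStream.range(0, n)
                    .filter(j -> j != self)
                    .boxed()
                    .sorted(Comparator.comparingDouble(j -> distance.applyAsDouble(xi, data[j])))
                    .limit(k)
                    .mapToInt(Integer::intValue)
                    .toArray();
        }
        return graph;
    }

    /** Example distance on non-numeric data: Hamming distance between equal-length strings. */
    public static double hamming(String a, String b) {
        int diff = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) diff++;
        }
        return diff;
    }

    public static void main(String[] args) {
        String[] data = {"aaaa", "aaab", "abbb", "bbbb"};
        int[][] graph = neighbors(data, KnnGraphSketch::hamming, 2);
        for (int[] row : graph) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

Supplying a distance function lets the algorithm embed data that has no natural vector form, such as strings, while supplying a precomputed graph avoids repeating the neighbor search when it has already been done.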
